web servers & load balancing techniques
3/20/2001 Network Computing Laboratory EE. KAIST
Web Servers & Load Balancing Techniques
3/20/2001, Junehwa Song and Young Ho Kim
Part I: Web Servers
Overview
What is a web server?
Market share
How does a web server work?
How does a web server serve contents?
Architectures of Web Servers
– Examples: Apache, AOLServer, Jigsaw
Issues on Web Servers
Load Balancing Techniques (Part II)
References
What is a Web Server?
An advanced application that runs on a server and does the following:
– Provides connections to remote computers
– Sends web pages to remote computers via the Internet or an intranet
Examples of Web Servers
– Apache
– MS Internet Information Server for Windows NT
– AOLServer
Market Share
How does a Web Server Work?
Static contents
1. Web server receives a request for a Web page such as http://www.kaist.ac.kr/index.html
2. Server maps URL to a local file on the host server.
3. The server then loads this file from disk and serves it out across the network to the user's Web browser.
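The three steps above can be sketched in a few lines of Python. This is only an illustration, not how any particular server implements it: the function names `map_url_to_file` and `serve_static`, the `index.html` default for directory URLs, and the document root passed in by the caller are all assumptions made for the example.

```python
from pathlib import Path
from urllib.parse import unquote, urlsplit


def map_url_to_file(url: str, doc_root: Path) -> Path:
    """Step 2: map the URL path to a local file under doc_root (assumed layout)."""
    path = unquote(urlsplit(url).path)
    if path == "" or path.endswith("/"):
        path += "index.html"  # assumed default document for directory URLs
    root = doc_root.resolve()
    candidate = (root / path.lstrip("/")).resolve()
    # Refuse paths such as /../etc/passwd that escape the document root.
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"{path} is outside the document root")
    return candidate


def serve_static(url: str, doc_root: Path) -> bytes:
    """Step 3: load the mapped file from disk; the caller sends it to the browser."""
    return map_url_to_file(url, doc_root).read_bytes()
```

With a document root that contains `index.html`, `serve_static("http://www.kaist.ac.kr/index.html", root)` returns that file's bytes.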
Dynamic contents
– Dynamic means Web pages created in response to a user's input (e.g., CGI)
– The Web server runs programs locally and transmits their output to the user's Web browser that is requesting the dynamic content
– The user's Web browser never really has to know that the content is dynamic, because CGI is basically a Web server extension protocol.
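A minimal sketch of the CGI idea, assuming a GET request: the server passes request data to a local program through environment variables and relays whatever the program prints. `REQUEST_METHOD`, `QUERY_STRING` and `GATEWAY_INTERFACE` are standard CGI variables; the helper `run_cgi` and the one-line stand-in program `HELLO_CGI` are hypothetical and exist only for this example.

```python
import os
import subprocess
import sys


def run_cgi(program_args, query_string):
    """Run a local program CGI-style and return the bytes it wrote to stdout."""
    env = dict(
        os.environ,
        GATEWAY_INTERFACE="CGI/1.1",
        REQUEST_METHOD="GET",
        QUERY_STRING=query_string,
    )
    result = subprocess.run(
        program_args, env=env, capture_output=True, check=True, timeout=10
    )
    return result.stdout  # the server would forward this to the browser


# Stand-in "CGI program": prints a header, a blank line, then a body.
HELLO_CGI = [
    sys.executable,
    "-c",
    "import os; print('Content-Type: text/plain'); print(); "
    "print('Hello, ' + os.environ['QUERY_STRING'])",
]
```

Calling `run_cgi(HELLO_CGI, "name=kaist")` yields a response whose body contains `Hello, name=kaist`; the browser sees ordinary output and cannot tell that a program produced it.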
How does a web server serve contents?
The primary mechanism for deciding how to display content is the MIME type, which the server sends in the Content-Type header.
Multipurpose Internet Mail Extensions (MIME) types tell a Web browser what sort of document is being sent.
More than 370 MIME types are distributed with the Apache Web server by default in the mime.types configuration file.
e.g., Apache mime.types file:
– text/xml xml
– video/mpeg mpeg mpg mpe
– video/quicktime qt mov
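The lookup from file extension to MIME type can be sketched as below. The three entries are the ones listed above; the parser, the function names, and the `application/octet-stream` fallback are assumptions for illustration and are not taken from Apache's source.

```python
MIME_TYPES_TEXT = """\
# MIME type        extensions
text/xml           xml
video/mpeg         mpeg mpg mpe
video/quicktime    qt mov
"""


def parse_mime_types(text):
    """Build an extension -> MIME type map from mime.types-style lines."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        mime_type, *extensions = line.split()
        for ext in extensions:
            mapping[ext.lower()] = mime_type
    return mapping


def content_type_for(filename, mapping, default="application/octet-stream"):
    """Pick the Content-Type value for a file name, falling back to a default."""
    _, dot, ext = filename.rpartition(".")
    return mapping.get(ext.lower(), default) if dot else default
```

For example, `content_type_for("clip.mpg", parse_mime_types(MIME_TYPES_TEXT))` gives `video/mpeg`, which the server would place in the Content-Type header.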
[Figure: reference architecture of a Web server. Labels: Browser; Web Server (Reception, Request Analysis, Access Control, Resource Handler, Record Transaction, Utility, Operating System Abstraction Layer); Operating System]
Architecture of Web Server
Reception
– Interprets the resource request protocol
– Parses the requests, and builds an internal representation of the request
– Determines capabilities of the browser (e.g., simple text browser or graphics capable browser)
Request Analyzer
– Translates the location of the resource from a network location to a local file name
• e.g., ~/index.html could be transformed to the local file /usr/httpd/pub/index.html
Access Control
– Enforces the access rules employed by the server
– Authenticates the browser and authorizes its access to the requested resources
February 19, 2001 PC Data Online
Resource Handler
– Determines the type of the resource requested by the browser, executes it and generates the response.
Record Transaction
– Records all the requests and their results.
Support Layer
– Utility and Operating System Abstraction Layer
– Provide functions used by the above subsystems
Utility subsystem
– Contains functions that are used by all other subsystems.
– It has functions for manipulating strings or URLs and many other commonly used functions
Operating System Abstraction Layer
– Encapsulates the OS-specific functionality to facilitate the porting of the server to different platforms
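The subsystems described on the preceding slides can be strung together as a toy request pipeline. This is a sketch under strong assumptions: the class `MiniServer`, its method names, the in-memory `files` dictionary standing in for the disk, and the deny-list access rule are all invented for illustration and do not correspond to any of the servers discussed here.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Internal representation of a request, built by Reception."""
    method: str
    url_path: str
    user: str = "anonymous"
    local_file: str = ""


class MiniServer:
    def __init__(self, doc_root, files, denied_users=()):
        self.doc_root = doc_root          # e.g. "/usr/httpd/pub"
        self.files = files                # local file name -> content (fake disk)
        self.denied_users = set(denied_users)
        self.log = []

    def reception(self, raw_line, user):
        method, url_path, _version = raw_line.split()
        return Request(method, url_path, user)

    def request_analysis(self, req):
        req.local_file = self.doc_root + req.url_path  # network name -> file name
        return req

    def access_control(self, req):
        return req.user not in self.denied_users

    def resource_handler(self, req):
        body = self.files.get(req.local_file)
        return (200, body) if body is not None else (404, "Not Found")

    def record_transaction(self, req, status):
        self.log.append(f"{req.user} {req.method} {req.url_path} {status}")

    def handle(self, raw_line, user="anonymous"):
        req = self.request_analysis(self.reception(raw_line, user))
        if self.access_control(req):
            status, body = self.resource_handler(req)
        else:
            status, body = 403, "Forbidden"
        self.record_transaction(req, status)
        return status, body
```

Every request passes through Reception, Request Analysis, Access Control, Resource Handler and Record Transaction in that order; the Utility and OS abstraction layers are left out because the sketch relies on the Python standard library for those roles.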
Example (1): Apache
Freely available
– Source code
– Binaries for many platforms (version 1.3.x also includes Windows NT)
Web server originally based on the NCSA server (in 1995)
Over 60% of Internet Web servers run Apache or an Apache derivative (in the December 2000 survey)
Process based
– 2.0 will support multiple threads
Very configurable, lots of directives...
Optional modules provide extra functionality
Apache is “A PAtCHy server”
– Patches on NCSA Httpd 1.3
Powerful performance and continual upgrades
[Figure: Apache architecture mapped onto the reference architecture. Apache components: core, Translation, Authentication, Authorization, Mime type, Response, Logging. Reference subsystems: Recep., Req. analysis, Access Ctrl, Res. Handler, Record Trans., Util, OS Layer]
Apache
Core: maintains multiple processes
Request_rec: internal representation of the request
Example (2): AOLServer
Commercial Web server
– Developed by AOL
– Source opened in 1999
First released in 1995
Powerful support for databases
Provides extensibility
– By using a maintainable and safe extension language
– Using TCL (Tool Command Language) as the extension language.
[Figure: AOLServer architecture mapped onto the reference architecture. AOLServer components: Communication Driver, Daemon, Core, NSLog, NSPerm, URL Handle, Timer, Util, Database Interface, TCL Interpreter, NSthread. Reference subsystems: Recep., Req. analysis, Access Ctrl, Res. Handler, Record Trans., Util, OSAL. Annotations: platform independent; Thread Lib.; (NS: NaviSoft)]
AOLServer
Richer OSAL and Utility subsystems (than Apache)
– Portable thread lib. implementation
– Database interface
– Timer
• Event scheduling, time-out of connections, etc.
– TCL interpreter
Support for multiple network protocols
Internal structure: Conn
Jigsaw
Experimental server developed by W3C
– Analyzing Internet protocols and standards
Open source, first released in 1996
Written in Java
– Platform independent
– OSAL does not exist
– Extensibility
– Object-oriented design
[Figure: Jigsaw architecture mapped onto the reference architecture. Jigsaw components: Daemon, Protocol Frame In Filter, Resource In Filter, Protocol Frame Out Filter, Resource Out Filter, Resource, Util. Reference subsystems: Recep., Access Ctrl, Res. Handler, Record Trans.]
Jigsaw
Daemon: maintains a thread pool for concurrency
Filters: for different experiments???
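The thread-pool idea behind the daemon can be illustrated as follows. Jigsaw itself is written in Java; this Python sketch only shows the general pattern of a fixed pool of worker threads serving many requests, and the names `handle` and `serve_all`, the pool size, and the response format are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def handle(request_line):
    """Handle one request; a real server would read a resource here."""
    _method, path, _version = request_line.split()
    return f"200 {path}"


def serve_all(request_lines, workers=4):
    """Serve requests concurrently with a fixed-size pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() runs handle() on pool threads and returns results in input order.
        return list(pool.map(handle, request_lines))
```

Reusing a bounded set of threads avoids the cost of creating one thread per connection and caps the number of requests processed at the same time.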
Issues on Web Server
Connection explosion
– Due to the rapid growth of WWW applications on the Internet, a web server may receive a huge number of connection requests in a very short time
Research trends on web servers
– Load balancing
– Distributed scalable Web server
Part II: Load Balancing Techniques
Junehwa Song, Young Ho Kim
Load Balancing Technique
Mirror
Client based approach
DNS-based approach
Dispatcher based approach
• Packet Single Rewriting
• Packet Double Rewriting
• Network Dispatcher
Server based approach
• HTTP redirection
• Packet redirection
Mirror
Replicate information across a mirrored server architecture
User manually selects an alternative URL
Not user transparent
Doesn't allow the Web-server system to control request distribution
Client Based Approach
Web Client
– Web client selects a node of the cluster and submits the request to the selected node
– Netscape home (http://www.netscape.com) uses this technique
• When a user accesses this site, Navigator selects a random number i between 1 and the number of servers and directs the request to the node wwwi.netscape.com
– Limited practical applicability and is not scalable
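The client-side selection described above amounts to one random draw. A minimal sketch, where the function name `pick_server` and the `rng` parameter are assumptions added so the choice can be made reproducible:

```python
import random


def pick_server(num_servers, rng=random):
    """Choose i uniformly in 1..num_servers and build the host name www<i>."""
    i = rng.randint(1, num_servers)  # randint includes both endpoints
    return f"www{i}.netscape.com"
```

The number of servers has to be built into every client, which is one reason the approach does not scale: adding or removing a node means updating the browsers.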
Smart Client
– Migrates server functionality to the client through a Java applet
– Increases network traffic and network delay
Client side Proxies
– From the Web cluster standpoint, proxy servers are similar to clients
DNS Based Approach
DNS server maps the domain name to multiple IP addresses
Returns more than one IP address for the hostname, or returns a different IP address for each DNS request it receives (round robin)
User transparent
Simple and easy to implement
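Round-robin resolution can be sketched with a rotating list. The class `RoundRobinDNS` is hypothetical, and the host name and the 192.0.2.x addresses (a range reserved for documentation) are placeholders rather than real records.

```python
from collections import deque


class RoundRobinDNS:
    """Toy name server that rotates the address list on every query."""

    def __init__(self, records):
        self._records = {name: deque(ips) for name, ips in records.items()}

    def resolve(self, name):
        ips = self._records[name]
        answer = list(ips)   # current order; clients usually try the first entry
        ips.rotate(-1)       # the next query sees the next address first
        return answer
```

Three successive queries for a name with three addresses return lists that start with the first, second and third address in turn, so new clients are spread evenly regardless of how loaded each server is.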
Drawbacks
– Unable to know the situation of the whole system
– Not really fair because DNS uses a simple round robin
– DNS may encounter a TTL problem in the IP-address cache
• Between the client and the Web server's DNS, many intermediate name servers can cache the logical-name-to-IP-address mapping to reduce network traffic, and every Web browser typically caches some address resolutions
• Because of address caching, each cached address mapping can cause a burst of future requests to the selected server and quickly make the current load information obsolete
– Many DNS-based solutions to this problem
• System-stateless algorithms
• Server-state-based algorithms
• Client-state-based algorithms
• Adaptive TTL algorithms
Dispatcher based approach
Centralizes request scheduling and completely controls client-request routing
Request routing among servers is transparent, unlike in the DNS-based approach
– DNS deals with addresses at the URL level; the dispatcher has a single virtual IP address (IP-SVA)
Dispatcher uniquely identifies each server in the system through a private address
Dispatcher typically uses simple algorithms to select the Web server
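A sketch of a dispatcher that hides several privately addressed servers behind one virtual IP address. The slides do not say which selection algorithms are meant; round robin and least connections are used here as two common simple choices, and the class, method names and addresses are assumptions.

```python
import itertools


class Dispatcher:
    """Single virtual IP in front of servers known by private addresses."""

    def __init__(self, virtual_ip, private_addresses):
        self.virtual_ip = virtual_ip
        self.active = {addr: 0 for addr in private_addresses}  # open connections
        self._round_robin = itertools.cycle(list(private_addresses))

    def pick_round_robin(self):
        return next(self._round_robin)

    def pick_least_connections(self):
        return min(self.active, key=self.active.get)

    def open_connection(self, policy="round_robin"):
        if policy == "round_robin":
            server = self.pick_round_robin()
        else:
            server = self.pick_least_connections()
        self.active[server] += 1
        return server

    def close_connection(self, server):
        self.active[server] -= 1
```

Because every request passes through the dispatcher, it can apply its policy to each connection, which a DNS server cannot do once an address has been cached downstream.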
Packet Single Rewriting
TCP router acts as an IP address dispatcher
– Router tracks the source IP address for every established TCP connection to route packets regarding the same connection to the same Web server node
High system availability
– When one of the servers fails, its address can be removed from the router's table
– Can be combined with a DNS-based solution
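The router's connection table can be sketched as a dictionary from client endpoint to chosen server. Keying on (source IP, source port) rather than the source IP alone, the round-robin choice for new connections, and the names `Packet` and `TcpRouter` are assumptions made for this example; only the rewriting of incoming packets is modeled.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str


class TcpRouter:
    def __init__(self, virtual_ip, servers):
        self.virtual_ip = virtual_ip
        self.servers = list(servers)
        self.table = {}   # (client IP, client port) -> server address
        self._next = 0

    def _choose(self):
        server = self.servers[self._next % len(self.servers)]
        self._next += 1
        return server

    def route(self, packet):
        """Rewrite the destination so a connection always reaches the same server."""
        key = (packet.src_ip, packet.src_port)
        if key not in self.table:
            self.table[key] = self._choose()
        return replace(packet, dst_ip=self.table[key])

    def remove_server(self, server):
        """Drop a failed server and forget the connections that pointed at it."""
        self.servers.remove(server)
        self.table = {k: s for k, s in self.table.items() if s != server}
```

After `remove_server`, packets from affected clients are assigned to one of the remaining servers on their next arrival, which is the availability property noted above.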
Packet Double Rewriting
Dispatcher rewrites both incoming request packets and outgoing response packets
Two solutions using this approach
– Magicrouter
– Cisco Systems' LocalDirector
Because outgoing packets typically outnumber incoming request packets, the dispatcher becomes a bottleneck
Network Dispatcher
Extends the basic TCP router mechanism to work with both LANs and WANs
Dispatcher forwards packets to the selected server using its physical address, without IP address modification
Core and Sore Lab NRL project
– http://core.kaist.ac.kr/nrlintro2.htm
Server based approach
Uses a two-level dispatching mechanism
– Integrates the DNS-based approach with redirection techniques executed by the Web servers
– Solves most DNS scheduling problems
Two solutions
– HTTP redirection
– Packet redirection
HTTP Redirection
In the figure, server1 redirects the request to server2.
Not client transparent!
Overhead of intra-cluster communication
– Every server must periodically transmit status information to the cluster DNS
Increases response time on the client side, because each redirected request needs an extra round trip to the new server
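HTTP redirection can be sketched as a server that answers with a 302 status and a Location header when it considers itself overloaded. The load threshold, the choice of the least-loaded peer as the target, the function name and the host names are assumptions; the slides do not specify the redirection policy.

```python
def handle_request(path, my_load, peer_loads, max_load=0.8):
    """Return a raw HTTP response: serve locally or redirect to a peer.

    peer_loads maps peer host names to their last reported load (0.0 to 1.0).
    """
    if my_load <= max_load or not peer_loads:
        return "HTTP/1.1 200 OK\r\n\r\n"  # body omitted in this sketch
    target = min(peer_loads, key=peer_loads.get)  # least-loaded peer
    return f"HTTP/1.1 302 Found\r\nLocation: http://{target}{path}\r\n\r\n"
```

The browser follows the Location header with a second request to the named server, so the redirect is visible to the client and costs an extra round trip.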
Packet Redirection
Uses a round-robin DNS mechanism to schedule requests among the Web servers
The server reached by a request reroutes the connection to another server through packet rewriting
– Transparent to the client!
Packet rewriting overhead
Reference
[1] A reference architecture for Web servers. Proceedings of the Seventh Working Conference on Reverse Engineering, 2000, pp. 150-159
[2] Dynamic load balancing on Web-server systems. Cardellini, V.; Colajanni, M.; Yu, P.S. IEEE Internet Computing, Volume 3, Issue 3, May-June 1999, pp. 28-39
[3] Design and practice of a dispatch server architecture. Hong, H.C.; Chen, Y.C. Proceedings of the 7th IEEE Workshop on Future Trends of Distributed Computing Systems, 1999, pp. 246-251
[4] Scalable Web server architectures. Mourad, A.; Huiqun Liu. Proceedings of the Second IEEE Symposium on Computers and Communications, 1997, pp. 12-16
[5] TCP/IP Illustrated, Volume 1. W. Richard Stevens. Addison Wesley
[6] TCP/IP Illustrated, Volume 3. W. Richard Stevens. Addison Wesley
[7] Netcraft. The Netcraft WWW server survey. Available at http://www.netcraft.co.uk/Survey