Mini Web Server Clusters for HTTP Request Splitting
Bharat S. Rawal, Ramesh K. Karne, and Alexander L. Wijesinha
Department of Computer and Information Sciences
Towson University
Towson, Maryland, USA
[email protected], [email protected], [email protected]
Abstract— HTTP request splitting is a new concept
where the TCP connection and data transfer phases are
dynamically split between servers without using a
central dispatcher or load balancer. Splitting is
completely transparent to the client and provides
security due to the inaccessibility and invisibility of the
data servers. We study the performance of mini Web
server clusters with request splitting. With partial delegation, in which some requests are split, throughput
is better, and response times are only marginally less
than for an equivalent non-split system. For example
with partial delegation, for a four-node cluster with a
single connection server and three data servers serving
64 KB files, and for a three-node cluster with two
connection servers and a single data server serving 4 KB
files, the respective throughput improvements over non-
split systems are 10% and 22%, with only a marginal
increase in response time. In practice, the throughput
improvement percentages will be higher and response
time gaps will be lower since we ignore the overhead of a
dispatcher or load balancer in non-split systems. Although these experiments used bare PC Web servers
without an operating system/kernel for ease of
implementation, splitting and clustering may also be
implemented on conventional systems.
Keywords: Splitting HTTP Requests, Performance, Cluster Computing, Web Servers, Bare Machine Computing.
I. INTRODUCTION
Load balancing is frequently used to enable Web servers to dynamically share the workload. For load balancing, a wide variety of clustering server techniques [20, 21, 22] are employed. Most load balancing systems used in practice require a central control system such as a load balancing switch or dispatcher [20]. Load balancing can be implemented at various layers in the protocol stack [19, 20]. This paper considers a new approach to load balancing that involves splitting HTTP requests among a set of servers, where one or more connection servers (CSs) handle TCP connections and may delegate a fraction (or all) of the requests to one or more data servers (DSs) that serve the data [23]. For example, the data transfer of a large file could be assigned to a DS and the data transfer of a small file could be handled by the CS itself.
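The delegation rule sketched above can be made concrete as follows. This is a minimal illustration in Python; the threshold value, server names, and function are our own assumptions, not part of the paper's design:

```python
# Hypothetical sketch: a CS serves small files itself and delegates large
# transfers to a DS, cycling through the DSs round-robin (as in Section III).
# SIZE_THRESHOLD and DATA_SERVERS are illustrative assumptions.

SIZE_THRESHOLD = 8 * 1024          # assumed cutoff: delegate files larger than 8 KB
DATA_SERVERS = ["ds1", "ds2", "ds3"]

def choose_handler(file_size: int, next_ds: int) -> tuple[str, int]:
    """Return (handler, next round-robin index) for one HTTP request."""
    if file_size <= SIZE_THRESHOLD:
        return "cs", next_ds                       # CS serves the small file itself
    handler = DATA_SERVERS[next_ds % len(DATA_SERVERS)]
    return handler, next_ds + 1                    # delegate to the next DS in rotation
```

A CS would call this once per accepted request, threading the round-robin index through successive calls.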
One advantage of splitting is that splitting systems are
completely autonomous and do not require a central control
system such as a dispatcher or load balancer. Another
advantage is that no client involvement is necessary as in
migratory or M-TCP [13]. In [23], splitting using a single
CS and a DS was shown to improve performance comparedto non-split systems. Since the DSs are completely
anonymous and invisible (they use the IP address of the
delegating CS), it would be harder for attackers to access
them. In particular, communication between DSs and clients
is only one-way, and DSs can be configured to only respond
to inter-server packets from an authenticated CS. We study
the performance of three different configurations of Web
server clusters based on HTTP splitting by measuring the
throughput (in requests/sec), and the connection and
response times at the client.
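The inter-server packets mentioned above are not given a wire format in this paper. As a hedged sketch based on the description in [23], the fields below are what a DS would plausibly need from a CS in order to take over the data-transfer phase while sending packets with the CS's IP address; every field name here is our own assumption:

```python
# Hypothetical contents of a CS-to-DS delegation packet. The DS must continue
# an already-established TCP connection, so it needs the connection 4-tuple
# (with the CS's address as source), the current TCP state, and the resource.
from dataclasses import dataclass

@dataclass
class DelegationPacket:
    client_ip: str      # where the DS sends the data packets
    client_port: int
    cs_ip: str          # source IP the DS uses, keeping the split transparent
    cs_port: int
    seq: int            # current TCP sequence number of the transfer
    ack: int            # current acknowledgment number
    resource: str       # requested file to serve

pkt = DelegationPacket("10.0.0.5", 51324, "10.0.0.1", 80, 1_000_000, 2_000_000, "/index.html")
```

Because the client only ever sees packets carrying the CS's address, the one-way client-to-DS invisibility described above follows directly from this handoff.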
In real world applications, some servers may be close to
data sources, and some servers may be close to clients.
Splitting a client’s HTTP request and the underlying TCP
connection in this manner allows servers to dynamically
balance the workload. We have tested the splitting concept
in a LAN that consists of multiple subnets connected by
routers. In HTTP splitting, clients can be located anywhere
on the Internet. However, there are security and other issues
that arise when deploying mini clusters in an Internet where
a CS and a DS are on different networks [23].
The remainder of the paper is organized as follows.
Section II presents related work; Section III describes
splitting and the different cluster configurations used in this
study; Section IV presents results of experiments; Section V
discusses impacts of splitting; and Section VI contains the
conclusion.
II. RELATED WORK
We implemented HTTP request splitting on a bare PC
with no kernel or OS running on the machine. Bare PC
applications use the Bare Machine Computing (BMC) or
dispersed OS computing paradigm [7], wherein self-
supporting applications run on a bare PC. That is, there is no
operating system (OS) or centralized kernel running in the
machine. Instead, the application is written in C++ and runs
as an application object (AO) [9] by using its own interfaces to the hardware [8] and device drivers. While the BMC concept resembles approaches that reduce OS overhead and/or use lean kernels such as Exokernel [3, 4], IO-Lite [12], Palacios [10], Libra [1], factored OS [15], bare-metal Linux [14], and TinyOS [18], there are significant differences such as the lack of centralized code that manages system resources and the absence of a standard TCP/IP protocol stack. In essence, the AO itself manages the CPU and memory, and contains lean versions of the necessary protocols. Protocol intertwining is a form of cross-layer design. Further details on bare PC applications and bare machine computing (BMC) can be found in [5, 6].

Splitting an HTTP request by splitting the underlying TCP connection is different from migrating TCP connections, processes, or Web sessions; splicing TCP connections; or masking failures in TCP-based servers. For example, in migratory TCP (M-TCP) [13], a TCP connection is migrated between servers with client involvement; in process migration [11], an executing process is transferred between machines; in proxy-based session handoff [2], a proxy is used to migrate Web sessions in a mobile environment; in TCP splicing [19], two separate TCP connections are established for each request; and in fault-tolerant TCP (FT-TCP) [16], a TCP connection continues after a failure, enabling a replicated service to survive. To our knowledge, no work on splitting connections at a protocol level has been done before.

Figure 1. Split architecture

III. CLUSTER CONFIGURATIONS

Figure 1 illustrates generic request splitting [23] and shows the messages exchanged by the intertwined HTTP and TCP protocols. Connection establishment and termination are performed by one or more connection servers (CSs), and data transfer is done by one or more data servers (DSs). When a request is split, the client sends the request to a CS, the CS sends an inter-server packet to a DS, and the DS sends the data packets to the client. Inter-server packets may also be sent during the data transfer phase to update the DS if retransmissions are needed. With partial delegation, the CS delegates a fraction of its requests to DSs. With full delegation, the CS delegates all its requests to DSs. The design and implementation details of protocol splitting are provided in [23].

We consider mini Web server clusters consisting of two or more servers with protocol splitting. We then study cluster performance by measuring the throughput and connection and response times of three different server configurations with a varying number of CSs and DSs. Configuration 1 in Fig. 2a shows full delegation with one CS, one DS, and a set of clients sending requests to the CS. The DS and CS have different IP addresses, but the DS sends data to a client using the IP address of the CS. Configuration 2 in Fig. 2b shows a single CS with two or more DSs in the system with partial or full delegation. In partial delegation mode, clients designated as non-split-request clients (NSRCs) send requests to the CS, and these requests are processed completely by the CS as usual. The connections between the NSRCs and the CSs are shown as dotted lines. With full delegation, clients designated as split-request clients (SRCs) make requests to the CS, and these requests are delegated to DSs. For full delegation, there are no NSRCs in the system. When requests are delegated to DSs, we assume that they are equally distributed among the DSs in round-robin fashion. It is also possible to employ other distribution strategies.

Figure 2a. Split architecture configuration 1

Configuration 3 in Fig. 2c shows two CSs and one DS with both SRCs and NSRCs. For this configuration, we used small file sizes to avoid overloading the single DS. Although we have not done so, multiple DSs could be added as in Configuration 3.

IV. EXPERIMENTAL RESULTS

A. Experimental Setup

The experimental setup involved a prototype server cluster consisting of Dell OptiPlex GX260 PCs with Intel Pentium 4 2.8 GHz processors, 1 GB RAM, and an Intel 1 Gb NIC on the motherboard. All systems were connected to a Linksys 16-port 1 Gbps Ethernet switch. Linux clients were
used to run the http_load stress tool [17], and in addition, bare PC Web clients capable of generating 5700 requests/sec were used to increase the workload beyond the http_load limit of 1000 concurrent HTTP requests/sec per client. The split server cluster was also tested with Internet Explorer browsers running on Windows and Firefox browsers running on Linux.

Figure 2b. Split architecture configuration 2

Figure 2c. Split architecture configuration 3

B. Configuration 1 (1 CS, 1 DS, full delegation)

In [23], the performance of HTTP splitting with Configuration 1 was evaluated using various file sizes up to 32 KB. Here, we study the performance of Configuration 1 by varying the file size up to 128 KB and measuring the throughput in requests/sec. Fig. 3 shows the results of these experiments. It can be seen that the performance of this configuration is worse than that of a two-server non-split system for all file sizes. This is because the DS is overloaded, resulting in performance degradation. However, the CS is underutilized since it is only handling connection establishment and termination. For a two-server non-split system, we show the theoretical maximum performance (throughput) as being double that of a single (non-split) system, which was determined experimentally to be 6000 requests/sec. In practice, this theoretical limit for non-split systems will not be attained due to the overhead of load balancers and dispatchers.

Figure 3. Throughput with increasing file sizes (Configuration 1)

Fig. 4 shows the CPU utilization for the CS and DS in Configuration 1. The DS's CPU utilization for 64 KB files is close to the maximum, indicating that this configuration cannot handle more than 1500 requests/sec. To get further insight into the performance limitations in this case, we determined the impact of an increasing request rate on connection and response times at the client. The results are shown in Fig. 5. The response time degrades as the number of requests increases starting at 1300 requests/sec and is largest at 1500 requests/sec, as expected. These results suggest that performance may be improved by adding more DSs and utilizing the remaining capacity of the CS.

Figure 4. CPU utilization with increasing file sizes (Configuration 1)
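The observation that the DS saturates near 1500 requests/sec while the CS has spare capacity suggests a simple ceiling model: with full delegation, cluster throughput is bounded by the smaller of the CS's connection-handling rate and the combined DS data rate. The sketch below is our own back-of-the-envelope illustration; the CS capacity figure is an assumption, and only the roughly 1500 requests/sec single-DS limit for 64 KB files comes from the measurements above:

```python
# Illustrative capacity model (not from the paper): the cluster can complete
# no more requests than the CS can set up connections for, and no more than
# the DSs can transfer data for.

def cluster_limit(cs_capacity: float, ds_capacity: float, num_ds: int) -> float:
    """Throughput ceiling (requests/sec) for one CS delegating to num_ds DSs."""
    return min(cs_capacity, ds_capacity * num_ds)

# One DS saturates at ~1500 req/s (measured); a CS doing only connection
# setup/teardown is assumed to have 6000 req/s of headroom, so adding DSs
# should raise the ceiling until the CS itself becomes the bottleneck.
```

Under these assumed numbers, one DS caps the cluster at 1500 requests/sec and three DSs at 4500, which matches the qualitative suggestion above that adding DSs exploits the CS's unused capacity.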
KB files. As with full delegation, response time with partial delegation is poor with only a single DS. However, the response time improves dramatically for split systems with two or three DSs and partial delegation.

Figure 8. CPU utilization (Configuration 2, file size 64 KB)

Figure 9. Throughput with full/partial delegation (Configuration 2, file size 64 KB)

Figure 10. Throughput with full/partial delegation for varying file sizes (Configuration 2)

Figure 11. Connection and response times (Configuration 2, file size 64 KB)

E. Configuration 3 (2 CS, 1 DS, partial delegation)

As before, requests are generated by a set of SRCs and NSRCs. For this configuration, 4 KB files were used since larger file sizes will overload the single DS. Figure 12 shows the throughput for three servers with full and partial delegation. Configuration 3 achieves a 6.5% throughput improvement over three non-split servers with full delegation; with partial delegation, it achieves a 22% improvement in throughput compared to three non-split servers.

Fig. 13 shows the connection and response times for Configuration 3. As expected, response time is poor since the single DS gets saturated with the high request rate that can be supported with two CSs. With partial delegation, response times improve significantly as the unused CS capacity is used to handle requests from the NSRCs without delegation. While the connection and response times using Configuration 3 are worse than for non-split servers, this disadvantage should be weighed against the increased throughput, cost, and security benefits of using a split system. Also, non-split servers will incur a degradation in response and connection times due to the overhead of using a dispatcher or load balancer.

Figure 12. Throughput with full/partial delegation (Configuration 3, file size 4 KB)
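The partial-delegation routing in Configuration 3 can be sketched as a per-request dispatch rule: NSRC requests are served entirely by a CS, while SRC requests have their data transfer delegated to the single DS. This is our own illustration of the behavior described in the text; the function and label names are assumptions:

```python
# Hypothetical sketch of Configuration 3 routing: the CS always owns the TCP
# connection, and only SRC traffic has its data phase handed to the DS.

def route_request(client_kind: str) -> str:
    """Return which server performs the data transfer for one request."""
    if client_kind == "nsrc":
        return "cs"        # CS handles connection AND data (no split)
    if client_kind == "src":
        return "ds"        # CS keeps the connection, DS sends the data
    raise ValueError(f"unknown client kind: {client_kind!r}")
```

Keeping NSRC traffic on the CSs is what frees the saturated DS and produces the response-time improvement noted above.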
Figure 13. Connection times and response times with full/partial delegation (Configuration 3, file size 4 KB)

V. IMPACTS OF SPLITTING

Splitting is a general approach that can be applied in principle to any application protocol that uses TCP (it can also be applied to protocols other than TCP to split the functionality of a protocol across machines or processors). In particular, splitting the HTTP protocol has many impacts in the area of load balancing. We discuss some of these impacts below.

• Split protocol configurations can be used to achieve better response and connection times, while providing scalable performance. Splitting also eliminates the need for (and overhead/cost associated with) external load balancers such as a dispatcher or a special switch.

• Split protocol Configuration 2 (with one CS, one DS) and partial delegation achieves 25% more performance than two homogeneous servers working independently. This performance gain can be utilized to increase server capacity while reducing the number of servers needed in a cluster.

• Split server architectures could be used to distribute the load based on file sizes, proximity to file locations, or security considerations.

• The results obtained in this paper (using specific machines and workloads) indicate that mini-cluster sizes are in the single digits. More research is needed to validate this hypothesis for other traffic loads. However, if we assume that mini-clusters should contain a very small number of nodes, they would be easier to maintain and manage (compared to larger clusters). Using mini-clusters, it is possible to build large clusters by simply increasing the number of mini-clusters.

• Splitting protocols is a new approach to designing server clusters for load balancing. We have demonstrated splitting and built mini-cluster configurations using bare PC servers. However, the general technique of splitting also applies to OS-based clusters provided additional OS overhead can be kept to a minimum (and that undue developer effort is not needed to tweak the kernel to implement splitting).

• When protocol splitting uses two servers (CS and DS), it dramatically simplifies the logic and code in each server (each server only handles part of the TCP and HTTP protocols, unlike a conventional Web server that does both protocols completely). Thus, the servers are less complex and hence inherently more reliable (i.e., are less likely to fail).

• Splitting can also be used to separate the "connection" and "data" parts of any protocol (for example, any connection-oriented protocol like TCP). In general, connection servers can simply perform connections and data servers can provide data. It can also be used to split the functionality of any application-layer protocol (or application) so that different parts of the processing needed by it are done on different machines or processors. Thus, a variety of servers or Web applications can be split in this manner. This approach will spawn new ways of doing computing on the Web.

The configurations studied and the results obtained in this paper can be viewed as a first step to validate the applicability of splitting as a general concept. In future, it would be of interest to investigate its applicability to other protocols and applications.

VI. CONCLUSION

We studied the performance of mini Web server clusters with HTTP request splitting, which does not require a central load balancer or dispatcher, and is completely transparent to the client. Throughput as well as connection and response times with full and partial delegation of requests were measured for a variety of file sizes. A split system with one CS and three DSs, and full or partial delegation, can be used to achieve response times, connection times, and throughput close to, or better than, the theoretical limit of non-split systems. For example, this configuration with partial delegation achieves a 10% throughput increase for 64 KB files compared to four non-split servers, and response times that are only slightly less. The same configuration with full delegation improves response times for 64 KB files by a factor of 2.6 over the equivalent non-split system, while achieving 92.5% of its theoretical throughput. For a split system with two CSs and one DS and partial delegation, a 22% improvement in throughput over three non-split servers is obtained for 4 KB files with response times that are close to those of a non-split system.

We also discussed the impacts of splitting. When evaluating the tradeoffs of splitting versus non-splitting, it is necessary to consider the overhead and cost of load balancers and dispatchers, which will result in less throughput and worse response times than the theoretical optimum values we have used for non-split systems. The experimental results appear to indicate that scalable Web
server clusters can be built using one or more split server
systems, each consisting of 3-4 servers. The performance of
split servers depends on the requested file sizes, and it is
beneficial to handle small file sizes at the CS and larger files
with partial delegation to DSs. It would be useful to study
performance of split server systems in which resource files
of different sizes are allocated to different servers to
optimize performance. More studies are also needed to
evaluate the security benefits of split server clusters, and
their scalability and performance with a variety of
workloads. While these experiments used bare PC Web
servers with no OS or kernel for ease of implementation,
HTTP request splitting can also be implemented in
principle on conventional systems with an OS.
REFERENCES
[1] G. Ammons, J. Appavoo, M. Butrico, D. Da Silva, D. Grove, K. Kawachiya, O. Krieger, B. Rosenburg, E. Van Hensbergen, and R. W. Wisniewski, "Libra: A Library Operating System for a JVM in a Virtualized Execution Environment," VEE '07: Proceedings of the 3rd International Conference on Virtual Execution Environments, June 2007.
[2] G. Canfora, G. Di Santo, G. Venturi, E. Zimeo, and M. V. Zito, "Migrating web application sessions in mobile computing," Proceedings of the 14th International Conference on the World Wide Web, 2005, pp. 1166-1167.
[3] D. R. Engler and M. F. Kaashoek, "Exterminate all operating system abstractions," Fifth Workshop on Hot Topics in Operating Systems, USENIX, Orcas Island, WA, May 1995, p. 78.
[4] G. R. Ganger, D. R. Engler, M. F. Kaashoek, H. M. Briceno, R. Hunt, and T. Pinckney, "Fast and flexible application-level networking on exokernel systems," ACM Transactions on Computer Systems (TOCS), Vol. 20, Issue 1, pp. 49-83, February 2002.
[5] L. He, R. K. Karne, and A. L. Wijesinha, "The Design and Performance of a Bare PC Web Server," International Journal of Computers and Their Applications, IJCA, Vol. 15, No. 2, June 2008, pp. 100-112.
[6] L. He, R. K. Karne, A. L. Wijesinha, and A. Emdadi, "A Study of Bare PC Web Server Performance for Workloads with Dynamic and Static Content," The 11th IEEE International Conference on High Performance Computing and Communications (HPCC-09), Seoul, Korea, June 2009, pp. 494-499.
[7] R. K. Karne, K. V. Jaganathan, and T. Ahmed, "DOSC: Dispersed Operating System Computing," OOPSLA '05, 20th Annual ACM Conference on Object Oriented Programming, Systems, Languages, and Applications, Onward Track, ACM, San Diego, CA, October 2005, pp. 55-61.
[8] R. K. Karne, K. V. Jaganathan, and T. Ahmed, "How to run C++ Applications on a bare PC," SNPD 2005, Proceedings of SNPD 2005, 6th ACIS International Conference, IEEE, May 2005, pp. 50-55.
[9] R. K. Karne, "Application-oriented Object Architecture: A Revolutionary Approach," 6th International Conference, HPC Asia 2002 (Poster), Centre for Development of Advanced Computing, Bangalore, Karnataka, India, December 2002.
[10] J. Lange, K. Pedretti, T. Hudson, P. Dinda, Z. Cui, L. Xia, P. Bridges, A. Gocke, S. Jaconette, M. Levenhagen, and R. Brightwell, "Palacios and Kitten: New High Performance Operating Systems for Scalable Virtualized and Native Supercomputing," Proceedings of the 24th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2010), April 2010.
[11] D. S. Milojicic, F. Douglis, Y. Paindaveine, R. Wheeler, and S. Zhou, "Process Migration," ACM Computing Surveys, Vol. 32, Issue 3, September 2000, pp. 241-299.
[12] V. S. Pai, P. Druschel, and W. Zwaenepoel, "IO-Lite: A Unified I/O Buffering and Caching System," ACM Transactions on Computer Systems, Vol. 18 (1), ACM, Feb. 2000, pp. 37-66.
[13] K. Sultan, D. Srinivasan, D. Iyer, and L. Iftode, "Migratory TCP: Highly Available Internet Services using Connection Migration," Proceedings of the 22nd International Conference on Distributed Computing Systems, July 2002.
[14] T. Venton, M. Miller, R. Kalla, and A. Blanchard, "A Linux-based tool for hardware bring up, Linux development, and manufacturing," IBM Systems Journal, Vol. 44 (2), IBM, NY, 2005, pp. 319-330.
[15] D. Wentzlaff and A. Agarwal, "Factored operating systems (fos): the case for a scalable operating system for multicores," ACM SIGOPS Operating Systems Review, Vol. 43, Issue 2, pp. 76-85, April 2009.
[16] D. Zagorodnov, K. Marzullo, L. Alvisi, and T. C. Bressoud, "Practical and low overhead masking of failures of TCP-based servers," ACM Transactions on Computer Systems, Vol. 27, Issue 2, Article 4, May 2009.
[17] http://www.acme.com/software/http_load.
[18] http://www.tinyos.net/.
[19] A. Cohen, S. Rangarajan, and H. Slye, "On the performance of TCP splicing for URL-aware redirection," Proceedings of USITS '99, The 2nd USENIX Symposium on Internet Technologies & Systems, October 1999.
[20] Y. Jiao and W. Wang, "Design and implementation of load balancing of a distributed-system-based Web server," 3rd International Symposium on Electronic Commerce and Security (ISECS), pp. 337-342, July 2010.
[21] G. Ciardo, A. Riska, and E. Smirni, "EquiLoad: A Load Balancing Policy for Clustered Web Servers," Performance Evaluation, 46(2-3):101-124, 2001.
[22] S. Vaidya and K. J. Christensen, "A Single System Image Server Cluster using Duplicated MAC and IP Addresses," Proceedings of the 26th Annual IEEE Conference on Local Computer Networks (LCN '01).
[23] B. Rawal, R. Karne, and A. L. Wijesinha, "Splitting HTTP Requests on Two Servers," The Third International Conference on Communication Systems and Networks (COMSNETS 2011), January 2011, Bangalore, India.