Implementing Web Server Load Balancing, Failover, and State with Squid
Tom Northcutt
Web servers perform a multitude of functions beyond their original
design as a hypertext document distribution mechanism, one that delivered
documents with "...links...rather like in the old computer game
'adventure'" (Berners-Lee, 1989). Web servers now provide an abstract
application-level communication layer among networked systems (Web
Services/SOAP), a transport mechanism for XML-aware clients (RSS),
and the primary platform for the latest emerging technologies in
mass-media communications (Web logs and podcasting). Often these
servers sit on top of complex software systems and databases that
require considerable system resources. As these software systems
become more complex and the databases from which they serve data
continue to grow, Internet usage also continues to increase. Some
Web sites are straining under the load.
For years I have used the Squid Proxy Cache (http://www.squid-cache.org)
as a partial solution to this complex problem. Using Squid's reverse
proxy feature, I have been able to significantly accelerate my Web
servers without having to scale out. This gave me both a performance
boost and greater control during system maintenance: by manipulating
the Squid proxy server, a surrogate system could stand in while the
primary Web server was upgraded. As our project grew, our primary
Web server could not keep pace with the popularity of our site and
the complexity of the software to which we had just migrated (a
Jakarta Tomcat/Oracle solution). Modifications to the Squid reverse
proxy setup enabled me to load balance the Web site across five
servers in either round-robin or random schemes. A Web application
that could normally accommodate no more than five simultaneous users
on one server now scales to more than a hundred simultaneous users
with proxy caching and load balancing. Additional scripts check for
node failures and modify the load balancing configuration to remove
downed servers, providing failover capabilities. The Web developers
required that certain applications maintain server state so that
interactive programs would function properly and search algorithm
caching would be enhanced. This feature was added to the load balancing
scripts so that those applications maintain Web server node state
throughout the session.
Squid Proxy Cache
The most common use of Squid is as a standard Web proxy service.
This allows network managers, Internet service providers, small
companies, or facilities to provide a proxy service for external
Web sites. Squid adds a caching capability: Web pages are cached
as they are proxied, so subsequent users can retrieve the cached
page much faster because it is stored on the local area network
(Figure 1). Squid's caching is robust, efficient, and well documented.
Squid reverse proxy mode, also known as accelerated mode, utilizes
the same caching mechanism but with an alternate purpose. Here,
only one Web site is being proxied and cached (Figure 2). This reduces
the load on the Web server by allowing the caching proxy to return
Web pages that have already been stored in cache, often directly
from memory. This can result in significantly reduced response times
compared to accessing the Web server directly (Smith & Northcutt,
2000). Placing a reverse caching proxy in front of the primary
Web server also simplifies certain administration tasks. A surrogate
Web server can be used in place of the primary Web server, allowing
the administrator to perform routine server maintenance without
incurring any Web site downtime.
The proxy is simply pointed at the surrogate during the maintenance
period by manipulating the httpd_accel_host variable in the Squid
configuration file.
Load Balancing
While reverse proxy mode has demonstrated significant benefits
over running a standalone Web server in our environment, this solution
is not the most scalable or fault tolerant. Web server load balancing,
the technique of distributing the Web site load across several Web
server nodes under the proxy, provides a more scalable environment
where additional nodes can be added as the load increases. In standard
proxy mode, Squid provides an architecture where several Squid proxy
servers can establish hierarchical relationships in which cache
data can be shared and requests can be passed to proxy servers under
less load. For accelerated or reverse proxy mode, the mode we are
interested in, there are several approaches. Source code patches
are available that implement load balancing directly in the Squid
proxy server code. In this case, a Squid configuration file variable
is set where the hostnames of the nodes in the scheme are included.
Another common method of implementing load balancing in accelerated
mode uses external scripts that are called by the Squid server.
An external script can be specified by the redirect_program variable
in the Squid configuration file. To implement failover and state,
the method described here uses the external redirect script approach.
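To make the mechanics concrete, here is a minimal redirector skeleton
(a sketch only, not the full lb_state.pl presented in Listing 1).
Under Squid 2.5, the redirector receives one request per line on
stdin, in the form "URL client-ip/fqdn ident method", and writes the
line back on stdout with the URL rewritten, just as Listing 1 does
with print $line. The hostnames are the example names used throughout
this article:

#!/usr/bin/perl
#minimal Squid redirector sketch -- read request lines from Squid,
#rewrite the accelerated hostname, and return the line
$| = 1;   #Squid requires unbuffered output from redirectors
while (my $line = <STDIN>) {
    #send everything to server2; lb_state.pl makes this decision
    #dynamically (static patterns, lbnode state, round robin)
    $line =~ s[^http://server1\.localdomain][http://server2.localdomain];
    print $line;
}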
Installation
Many Unix, Linux, and BSD distributions or third-party package
distributors already supply a precompiled package for Squid that
can be easily installed using package management utilities such
as yum, up2date, inst, dpkg, or pkg_add. If you want to generate
proxy logs in a format compatible with Apache's combined LogFormat
for enhanced log analysis, you can obtain the Squid source code
from:
http://www.squid-cache.org/Versions/v2/2.5/
and apply the custom-log-2_5 patch, which is available from:
http://devel.squid-cache.org/
Once Squid is installed, modify the squid.conf file to configure
the server and enable the redirect script. The location of squid.conf
will vary depending on the installation method you used: from source,
it will be under $PREFIX/etc/; from a package, it is often under /etc/squid/.
Be sure to set the following parameters according to your environment:
http_port 80
cache_dir ufs /cache 100 16 256
# include this logformat if generating apache combined logs
# and patch has been applied
logformat combined %>a %ui %un [%tl] "%rm %ru HTTP/%rv" %Hs %<st "%{Referer}>h" "%{User-Agent}>h" %Ss:%Sh
access_log /var/log/squid/access.log squid
cache_log /var/log/squid/cache.log
cache_store_log /var/log/squid/store.log
emulate_httpd_log on
#key to our load balancing architecture
redirect_program /usr/local/squid/bin/lb_state.pl
redirect_children 5
redirect_rewrites_hosts_header off
http_access allow all
cache_effective_user squid
cache_effective_group squid
#set your primary load balance node here
httpd_accel_host server1.localdomain
httpd_accel_port 80
strip_query_terms off
coredump_dir /cache
You can test reverse proxy mode by commenting out the redirect options,
starting Squid, and accessing it with a Web browser. You should now
be proxying your httpd_accel_host through your Squid proxy server;
it should appear as if you are accessing the httpd_accel_host directly.
Re-enable the redirect options in the configuration file for the
following sections.
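For a scripted version of the same check, a rough sketch like the
following (the hostnames are placeholders for your environment)
compares the page returned through the proxy with the page returned
by the backend directly:

#!/usr/bin/perl
#sanity check sketch: the accelerated proxy should return the same
#page as the backend Web server it fronts
use LWP::Simple qw(get);
my $via_proxy = get("http://proxy.localdomain/index.html");
my $direct    = get("http://server1.localdomain/index.html");
if (defined $via_proxy && defined $direct && $via_proxy eq $direct) {
    print "proxy and backend agree\n";
} else {
    print "fetch failed or responses differ\n";
}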
Setting up the Load Balance Architecture
Load balancing using the Squid proxy requires two or more Web
server nodes to incorporate into the load balancing architecture.
Technically, even with just an accelerated proxy server and a single
primary Web server, some degree of load balancing is achieved between
the cache and the Web server. In our scalable load balancing
architecture, multiple nodes will be proxied (Figure 3).
Server1, server2, server3, and server4 represent four nodes on
our localdomain that we want to balance under the proxy. The proxy
should be configured to answer for the IP address and hostname of
the Web site you want to serve (either directly or with DNS aliases).
In this example, servers 1-4 are on a private network and the proxy
is multi-homed, but they can also be on the public network, preferably
behind a firewall, as long as the proxy can access them. Setting
up a comprehensive /etc/hosts file on all systems, which includes
the host names and IP addresses of the proxy server and nodes, is
recommended.
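For example, a shared hosts file might contain entries like the
following (the addresses are placeholders for your private network):

#proxy and load balance nodes
192.168.1.1   proxy.localdomain    proxy
192.168.1.11  server1.localdomain  server1
192.168.1.12  server2.localdomain  server2
192.168.1.13  server3.localdomain  server3
192.168.1.14  server4.localdomain  server4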
The key to the load balancing architecture is the redirect_program
script. When in accelerated proxy mode, Squid passes URLs to this
external program, where the URL string can be processed and modified
before being returned to the parent Squid process. Listing 1 is the
lb_state.pl script used to implement load balancing under Squid.
This script performs many functions besides load balancing, including
keeping application state with respect to the originating node and
forcing certain URLs to be statically redirected to a specific node.
Be sure to modify the server names and URL strings in the script
according to your needs.
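As a concrete illustration (the client address is made up), Squid
hands the script a line such as:

http://server1.localdomain/index.html 192.168.1.50/- - GET

and, as in Listing 1's print $line, the script writes the line back
with the host rewritten:

http://server3.localdomain/index.html 192.168.1.50/- - GET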
Static Redirection
Section 1 of the main loop of the lb_state.pl script allows the
administrator to force certain URL patterns to a specific node in
the load balance architecture:
##section 1
#check for URLs that should always be directed to a
#specific node (server1)
if ($line =~ /some.cgi/ || $line =~ /MyWebApp/ ||
    $line =~ /~joes_home/ || $line =~ /appDirectory/) {
    print $line;
    print F "--static goto server1 only \n-> $line" if $debug;
    next;
#a different specific node (server3)
} elsif ($line =~ /another.cgi/ || $line =~ /otherWebApp/ ||
         $line =~ /~toms_home/ || $line =~ /srvDirectory/) {
    $line =~ s/server1/server3/;
    print $line;
    print F "--static goto server3 only \n-> $line" if $debug;
    next;
The first block specifies that any URL coming from the user that contains
the string some.cgi, MyWebApp, ~joes_home, or appDirectory should
always be forced to the default server (i.e., server1 in this case).
The second block forces all URLs containing another.cgi, otherWebApp,
~toms_home, or srvDirectory to server3. Most load balancing architectures
will serve identical Web content from each node, but this approach
can be useful if you have applications that you only want to install
on one system and those applications need to keep state with the client
(e.g., interactive database sessions where ingest occurs).
Keeping State
A more sophisticated mechanism for keeping state between the client
and node server is implemented in section 2:
##section 2
#check for stateful pages
} elsif ($line =~ /lbnode/) {
    #this can be done more elegantly, but I like split
    my ($null1, $first)  = split(/lbnode=/, $line);
    my ($second, $null2) = split(/&/, $first);
    my ($node, $null3)   = split(/ /, $second);
    chomp $node;
    print F "--found a lbnode tag, node = $node\n" if $debug;
    my $j = $server_name{"$node"} || "server1";
    print F " --node=$node j=$j --\n" if $debug;
    $line =~ s[^http://server1.localdomain][http://$j.localdomain];
    print $line;
    print F "-> static lbnode node=$node j=$j $line" if $debug;
    next;
This section requires coordination with the software developers
who produce the backend code on the Web servers that generate the
URLs. By appending an "&lbnode=" tag to a URL, they can specify the
load balance node from which the URL was generated and to which
subsequent requests should return. A hostname call is an easy way
for them to obtain the node name, but the variable does not necessarily
need to contain the hostname: a Perl hash can be established in the
script that maps a key representing each server to the correct server
name, in which case developers specify that key in the URL instead
of the hostname.
Because the lbnode data is passed in via the URL, it cannot be trusted.
The script checks the $node value against a predefined %server_name
hash and falls back to the default server, "server1", if no matching
key/value pair exists. For additional input validation, check the
length of the $node variable or use the Perl substr() function to
ensure the input is a string of the proper length.
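The %server_name hash is defined in Listing 1; a minimal version,
together with a substr()-based length check, might look like the
following sketch (the node names match this article's examples):

#build the trusted key-to-node map
my %server_name = (
    server1 => "server1",
    server2 => "server2",
    server3 => "server3",
    server4 => "server4",
);
#never accept more than the expected "serverN" length from the URL
$node = substr($node, 0, 7);
my $j = $server_name{$node} || "server1";   #unknown keys fall back to the default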
Round Robin
The actual load balancing occurs in section 3 of the script:
##section 3
#round robin
} else {
    my $k = $server_name{"$server"} || "server1";
    chomp $k;
    $line =~ s[^http://server1.localdomain][http://$k.localdomain];
    print $line;
    print F "--round robin, server = $k \n-> $line" if $debug;
}
close F if $debug;
}
This last section is reached if no static pattern is matched and no
lbnode state tag is detected. The $server variable is advanced through
the @servers array with each URL passed to the running script, so
when the script reaches section 3, it rewrites the URL to point at
whichever node is next in the @servers array.
Round-robin load balancing might not be the most effective solution
for your environment. Randomizing the server selection from the
@servers array is an alternate approach, as sketched below. In
environments where sustained load on servers is experienced, a
component could be developed where nodes are polled with a remote
uptime call and less heavily loaded servers are moved higher in the
server queue.
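For illustration, here is one way each scheme could select the next
node (a sketch; Listing 1's actual bookkeeping may differ):

my @servers = qw(server1 server2 server3 server4);

#round robin: advance a counter through @servers, wrapping at the end
my $count = 0;
my $server = $servers[ $count++ % @servers ];

#random: pick any node with equal probability
$server = $servers[ int(rand(@servers)) ];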
Node Failover
If one of the nodes in the load balance architecture fails, only
three of the four servers will be able to provide responses in our
example architecture. If a user is browsing the Web site at the
time, that user may experience a Web site failure. Depending on
where the user is in the Web tree, that user may:
1. Receive a response to three out of four Web page requests.
2. Always receive a response -- if he is viewing a static or stateful
URL on an up server.
3. Never receive a response -- if he is viewing a static or stateful
URL on a down server.
4. Receive a mixed response -- if the Web page he is viewing contains
multiple elements, such as images, every fourth element will not
be visible.
The solution to this is failover. The term failover is used loosely
here: typically, it describes an idle standby server taking over for
a server that becomes unavailable, whereas here it describes the
remaining nodes in the load balanced architecture taking over for
one node that has become unavailable. This is done by manipulating
the content of the $serverlist file used by the lb_state.pl script.
A monitoring script (Listing 2) run from cron can verify the state
of the Web server nodes and manipulate the $serverlist file accordingly.
The main section of this monitoring script performs three primary
functions. First, it queries each node's homepage to check availability:
my $req1 = new HTTP::Request GET => "http://$i.localdomain/index.html";
This line should be adjusted for your network environment. Retrieving
an actual Web page is a better check than a simple ping or serial
heartbeat because it verifies both the server and the service (http).
This check does not verify the entire Web tree, though. I use
rsync-based mirroring scripts when transferring the Web tree from our
staging server to the production nodes to ensure each node has the
latest content and to check for errors during the transfer.
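A typical mirror invocation (the paths here are assumptions for
illustration) might be:
rsync -a --delete /var/www/html/ server2.localdomain:/var/www/html/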
Second, it checks to see whether the homepage has been modified
on the master node and reports the differences:
if ($i eq "$master"){
check_hp(\$req1);
Third, it checks whether there has been any change in node status and
responds accordingly. If a node becomes unavailable, it is removed
from the proxy and the administrator is alerted (Figure 4). If a node
becomes available, it is folded back into the load balanced architecture
automatically and the administrator is alerted. This is determined by
keeping track of the previous node status (@nodesold), the current
node status (@nodesnew), and the potential list of nodes (@nodeslist).
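The core of that availability loop can be sketched as follows (a
simplification of Listing 2; the serverlist path is an assumption,
and the full script also diffs the master homepage and alerts the
administrator):

#!/usr/bin/perl
use LWP::UserAgent;
use HTTP::Request;

my @nodeslist = qw(server1 server2 server3 server4);   #potential nodes
my @nodesnew;                                          #nodes answering now

my $ua = LWP::UserAgent->new(timeout => 10);
foreach my $i (@nodeslist) {
    my $req1 = HTTP::Request->new(GET => "http://$i.localdomain/index.html");
    push @nodesnew, $i if $ua->request($req1)->is_success;
}

#write the surviving nodes out for lb_state.pl to pick up
open(F, ">", "/usr/local/squid/etc/servers-list.txt") or die $!;
print F "$_\n" for @nodesnew;
close F;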
Because the script can also be run manually, an administrator can
comment out servers in the servers-list.txt file and restart the
proxy with the appropriate set of nodes, allowing maintenance to be
performed on servers while they are out of the load balanced
architecture. This does present some issues when a static server is
removed from the architecture or otherwise becomes unavailable. If
necessary, this can be handled by designating an alternate node as
the static server during maintenance periods or node failures, or
by ensuring that all nodes have identical content and services.
Proxy Server Failover
The node failover architecture still does not prevent a single
point of failure issue. Two points can clearly be identified: the
proxy server service (the running Squid daemon), and the actual
server and hardware components on which the proxy daemon runs. Listing
3 is a script that checks the Squid daemon and restarts it if necessary
and alerts the administrator. The script is somewhat extensible
and has been used for other proxy server applications in the past,
including an ftpd proxy.
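A minimal version of such a check might look like the following
sketch (the binary path and alert address are assumptions; Listing 3
is more extensive). Squid's "-k check" option signals the running
daemon and exits nonzero if none is found:

#!/usr/bin/perl
my $squid = "/usr/local/squid/sbin/squid";
if (system("$squid -k check >/dev/null 2>&1") != 0) {
    system($squid);   #no daemon answered, so restart it
    system("echo 'squid restarted' | mail -s 'proxy alert' admin\@localdomain");
}

Run from cron every few minutes, this keeps the proxy service itself
from being the weak link.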
There are many approaches to physical proxy server failover, from
network manipulation (at the router, switch, or DNS level) to software-based
solutions such as Andreas Müller's failover architecture (http://failover.othello.ch).
I prefer to mirror the production server to the failover server
using a Linux live CD such as JaCL Linux (http://www.northsecure.com/jacl/).
This approach allows you to boot the failover server off the CD
using an alternate IP address, periodically mirror the production
server to the failover machine while the failover disk is otherwise
idle (the OS is running off the CD), and perform proxy server failover
simply by restarting the failover box from its mirrored disk when
the production server goes down.
Conclusion
This Squid reverse proxy-based load balancing and failover solution
has proven to be reliable in a production environment for more than
a year. Running Squid as an httpd accelerator alone has been stable
and effective in our production environment for more than five years.
Preliminary anecdotal tests show significant improvement in scalability
and response time with the load balanced architecture in place.
Using Siege with a randomized list of URLs, we were able to see
significant scalability improvements when utilizing the caching,
load balanced architecture compared to accessing the server directly.
Queries against a Tomcat Web application server that retrieves data
from an Oracle database could scale to no more than five simultaneous
users in some instances. The same query against the caching, load
balanced architecture was able to accommodate 100 simulated concurrent
users before we stopped the test. Due to caching and load balancing,
the load on the nodes was significantly reduced, while the proxy
servers saw little or no system load while returning pages from
cache (the load never exceeded 10%). This architecture has been
tested under Red Hat Linux, CentOS Linux, and OpenBSD. Load balanced
nodes have been tested successfully with Apache 1.3.x, Apache 2.0.x,
and Tomcat 4.x HTTP servers.
Overall, this approach to load balancing and failover in a stateful
environment has proven to be a cost-effective, efficient, scalable,
and relatively easy-to-implement solution. Other organizations may
want to consider this approach if their current architecture and
budget have been stretched beyond their means.
Acknowledgments
Thanks go to the Squid Web Proxy developers, the Perl developers
and scripters, Global Science and Technology Inc., NASA GSFC, Lola
Olsen, Gene Major, Len Newman, Chris Gokey, Steve Smith and, most
importantly, Susan, Lily, Robert, and Andrew.
References
Berners-Lee, Tim. 1989. "Information Management: A Proposal". CERN.
March 1989, May 1990 -- http://www.w3.org/History/1989/proposal.html
Kumar, Rajeev. 2004. "Firewalling HTTP Traffic Using Reverse Squid
Proxy". Sys Admin, 13(2): pp. 21-26.
Smith, Stephen and Robert Northcutt. 2000. "Improving Multiple Site
Services". EOGEO2000. April 2000.
Zawodny, Jeremy. 2003. "Reverse Proxying with Squid". Linux Magazine.
August 2003 -- http://www.linux-mag.com/2003-08/lamp_01.html
Squid Web Proxy -- http://www.squid-cache.org/
Andreas Müller's failover -- http://failover.othello.ch
Tom Northcutt is currently a full-time Unix System Administrator
for Global Science and Technology, Inc. at the NASA Goddard Space
Flight Center in Greenbelt, Maryland. He has been an avid Linux user
since 1993, when his background in Geographic Information Systems
led him to Linux during its emergence. Since then, he has administered
IRIX, Linux, OpenBSD, SunOS/Solaris, Tru64, and OS X in production
environments.