Control your Network Destiny: Why QoS, TS, and Other
Acronyms Matter
Greg Bledsoe
Did you know Linux has traffic control
capabilities that rival any commercial routing or quality of service
platform? I didn't until I needed to better control how my
home/business bandwidth was being used, and now I find this capability
extremely valuable. In this article, I'll cover some basic network
concepts, describe the elements of network traffic control, and provide
some examples of how to use built-in Linux tools to allow advanced control
and ensure different qualities of service (QoS) for different classes of
traffic.
To get started, let's consider a hypothetical
home/business network. Suppose you have a home network that is fairly
typical of a tech-savvy family in the digital age: broadband Internet
connection, several PCs, teenagers using chat and email and playing games
online, and maybe an http server, too. Or maybe it's a server to host
a network-capable game like FreeCiv. And, to top it off, let's say
you use Vonage or Skype or another VOIP provider for all your voice lines
over the same broadband connection. Did I mention that your wife
telecommutes? Okay, not so hypothetical -- this is what my home and business network looks like.
We have seven computers in the house with two servers accepting connections
and providing services. They all share one broadband connection for a wide
variety of purposes. As you might guess, our connection to the larger world
almost never hits 0% and quite often pegs out either up or downstream at
100%.
Now, let's say one afternoon you are in the
middle of a phone call with a potential client over VOIP, when suddenly the
service quality is reduced to an unusable state. While tracking down the
cause, you find your teenaged daughter in the other room emailing an
episode of her favorite TV show to a friend and wondering in wide-eyed
innocence why it's taking so long. To understand why uploading over a
broadband connection can kill VOIP quality, make interactive connections
useless, break VPNs, and eventually destroy download speeds, you have to
understand some basic network concepts. Armed with that knowledge, you can
take advantage of tools that are handily built in to the Linux kernel and
ensure that different traffic on your network receives different qualities
of service.
For purposes of this article, I'll assume that
you understand basic TCP/IP protocol function and the difference between
stateless protocols like UDP and connection-oriented ones like TCP. There
are many good resources on the Web if you need more information.
Packet Queuing
The basic problem described above lies in queuing. In
network parlance, it means packets are backing up and having to wait in
some device before they can be sent out over the wire. Broadband ISPs
typically provide you a lot more inbound bandwidth than outbound, because
most casual home users are data consumers and not providers, pulling more
traffic in than they send out. Packet loss in either direction can hurt
that all-important download speed, so it is fairly common for cable
modems and other broadband termination devices to have very long queues,
often configured with enough buffer space to
queue several seconds of traffic leaving your network and heading out to
the Internet. This allows for an overall speed-up for typical consumers, but is a very bad thing for interactive or
time-sensitive traffic like VPNs or VOIP.
Let's look at the case of the jumpy VOIP. Your
ear can detect auditory latency over 250 milliseconds or so, and it's
very sensitive to sounds that are jumbled out of order. VOIP technology
uses stateless UDP packets and throws away whatever takes longer than about
two-tenths of a second to arrive (excessive latency) and also discards
packets that arrive out of order (jitter). To work well, VOIP needs packets
to arrive in the order they were sent, without much delay, and in a pretty
steady stream with little variation in time between packets. It is fairly
obvious, then, why buffering even as little as half a second's worth
of data in your cable modem (which happens whenever you exceed your
128-kbps upload speed) makes your Vonage connection unusable. But why
should that kill
downloads? Isn't that buffering specifically to speed those up?
Downloading from the Internet over a TCP connection is much less
sensitive to latency and is, in fact, helped by occasional buffering,
which prevents loss from bursty traffic. But when a steady stream of
traffic exceeds your upstream bandwidth for longer than the broadband
device can queue, TCP falls apart.
The device from which you are downloading has a parameter called its
"send window": the number of bytes it can transmit before it has to stop
and wait for your end to acknowledge what it just sent.
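A quick back-of-the-envelope sketch shows how the send window interacts with latency: the sender can have at most one full window in flight per round trip, so throughput is capped at window size divided by round-trip time. The numbers below are illustrative, not measurements from any real link:

```python
# The send window caps TCP throughput at (window size / round-trip time):
# the sender must wait a full round trip for each window's acknowledgment.
def max_throughput(window_bytes, rtt_seconds):
    """Upper bound on bytes/sec for a single TCP connection."""
    return window_bytes / rtt_seconds

# 64 KB window over a healthy 50 ms round trip:
print(max_throughput(65536, 0.05))    # 1310720.0 bytes/sec, ~1.3 MB/s
# Same window when ACKs sit behind two seconds of queued upstream data:
print(max_throughput(65536, 2.05))    # ~32 KB/sec: the download crawls
```

The window never changed; only the round-trip time did, yet throughput fell by a factor of forty.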
Modern TCP algorithms are pretty good at
self-adjusting to network conditions, but sustained latency is very
difficult to adjust to unless the device's TCP software is
preconfigured for it, for example, when there is a connection that must
always traverse a high-latency path, like a satellite feed. In this case,
incoming packets are getting to you just fine, you have headroom in the
downstream direction, but when you send your acknowledgment back it gets
queued behind two seconds or more of data already waiting to go out. Or
maybe the queue is full, and your ack packet gets dropped into the bit
bucket. Maybe the queue stays full, and it winds up taking an average of
three or four re-transmissions to get your acknowledgment through, which
translates into 6 to 10 seconds before the other end knows you got what it
sent and can resend. Depending on the timeout value configured for the
device, your TCP connection can break altogether and need to be restarted.
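The "6 to 10 seconds" figure can be sanity-checked with a little arithmetic. Assuming a conventional 1-second initial retransmission timeout that doubles on every loss (the exponential backoff most TCP stacks use; the exact initial value varies by stack), three or four lost acknowledgments cost roughly 7 to 15 seconds of waiting:

```python
def time_to_deliver(retransmissions, initial_rto=1.0):
    """Total seconds spent waiting when the first `retransmissions`
    attempts are lost and each timeout doubles the previous one."""
    return sum(initial_rto * 2 ** i for i in range(retransmissions))

for n in (3, 4):
    print(n, "retransmissions ->", time_to_deliver(n), "seconds of waiting")
# 3 retransmissions -> 7.0 seconds, 4 retransmissions -> 15.0 seconds
```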
Basic Traffic Shaping
If only there were a way to prevent, or at least
control, this queuing. Well, there is, and for simple setups, you
don't even have to configure anything complicated. Most broadband
routers allow you to shape your upstream traffic, even going so far as to
give you the ability to prioritize certain source or destination ports
above other traffic. If you haven't turned on basic traffic shaping
in your Linksys or D-Link router, I suggest you do so now. It will greatly
improve your network performance overall if you set the router to send a
percent or two less than your ISP's allowed bandwidth. This will
prevent any queuing at all in the DSL or cable modem, greatly smoothing out
bumps in the road.
If you run your own Linux box as a firewall and/or
gateway device in front of your broadband connection, there is a handy
dandy little script called WonderShaper that will do the trick for you, and
it's available in most distributions' packaging systems. Just
apt-get, yum, or urpmi WonderShaper, configure /etc/wshaper.cfg,
and you can specify upstream bandwidth and prioritize certain traffic above
the rest.
In our typical broadband case where downstream
bandwidth is double or more what is available upstream, we need to tightly
control our outbound bandwidth use. Fortunately, this is what QoS and
traffic shaping (TS) are for. We have the most control over what we send
out. Controlling
inbound bandwidth is like contacting everyone that could possibly send you
mail through the post office and asking them all to send only so many
packages per week. We do have a technique called "ingress
policing" at our disposal, which takes advantage of the way the TCP
protocol works to simulate a slower link for certain types of traffic,
making the TCP algorithm back off the rate of
traffic and thereby controlling inbound bandwidth to a degree. I
won't be able to cover that here, but the documentation can be found in the References section.
Befitting the power and flexibility of the solutions
available to us, there are many ways to configure a working solution that
does what we want with our outbound traffic. The absolute simplest
configuration we could use to prioritize our traffic would be to honor the
type of service (TOS) field in the IP header. However, this is problematic for several reasons, not the least of which
is that it can be manipulated by user space programs, and any ill-behaved application can tell your network to give it highest
priority. In other words, it can't be trusted. So we throw out that
idea first thing. Of the solutions we might actually choose, the oldest and
most complex one would involve using class-based queuing (CBQ) in
combination with token bucket filters (TBF) and a fairness queuing
algorithm like stochastic fairness queuing (SFQ). Now let's look at
these tools more closely.
SFQ
SFQ prevents any particular "flow" of data
from dominating a link. Sometimes an aggressively configured TCP stack can
send lots of data at one time and "crowd out" more sensibly
configured devices. If your link is heavily utilized, this has an adverse
effect on the network. The SFQ algorithm identifies particular
conversations by address:port pairs and gives each flow round-robin access
to send, so it is an almost perfectly fair way to share bandwidth. On the
other hand, if your link isn't close to full, it doesn't do
anything.
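The flavor of SFQ's fairness can be captured in a few lines of Python. This is a conceptual sketch, not the kernel's algorithm: flows are hashed by their address:port endpoints into buckets, and the scheduler dequeues one packet from each active bucket per round:

```python
import zlib
from collections import defaultdict, deque

def flow_hash(src, sport, dst, dport, perturb=0, nbuckets=1024):
    # Changing `perturb` re-seeds the hash, so two flows that happen to
    # collide in one interval will most likely separate in the next.
    key = f"{src}:{sport}>{dst}:{dport}/{perturb}".encode()
    return zlib.crc32(key) % nbuckets

class SFQ:
    def __init__(self):
        self.buckets = defaultdict(deque)   # hash bucket -> queued packets

    def enqueue(self, pkt, flow):
        self.buckets[flow_hash(*flow)].append(pkt)

    def dequeue_round(self):
        """One round-robin pass: take one packet from each active bucket."""
        out = []
        for key in list(self.buckets):
            out.append(self.buckets[key].popleft())
            if not self.buckets[key]:
                del self.buckets[key]
        return out

q = SFQ()
for i in range(3):   # a greedy flow queues three packets at once...
    q.enqueue(f"bulk-{i}", ("10.0.0.2", 40000, "198.51.100.7", 80))
# ...while a polite flow queues just one:
q.enqueue("voip-0", ("10.0.0.3", 10000, "203.0.113.9", 10000))
# Barring a hash collision, each round sends one packet per flow,
# not three in a row from the bulk flow:
print(q.dequeue_round())
```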
TBF
TBF is a very precise way to control the rate at which
a particular type of traffic is sent. If all you need to do is slow down an
interface, this is the way to do it. It works in a very common-sense way:
think of the queue as a bucket. Tokens are dropped into the bucket at a
predetermined rate. Each token is like a hall pass, a permission slip to
send a packet. When you aren't sending, you can accumulate tokens up
to a certain number, meaning after a period of inactivity, you can send
faster while you have tokens left, but once you've used them all, you
have to wait until another token drops into the bucket. This gives you the
ability to burst for a short period while ensuring your average use
doesn't greatly exceed the desired rate and cause problems.
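The bucket metaphor translates almost directly into code. Here is a toy token-bucket sketch (illustrative only, not tc's TBF implementation; the class name and numbers are made up), where tokens are bytes of sending credit:

```python
class TokenBucket:
    def __init__(self, rate, burst):
        self.rate = rate        # tokens (bytes of credit) added per second
        self.burst = burst      # bucket capacity in tokens
        self.tokens = burst     # start full: an idle sender may burst
        self.time = 0.0

    def advance(self, now):
        """Accrue tokens for the time elapsed, capped at the bucket size."""
        self.tokens = min(self.burst,
                          self.tokens + (now - self.time) * self.rate)
        self.time = now

    def try_send(self, now, nbytes):
        """Send nbytes if enough tokens are available, else refuse."""
        self.advance(now)
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True
        return False

bucket = TokenBucket(rate=120_000, burst=10_000)  # ~120 KB/s, 10 KB burst
print(bucket.try_send(0.0, 10_000))  # True: a full bucket permits a burst
print(bucket.try_send(0.0, 1_500))   # False: bucket empty, packet waits
print(bucket.try_send(0.1, 1_500))   # True: 0.1 s of new tokens accrued
```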
CBQ
Then we come to the venerable CBQ. CBQ sets up
"classes" of traffic based on the filters you specify. It has
lots of knobs, most of which are only marginally documented, and a slew of
parameters that have to be specified just to make it work. Because of that,
even a very simple configuration immediately becomes complex with CBQ. It
has the added disadvantage that its shaping algorithm quite often
miscalculates, making it unpredictable and inaccurate. Fortunately,
this is not the only classful queuing discipline (qdisc) available to us.
HTB
Hierarchical token bucket (HTB) is another classful
qdisc specifically designed to replace CBQ. It is simpler, far more
accurate in its shaping, has fewer parameters and better documentation, and
as of version 3, performs equally well under load. If you are running a
modern kernel (late 2.4.x or 2.6), then this is probably the way you want
to go. But enough conceptual stuff; now let's code.
Err, just one problem. We have to define what we want
to do and figure out what our classes will look like before we can figure
out what our configuration script will need to be. We also have to
understand that each interface has one egress "root qdisc." By
default, this is just a plain first-in-first-out (FIFO) qdisc until you
make it something else. Each qdisc and class is assigned a handle, which
can be used by later configuration statements to refer to it. Handles
consist of two parts, a major number and a minor number, like <major>:
<minor>. It is customary to name the root qdisc '1:',
which is the same as '1:0'. Think of it like writing one point
zero. You don't need the point zero unless it's something other
than zero. The minor number of a qdisc is always 0. Classes are created
under qdiscs and traffic assigned to them with filters. The classes under a
qdisc must have the same major number and a different minor number, like 1:
1 and 1:100. A class can then contain another qdisc, which has a different
major number than its parent (e.g., 10: or 100:).
Configuration
Now let's finally decide exactly what we want
to do. For our network configuration, we want to prioritize VOIP traffic
over everything, next we want VPN traffic to have a good portion of the
bandwidth, then we want to guarantee our HTTP traffic some bandwidth,
followed by everything else. So, what we want is this:
            1:0               root qdisc (htb)
             |
            1:1               parent class (htb)
        /   |   |   \
    1:10  1:20  1:30  1:40    classes (htb)
      |     |     |     |
     10:   20:   30:   40:    qdiscs
    VOIP   VPN  HTTP   ALL
    fifo  fifo   sfq   sfq
At this point, we can start configuring. You can call
a script from the end of /etc/rc.local or get really fancy and put one in
/etc/init.d/. To configure traffic shaping, we use the "tc"
command as root (try "man tc" in the console) for all
configuration:
# tc qdisc add dev eth0 root handle 1: htb default 40
This creates our root qdisc and tells it that any
otherwise unclassified traffic gets assigned to class 1:40. Next, we create
our classes:
# tc class add dev eth0 parent 1: classid 1:1 htb rate 120kbps \
ceil 120kbps
If we have 128 kilobytes/sec of upload bandwidth (note that in tc
syntax, "kbps" means kilobytes per second; kilobits per second would be
"kbit"), 120 is a good rate to use to allow for potential cell
overhead, etc. The "rate" parameter is what this particular
class is guaranteed to have, while "ceil" is what it can never
go beyond. To ensure overall shaping, we could stop right here.
# tc class add dev eth0 parent 1:1 classid 1:10 htb rate 20kbps \
ceil 120kbps
Under our parent class, we just created our first child class and
guaranteed it 20 kbytes/sec, while allowing it to "borrow" as
much as it might need to burst.
# tc class add dev eth0 parent 1:1 classid 1:20 htb rate 50kbps \
ceil 100kbps
We gave this class a guaranteed rate of 50 kbytes/sec,
and it can also borrow up to 100 kbytes/sec, but will always leave at least
20 kbytes/sec.
# tc class add dev eth0 parent 1:1 classid 1:30 htb rate 50kbps \
ceil 100kbps
The class intended for http traffic now has 50
kbytes/sec forever its own and, like the previous class, can borrow up to
100 kbytes/sec.
# tc class add dev eth0 parent 1:1 classid 1:40 htb rate 1kbps \
ceil 100kbps
We gave our default class (which will catch that email
attachment that gave us so much trouble before) no real guarantee, but the
ability to borrow when no one else needs to send data. If all of these
classes are operating above their "rate", they will share the bandwidth
under their 100-kbyte/sec ceiling in a 50:50:1 ratio, in proportion to
their configured rates. This has a lot of implications to consider in
more complex scenarios.
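That 50:50:1 split can be checked with quick arithmetic. Under the simplifying assumption that HTB lends out contested bandwidth strictly in proportion to each class's configured rate (the real algorithm also involves quantums and per-class ceilings), the three saturated classes divide the bandwidth VOIP leaves behind like this:

```python
# Hypothetical helper: splits `available` kbytes/sec among classes in
# proportion to their configured HTB rates (a simplification of HTB's
# actual lending behavior).
def share(rates, available):
    total = sum(rates.values())
    return {name: available * r / total for name, r in rates.items()}

split = share({"vpn": 50, "http": 50, "default": 1}, available=100)
for name, kbytes in split.items():
    print(f"{name}: {kbytes:.1f} kbytes/sec")
# vpn and http each get ~49.5; the default class limps along on ~1
```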
Note that there will always be 20 kbytes per second
available to VOIP above and beyond anything else, but right now we do
nothing to stop runaway connections within classes as all classes have the
default fifo qdisc. This is what we want for our VOIP and VPN classes,
because they will only carry one connection under most circumstances. For
the others, which might carry any number of conversations at any given
time, we need to do a little something more.
# tc qdisc add dev eth0 parent 1:30 handle 30: sfq perturb 10
We just replaced the default fifo qdisc under 1:30
with an sfq. The "perturb" parameter specifies how often to
change hashing algorithms to protect against hash collisions that might
deprive a flow of its rightful bandwidth. We've configured it to
change every 10 seconds.
# tc qdisc add dev eth0 parent 1:40 handle 40: sfq perturb 10
And we just did the same thing with our catchall
class, replacing the fifo qdisc. Now all flows will share equally in
whatever bandwidth this class has at any given moment. But we still
haven't configured our filters to actually assign traffic.
So, let's say our VOIP is configured for sip on
udp port 5060 and rtp on udp port 10000. Because multiple "match"
clauses in a single u32 filter are ANDed together, the two ports need
separate filters, both pointing at class 1:10:
# TC='tc filter add dev eth0 protocol ip parent 1:0 prio 1 u32'
# $TC match ip dport 5060 0xffff flowid 1:10
# $TC match ip dport 10000 0xffff flowid 1:10
Our VPN connects to port 1072:
# $TC match ip dport 1072 0xffff flowid 1:20
And as we designed, http and https each get a filter sending them to the third class:
# $TC match ip sport 80 0xffff flowid 1:30
# $TC match ip sport 443 0xffff flowid 1:30
We don't have to configure a filter for 1:40
because it is the default. It is our "catchall" class.
To look at what we have configured and get statistics,
we can use "tc -s qdisc ls dev eth0".
Conclusion
I've shown a very simple example of what is
possible using Linux's advanced traffic control features. Classes and
qdiscs can be combined and recombined with infinite variety. This short
article simply doesn't allow space to examine the subject in more
depth, but you should now have the knowledge to begin configuring simple
traffic control on your own networks and understand the concepts well
enough to better understand the implications of design choices. To study
the subject in more depth, check out the documents in the References
section, then go forth and multiply your classes and qdiscs!
References
HTB homepage -- http://luxik.cdi.cz/~devik/qos/htb/
Linux Advanced Routing & Traffic Control -- http://lartc.org/
Linux: Advanced Networking Overview -- http://qos.ittc.ku.edu/howto/index.html
Linux DiffServ homepage -- http://www.opalsoft.net/qos/DS.htm
Linux Traffic Control HOWTO -- http://linux-ip.net/articles/Traffic-Control-HOWTO/index.html
Greg Bledsoe has spent his career designing,
implementing, and troubleshooting networks, tackling whatever technical
problems have struck his fancy, and coding up interesting bits now and
then. Now he has struck out on his own and writes articles to supplement
his income. He and his lovely wife have five home-schooled children, so he
has no spare time for hobbies but imagines if he did he'd get in
better shape. Greg can be reached at: QoS.article@gmail.com.