.TH "TC\-HFSC" 7 "31 October 2011" iproute2 Linux
.SH "NAME"
tc\-hfsc \- Hierarchical Fair Service Curve
.
.SH "HISTORY & INTRODUCTION"
.
HFSC (Hierarchical Fair Service Curve) is a network packet scheduling algorithm
that was first presented at SIGCOMM'97. It was developed as a part of ALTQ
(ALTernative Queuing) on NetBSD, quickly found its way to other BSD systems,
and a few years later became part of the Linux kernel. Still, it's not the most
popular scheduling algorithm \- especially compared to HTB \- and it's not
well documented for the end user. This introduction aims to explain how HFSC
works without using too much math (although some math will be inevitable).

In short, HFSC aims to:
.
.RS 4
.IP \fB1)\fR 4
guarantee precise bandwidth and delay allocation for all leaf classes (realtime
criterion)
.IP \fB2)\fR
allocate excess bandwidth fairly as specified by class hierarchy (linkshare &
upperlimit criterion)
.IP \fB3)\fR
minimize any discrepancy between the service curve and the actual amount of
service provided during linksharing
.RE
.PP
.
The main "selling" point of HFSC is feature \fB(1)\fR, which is achieved by
using nonlinear service curves (more on what those actually are later). This
is particularly useful for VoIP or games, where not only a guarantee of
consistent bandwidth is important, but also limiting the initial delay of a
data stream. Note that it matters only for leaf classes (where the actual
queues are) \- thus class hierarchy is ignored in the realtime case.

Feature \fB(2)\fR is, well, obvious \- any algorithm featuring class hierarchy
(such as HTB or CBQ) strives to achieve that. HFSC does that well, although
you might end up with unusual situations if you define service curves
carelessly \- see section CORNER CASES for examples.

Feature \fB(3)\fR is mentioned due to the nature of the problem. There may be
situations where it's either not possible to guarantee service of all curves at
the same time, and/or it's impossible to do so fairly. Both will be explained
later. Note that this is mainly related to interior (aka aggregate) classes, as
the leaves are already handled by \fB(1)\fR. Still, it's perfectly possible to
create a leaf class without realtime service, and in such a case the caveats
will naturally extend to leaf classes as well.
.SH ABBREVIATIONS
For the remaining part of the document, we'll use the following shortcuts:
.nf
.RS 4

RT \- realtime
LS \- linkshare
UL \- upperlimit
SC \- service curve
.RE
.fi
.
.SH "BASICS OF HFSC"
.
To understand how HFSC works, we must first introduce a service curve.
In general, it's a nondecreasing function of time that returns the amount of
service (the amount of data a class is allowed or allocated to transfer) by
some specific point in time. Its purpose should be intuitively obvious: if a
class was allowed to transfer not less than the amount specified by its
service curve, then the service curve is not violated.
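.PP
As a rough illustration of the definition above, a two\-piece linear service
curve can be sketched in Python. The helper name service_curve and the units
(time in ms, slopes in kbit/ms, which is numerically the same as Mbit/s) are
assumptions made for this sketch only; they are not part of HFSC or tc.
.nf
```python
def service_curve(m1, d, m2):
    """Return S(t): slope m1 for the first d time units, slope m2 afterwards."""
    def S(t):
        if t <= 0:
            return 0
        if t <= d:
            return m1 * t
        return m1 * d + m2 * (t - d)
    return S

# Assumed units: time in ms, slopes in kbit/ms (numerically Mbit/s).
concave = service_curve(7, 100, 2)   # 7Mbit for 100ms, then 2Mbit
linear = service_curve(5, 0, 5)      # plain 5Mbit, no first segment

print(concave(50), concave(100), concave(200))   # 350 700 900
print(linear(200))                               # 1000
```
.fi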
.PP
Still, we need a more elaborate criterion than just the above (although in
the most generic case it can be reduced to it). The criterion has to take two
things into account:
.
.RS 4
.IP \(bu 4
idling periods
.IP \(bu
the ability to "look back": if the service curve appears violated during the
current active period, it may turn out not to be once we count excess
bandwidth received during earlier active period(s)
.RE
.PP
Let's define the criterion as follows:
.RS 4
.nf
.IP "\fB(1)\fR" 4
For each t1, there must exist t0 in set B, so S(t1\-t0)\~<=\~w(t0,t1)
.fi
.RE
.
.PP
Here 'w' denotes the amount of service received during the time period between
t0 and t1. B is the set of all times at which a session becomes active after an
idling period (further denoted as 'becoming backlogged'). For a clearer
picture, imagine two situations:
.
.RS 4
.IP \fBa)\fR 4
our session was active during two periods, with a small time gap between them
.IP \fBb)\fR
as in (a), but with a larger gap
.RE
.
.PP
Consider \fB(a)\fR: if the service received during both periods meets
\fB(1)\fR, then all is well. But what if it doesn't do so during the 2nd
period? If the amount of service received during the 1st period is larger
than the service curve, then it might compensate for smaller service during
the 2nd period \fIand\fR the gap \- if the gap is small enough.

If the gap is larger \fB(b)\fR \- then it's less likely to happen (unless the
excess bandwidth allocated during the 1st part was really large). Still, the
larger the gap \- the less interesting is what happened in the past (e.g. 10
minutes ago) \- what matters is the current traffic that just started.
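.PP
The two situations can be checked numerically against criterion (1). The
sketch below is only an illustration: the helper names, the linear 1Mbit curve
and the traffic pattern (2Mbit received for 2 seconds, then 0.5Mbit after the
gap) are all made\-up assumptions, with time in seconds and service in Mbit.
.nf
```python
def cumulative_service(periods):
    """periods: list of (start, end, rate); returns W(t), total service by time t."""
    def W(t):
        return sum((min(t, end) - start) * rate
                   for start, end, rate in periods if t > start)
    return W

def criterion_holds(S, W, B, t1):
    """Criterion (1): some t0 in B (t0 <= t1) has S(t1 - t0) <= w(t0, t1)."""
    return any(S(t1 - t0) <= W(t1) - W(t0) for t0 in B if t0 <= t1)

S = lambda t: 1.0 * t                                         # linear 1Mbit curve
small_gap = cumulative_service([(0, 2, 2.0), (3, 4, 0.5)])    # situation (a)
large_gap = cumulative_service([(0, 2, 2.0), (10, 11, 0.5)])  # situation (b)

print(criterion_holds(S, small_gap, [0, 3], 4))     # True
print(criterion_holds(S, large_gap, [0, 10], 11))   # False
```
.fi
.PP
In (a) the excess from the 1st period covers both the gap and the slow 2nd
period when looking back to t0=0; in (b) it no longer does.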
.PP
From HFSC's perspective, it is more interesting to answer the following
question: when should we start transferring packets, so that the service curve
of a class is not violated? Or, rephrasing it: how much service X() should a
session have received by time t, so that the service curve is not violated?
Function X(), defined as below, is the basic building block of HFSC, used in
the eligible, deadline, virtual\-time and fit\-time curves. Of course, X() is
based on equation \fB(1)\fR and is defined recursively:

.RS 4
.IP \(bu 4
At the beginning of the 1st backlogged period, X() is initialized to the
generic service curve assigned to the class
.IP \(bu
At any subsequent backlogged period, X() is:
.nf
\fBmin(X() from previous period ; w(t0)+S(t\-t0) for t>=t0),\fR
.fi
\&... where t0 denotes the beginning of the current backlogged period.
.RE
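.PP
The recursive definition above can be sketched as follows. This is a minimal
illustration, not the kernel's implementation: next_period and service_curve
are hypothetical helpers, time is in ms and service in kbit, and the class is
assumed to have received exactly 700 kbit before idling until t0=150.
.nf
```python
def service_curve(m1, d, m2):
    """Two-piece linear curve: slope m1 for the first d time units, then m2."""
    def S(t):
        if t <= 0:
            return 0
        return m1 * t if t <= d else m1 * d + m2 * (t - d)
    return S

def next_period(X_prev, S, w_t0, t0):
    """X() for a backlogged period starting at t0, after w_t0 service so far."""
    def X(t):
        if t < t0:
            raise ValueError("X() is only defined from the period's beginning")
        return min(X_prev(t), w_t0 + S(t - t0))
    return X

S = service_curve(7, 100, 2)     # concave: 7Mbit for 100ms, then 2Mbit
X1 = S                           # 1st backlogged period starts at t=0
X2 = next_period(X1, S, w_t0=700, t0=150)

print(X2(160))   # 770: the 2nd argument, w(t0)+S(t-t0), is the smaller one
print(X2(200))   # 900: X() from the previous period is the smaller one
```
.fi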
.
.PP
HFSC uses either linear or two\-piece linear service curves. In the case of
linear or two\-piece linear convex functions (first slope < second slope),
min() in X's definition reduces to the 2nd argument. But in the case of
two\-piece concave functions, the 1st argument might quickly become the
smaller one for some t>=t0. Note that for a given backlogged period, X() is
defined only from that period's beginning. We also define X^(\-1)(w) as the
smallest t>=t0 for which X(t)\~=\~w. We have to define it this way, as X() is
usually not an injection.
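.PP
For a single two\-piece linear curve anchored at the origin (t0=0, no earlier
service), the inverse can be sketched as below. The function name inverse and
its argument layout are assumptions for illustration; the flat first segment
in the last example shows why "smallest t" has to be part of the definition.
.nf
```python
def inverse(m1, d, m2, w):
    """Smallest t >= 0 with S(t) = w, for slope m1 during d, then slope m2."""
    if w <= 0:
        return 0.0
    knee = m1 * d                       # service accumulated by the end of segment 1
    if w <= knee:
        return w / m1
    if m2 == 0:
        raise ValueError("the curve never reaches w")
    return d + (w - knee) / m2

print(inverse(7, 100, 2, 350))   # 50.0
print(inverse(7, 100, 2, 900))   # 200.0
print(inverse(0, 100, 5, 0))     # 0.0, although S(t) = 0 on the whole [0, 100]
print(inverse(0, 100, 5, 50))    # 110.0
```
.fi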
.PP
The above generic X() can be one of the following:
.
.RS 4
.IP "E()" 4
In realtime criterion, selects packets eligible for sending. If none are
eligible, HFSC will use linkshare criterion. Eligible time \&'et' is calculated
with reference to packets' heads (et\~=\~E^(\-1)(w)). It's based on RT
service curve, \fIbut in case of a convex curve, uses its 2nd slope only.\fR
.IP "D()"
In realtime criterion, selects the most suitable packet from the ones chosen
by E(). Deadline time \&'dt' corresponds to packets' tails
(dt\~=\~D^(\-1)(w+l), where \&'l' is packet's length). Based on RT service
curve.
.IP "V()"
In linkshare criterion, arbitrates which packet to send next. Note that V() is
a function of virtual time \- see the \fBLINKSHARING CRITERION\fR section for
details. Virtual time \&'vt' corresponds to packets' heads
(vt\~=\~V^(\-1)(w)). Based on LS service curve.
.IP "F()"
An extension to linkshare criterion, used to limit the speed at which
linkshare criterion is allowed to dequeue. Fit\-time \&'ft' corresponds to
packets' heads as well (ft\~=\~F^(\-1)(w)). Based on UL service curve.
.RE

Be sure to make a clear distinction between a session's RT, LS and UL service
curves and the above "utility" functions.
.
.SH "REALTIME CRITERION"
.
RT criterion \fIignores class hierarchy\fR and guarantees precise bandwidth and
delay allocation. We say that a packet is eligible for sending when the
current real time is later than the eligible time of the packet. From all
eligible packets, the one most suited for sending is the one with the earliest
deadline time. This sounds simple, but consider the following example:

Interface 10Mbit, two classes, both with two\-piece linear service curves:
.RS 4
.IP \(bu 4
1st class \- 2Mbit for 100ms, then 7Mbit (convex \- 1st slope < 2nd slope)
.IP \(bu
2nd class \- 7Mbit for 100ms, then 2Mbit (concave \- 1st slope > 2nd slope)
.RE
.PP
Assume for a moment that we only use D() for both finding eligible packets
and choosing the most fitting one, thus eligible time would be computed as
D^(\-1)(w) and deadline time would be computed as D^(\-1)(w+l). If the 2nd
class starts sending packets 1 second after the 1st class, it's of course
impossible to guarantee 14Mbit (7Mbit for each class), as the interface
capability is only 10Mbit. The only workaround in this scenario is to allow
the 1st class to send the packets earlier than would normally be allowed.
That's where a separate E() comes to help. Putting all the math aside (see
HFSC paper for details), E() for an RT concave service curve is just like D(),
but for an RT convex service curve it's constructed using \fIonly\fR the RT
service curve's 2nd slope (in our example 7Mbit).

The effect of such E() \- packets will be sent earlier, and at the same time
D() \fIwill\fR be updated \- so the current deadline time calculated from it
will be later. Thus, when the 2nd class starts sending packets later, both
the 1st and the 2nd class will be eligible, but the 2nd session's deadline
time will be smaller and its packets will be sent first. When the 1st class
becomes idle at some later point, the 2nd class will be able to "buffer" up
again for a later active period of the 1st class.
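.PP
A simplified sketch of this selection follows. It is not the kernel's code:
the helper names, the 12 kbit packet size and the snapshot (1st class already
sent one packet, 2nd class just became active) are assumptions, and both curves
are treated as anchored at time 0 of a common clock, with time in ms and
service in kbit.
.nf
```python
def inverse(m1, d, m2, w):
    """Smallest t (ms) at which a two-piece curve anchored at 0 reaches w kbit."""
    if w <= 0:
        return 0.0
    knee = m1 * d
    if w <= knee:
        return w / m1
    return d + (w - knee) / m2

def head_times(m1, d, m2, w, l):
    """(et, dt) for a head packet of l kbit in a class served w kbit so far."""
    dt = inverse(m1, d, m2, w + l)       # deadline: packet's tail on D()
    if m1 < m2:
        et = w / m2                      # convex: E() uses the 2nd slope only
    else:
        et = inverse(m1, d, m2, w)       # concave or linear: E() is like D()
    return et, dt

def pick_realtime(heads, now):
    """Name of the eligible class with the smallest deadline, or None."""
    eligible = {name: dt for name, (et, dt) in heads.items() if et <= now}
    return min(eligible, key=eligible.get) if eligible else None

heads = {
    "class1": head_times(2, 100, 7, w=12, l=12),   # convex, one packet sent
    "class2": head_times(7, 100, 2, w=0, l=12),    # concave, just activated
}
print(heads["class1"][0] < inverse(2, 100, 7, 12))  # True: eligible earlier than with D() alone
print(pick_realtime(heads, now=2.0))                # class2
```
.fi
.PP
If pick_realtime() returns None, nothing is eligible and the linkshare
criterion would be used instead.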
.PP
A short remark \- in a situation where the total amount of bandwidth
available on the interface is larger than the allocated total realtime parts
(imagine a 10Mbit interface, but 1Mbit/2Mbit and 2Mbit/1Mbit classes), the
speed of the interface alone could suffice to guarantee the times.

An important part of RT criterion is that, apart from updating its D() and
E(), it also updates V() used by LS criterion. Generally the RT criterion is
secondary to LS one, and used \fIonly\fR if there's a risk of violating precise
realtime requirements. Still, the "participation" in bandwidth distributed by
LS criterion is there, so V() has to be updated along the way. LS criterion can
then properly compensate for non\-ideal fair sharing situation, caused by RT
scheduling. If you use UL service curve, its F() will be updated as well (UL
service curve is an extension to LS one \- see \fBUPPERLIMIT CRITERION\fR
section).

Anyway \- careless specification of LS and RT service curves can lead to
potentially undesired situations (see CORNER CASES for examples). This wasn't
the case in HFSC paper where LS and RT service curves couldn't be specified
separately.
.SH "LINKSHARING CRITERION"
.
LS criterion's task is to distribute bandwidth according to specified class
hierarchy. Contrary to RT criterion, there are no comparisons between current
real time and virtual time \- the decision is based solely on direct comparison
of virtual times of all active subclasses \- the one with the smallest vt wins
and gets scheduled. One immediate conclusion from this fact is that absolute
values don't matter \- only ratios between them (so for example, two children
classes with simple linear 1Mbit service curves will get the same treatment
from LS criterion's perspective, as if they were 5Mbit). The other conclusion
is that in a perfectly fluid system with linear curves, all virtual times
across the whole class hierarchy would be equal.
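.PP
The "only ratios matter" conclusion can be checked with a toy model. The
sketch assumes equal\-sized packets, linear LS curves and sibling leaf classes
only; linkshare_order is a hypothetical helper, and breaking ties by class
name is an arbitrary choice made here.
.nf
```python
from fractions import Fraction

def linkshare_order(rates, packets):
    """Serve equal-sized packets, always picking the class with the smallest vt."""
    vt = {name: Fraction(0) for name in rates}
    order = []
    for _ in range(packets):
        name = min(vt, key=lambda n: (vt[n], n))   # ties broken by name
        order.append(name)
        vt[name] += Fraction(1, rates[name])       # linear curve: vt grows by size/rate
    return order

same_1 = linkshare_order({"A": 1, "B": 1}, 6)      # two 1Mbit classes
same_5 = linkshare_order({"A": 5, "B": 5}, 6)      # two 5Mbit classes
two_to_one = linkshare_order({"A": 2, "B": 1}, 9)

print(same_1 == same_5)                               # True
print(two_to_one.count("A"), two_to_one.count("B"))   # 6 3
```
.fi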
.PP
Why is V() defined in terms of virtual time (and what is it)?

Imagine an example: class A with two children \- A1 and A2, both with, let's
say, 10Mbit SCs. If A2 is idle, A1 receives all the bandwidth of A (and updates
its V() in the process). When A2 becomes active, A1's virtual time is already
\fIfar\fR later than A2's one. Considering the type of decision made by LS
criterion, A1 would become idle for a long time. We can work around this
situation by adjusting virtual time of the class becoming active \- we do that
by getting such time "up to date". HFSC uses a mean of the smallest and the
biggest virtual time of currently active children fit for sending. As it's not
real time anymore (excluding the trivial case where all classes become
active at the same time, and never become idle), it's called virtual time.
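.PP
The adjustment itself is a one\-liner. In the sketch below, activation_vt is a
hypothetical helper, and the fallback used when no sibling is active is an
assumption of this example rather than something stated in the text.
.nf
```python
def activation_vt(active_vts, fallback=0.0):
    """vt for a class becoming active: mean of the smallest and biggest active vt."""
    if not active_vts:
        return fallback          # hypothetical choice for "no active siblings"
    return (min(active_vts) + max(active_vts)) / 2

print(activation_vt([1000.0]))        # 1000.0: A2 starts level with A1, not at 0
print(activation_vt([96.0, 48.0]))    # 72.0
```
.fi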
.PP
Such an approach has its price though. The problem is analogous to what was
presented in the previous section and is caused by non\-linearity of service
curves:
.IP 1) 4
either it's impossible to guarantee service curves and satisfy fairness
during certain time periods:

.RS 4
Recall the example from RT section, slightly modified (with 3Mbit slopes
instead of 2Mbit ones):

.IP \(bu 4
1st class \- 3Mbit for 100ms, then 7Mbit (convex \- 1st slope < 2nd slope)
.IP \(bu
2nd class \- 7Mbit for 100ms, then 3Mbit (concave \- 1st slope > 2nd slope)
.PP
They sum up nicely to 10Mbit \- the interface's capacity. But if we wanted to
use only LS for guarantees and fairness \- it simply won't work. In LS context,
only V() is used for deciding which class to schedule. If the 2nd class
becomes active when the 1st one is in its second slope, the fairness will be
preserved \- ratio will be 1:1 (7Mbit:7Mbit), but LS itself is of course
unable to guarantee the absolute values themselves \- as it would have to go
beyond what the interface is capable of.
.RE
.IP 2) 4
and/or it's impossible to guarantee service curves of all classes at the same
time [fairly or not]:

.RS 4

This is similar to the above case, but a bit more subtle. We will consider two
subtrees, arbitrated by their common (root here) parent:

.nf
R (root) \- 10Mbit

A  \- 7Mbit, then 3Mbit
A1 \- 5Mbit, then 2Mbit
A2 \- 2Mbit, then 1Mbit

B  \- 3Mbit, then 7Mbit
.fi

R arbitrates between left subtree (A) and right (B). Assume that A2 and B are
constantly backlogged, and at some later point A1 becomes backlogged (when all
other classes are in their 2nd linear part).

What happens now? B (choice made by R) will \fIalways\fR get 7Mbit, as R is
(obviously) only concerned with the ratio between its direct children. Thus A
subtree gets 3Mbit, but its children would want (at the point when A1 became
backlogged) 5Mbit + 1Mbit. That's of course impossible, as they can only get
3Mbit due to interface limitation.

In the left subtree \- we have the same situation as previously (fair split
between A1 and A2, but violated guarantees), but in the whole tree \- there's
no fairness (B got 7Mbit, but A1 and A2 have to fit together in 3Mbit) and
there are no guarantees for all classes (only B got what it wanted). Even if we
violated fairness in the A subtree and set A2's service curve to 0, A1 would
still not get the required bandwidth.
.RE
.
.SH "UPPERLIMIT CRITERION"
.
UL criterion is an extension to LS one, that permits sending packets only
if current real time is later than fit\-time ('ft'). So the modified LS
criterion becomes: choose the smallest virtual time from all active children,
such that fit\-time < current real time also holds. Fit\-time is calculated
from F(), which is based on UL service curve. As you can see, its role is
somewhat similar to E() used in RT criterion. Also, for obvious reasons \- you
can't specify UL service curve without LS one.

The main purpose of the UL service curve is to limit HFSC to bandwidth
available on the upstream router (think ADSL home modem/router, and Linux
server as NAT/firewall/etc. with 100Mbit+ connection to mentioned
modem/router). Typically, it's used to create a single class directly under
root, setting a linear UL service curve to available bandwidth \- and then
creating your class structure from that class downwards. Of course, you're
free to add a UL service curve (linear or not) to any class with LS criterion.

An important part about the UL service curve is that whenever at some point in
time a class doesn't qualify for linksharing due to its fit\-time, the next
time it does qualify it will update its virtual time to the smallest virtual
time of all active children fit for linksharing. This way, one of the main
things the LS criterion tries to achieve \- equality of all virtual times
across the whole hierarchy \- is preserved (in a perfectly fluid system with
only linear curves, all virtual times would be equal).

Without that, 'vt' would lag behind other virtual times, and could cause
problems. Consider an interface with a capacity of 10Mbit, and the following
leaf classes (just in case you're skimming this text quickly \- this example
shows behavior that \f(BIdoesn't happen\fR):

.nf
A \- ls 5.0Mbit
B \- ls 2.5Mbit
C \- ls 2.5Mbit, ul 2.5Mbit
.fi

If B was idle, while A and C were constantly backlogged, A and C would normally
(as far as LS criterion is concerned) divide bandwidth in 2:1 ratio. But due
to UL service curve in place, C would get at most 2.5Mbit, and A would get the
remaining 7.5Mbit. The longer the backlogged period, the more the virtual
times of A and C would drift apart. If B became backlogged at some later point
in time, its virtual time would be set to (A's\~vt\~+\~C's\~vt)/2, thus
blocking A from sending any traffic until B's virtual time catches up with A.
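.PP
The 2.5Mbit / 7.5Mbit split quoted above can be reproduced with a small fluid
model. This is only a sketch under assumptions: linear curves, sibling leaf
classes, and a hypothetical helper share_with_upperlimit that hands out
bandwidth in LS proportion and gives whatever a capped class cannot use to the
others.
.nf
```python
def share_with_upperlimit(capacity, classes):
    """classes: name -> (ls_weight, ul_cap or None). Returns name -> rate."""
    rates = {}
    active = dict(classes)
    remaining = capacity
    while active:
        total = sum(w for w, _ in active.values())
        capped = {n for n, (w, ul) in active.items()
                  if ul is not None and remaining * w / total > ul}
        if not capped:
            for n, (w, _) in active.items():
                rates[n] = remaining * w / total
            break
        for n in capped:                      # pin capped classes to their UL
            rates[n] = active.pop(n)[1]
            remaining -= rates[n]
    return rates

with_ul = share_with_upperlimit(10.0, {"A": (5.0, None), "C": (2.5, 2.5)})
print(with_ul["A"], with_ul["C"])             # 7.5 2.5
```
.fi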
.
.SH "SEPARATE LS / RT SCs"
.
Another difference from the original HFSC paper is that RT and LS SCs can be
specified separately. Moreover, leaf classes are allowed to have just one of
them: either an RT SC or an LS SC. For interior classes, only LS SCs make
sense: any RT SC will be ignored.
.
.SH "CORNER CASES"
.
Separate service curves for LS and RT criteria can lead to certain traps
that come from "fighting" between ideal linksharing and enforced realtime
guarantees. Those situations didn't exist in the original HFSC paper, where
specifying separate LS / RT service curves was not discussed.

Consider an interface with a 10Mbit capacity, with the following leaf classes:

.nf
A \- ls 5.0Mbit, rt 8Mbit
B \- ls 2.5Mbit
C \- ls 2.5Mbit
.fi

Imagine A and C are constantly backlogged. As B is idle, A and C would divide
bandwidth in 2:1 ratio, considering LS service curve (so in theory \- 6.67 and
3.33Mbit). Alas, RT criterion takes priority, so A will get 8Mbit and LS will
be able to compensate class C for only 2Mbit \- this will cause discrepancy
between virtual times of A and C.

Assume this situation lasts for a long time with no idle periods, and
suddenly B becomes active. B's virtual time will be updated to
(A's\~vt\~+\~C's\~vt)/2, effectively landing in the middle between A's and C's
virtual time. The effect \- B, having no RT guarantees, will be punished and
will not be allowed to transfer until C's virtual time catches up.
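.PP
To get a feeling for the size of the effect, here is a back\-of\-the\-envelope
fluid model of the example. Everything in it is an assumption made for
illustration: linear curves, virtual time growing as received service divided
by the LS rate, A and C backlogged for 60 seconds before B shows up, and the
helper name fluid_vt.
.nf
```python
def fluid_vt(service, ls_rate):
    """vt of a class with a linear LS curve after receiving `service` (fluid model)."""
    return service / ls_rate

capacity, T = 10.0, 60.0                 # Mbit/s, seconds with only A and C backlogged
ls = {"A": 5.0, "B": 2.5, "C": 2.5}
rt_a = 8.0

ideal_a = capacity * ls["A"] / (ls["A"] + ls["C"])   # what LS alone would give A
rate_a = max(ideal_a, rt_a)                          # RT guarantee wins
rate_c = capacity - rate_a

vt_a = fluid_vt(rate_a * T, ls["A"])
vt_c = fluid_vt(rate_c * T, ls["C"])
vt_b = (vt_a + vt_c) / 2                             # B becomes active
wait_b = (vt_b - vt_c) * ls["C"] / rate_c            # seconds until C's vt reaches B's

print(round(ideal_a, 2), rate_a, rate_c)   # 6.67 8.0 2.0
print(vt_a, vt_c, vt_b, wait_b)            # 96.0 48.0 72.0 30.0
```
.fi
.PP
Under these assumptions B would be starved for about half as long as the
imbalance had lasted.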
.PP
If the interface had a higher capacity, for example 100Mbit, this example
would behave perfectly fine though.

Let's look a bit closer at the above example \- it "cleverly" invalidates one
of the basic things LS criterion tries to achieve \- equality of all virtual
times across class hierarchy. Leaf classes without RT service curves are
literally left to their own fate (governed by messed up virtual times).

Also, it doesn't make much sense. Class A will always be guaranteed up to
8Mbit, and this is more than any absolute bandwidth that could happen from its
LS criterion (excluding the trivial case of only A being active). If the
bandwidth taken by A is smaller than the absolute value from LS criterion, the
unused part will be automatically assigned to other active classes (as A has
idling periods in such a case). The only "advantage" is that even in case of
low bandwidth on average, bursts would be handled at the speed defined by RT
criterion. Still, if extra speed is needed (e.g. due to latency), nonlinear
service curves should be used in such a case.

In other words: the LS criterion is meaningless in the above example.

You can quickly "work around" it by making sure each leaf class has an RT
service curve assigned (thus guaranteeing all of them will get some
bandwidth), but it doesn't make it any more valid.

Keep in mind \- if you use nonlinear curves and irregularities explained above
happen \fIonly\fR in the first segment, then there's little wrong with
"overusing" RT curve a bit:

.nf
A \- ls 5.0Mbit, rt 9Mbit/30ms, then 1Mbit
B \- ls 2.5Mbit
C \- ls 2.5Mbit
.fi

Here, the vt of A will "spike" in the initial period, but then A will never get
more than 1Mbit until B & C catch up. Then everything will be back to normal.
.
Packit Service 3880ab
.SH "LINUX AND TIMER RESOLUTION"
Packit Service 3880ab
.
Packit Service 3880ab
In certain situations, the scheduler can throttle itself and setup so
Packit Service 3880ab
called watchdog to wakeup dequeue function at some time later. In case of HFSC
Packit Service 3880ab
it happens when for example no packet is eligible for scheduling, and UL
Packit Service 3880ab
service curve is used to limit the speed at which LS criterion is allowed to
Packit Service 3880ab
dequeue packets. It's called throttling, and accuracy of it is dependent on
Packit Service 3880ab
how the kernel is compiled.
Packit Service 3880ab
Packit Service 3880ab
There're 3 important options in modern kernels, as far as timers' resolution
Packit Service 3880ab
goes: \&'tickless system', \&'high resolution timer support' and \&'timer
Packit Service 3880ab
frequency'.
Packit Service 3880ab
Packit Service 3880ab
If you have \&'tickless system' enabled, then the timer interrupt will trigger
Packit Service 3880ab
as slowly as possible, but each time a scheduler throttles itself (or any
Packit Service 3880ab
other part of the kernel needs better accuracy), the rate will be increased as
Packit Service 3880ab
needed / possible. The ceiling is either \&'timer frequency' if \&'high
Packit Service 3880ab
resolution timer support' is not available or not compiled in, or it's
Packit Service 3880ab
hardware dependent and can go \fIfar\fR beyond the highest \&'timer frequency'
Packit Service 3880ab
setting available.
Packit Service 3880ab
Packit Service 3880ab
If \&'tickless system' is not enabled, the timer will trigger at a fixed rate
Packit Service 3880ab
specified by \&'timer frequency' \- regardless if high resolution timers are
Packit Service 3880ab
or aren't available.
Packit Service 3880ab
Packit Service 3880ab
This is important to keep those settings in mind, as in scenario like: no
Packit Service 3880ab
tickless, no HR timers, frequency set to 100hz \- throttling accuracy would be
Packit Service 3880ab
at 10ms. It doesn't automatically mean you would be limited to ~0.8Mbit/s
Packit Service 3880ab
(assuming packets at ~1KB) \- as long as your queues are prepared to cover for
Packit Service 3880ab
timer inaccuracy. Of course, in case of e.g. locally generated UDP traffic \-
Packit Service 3880ab
appropriate socket size is needed as well. Short example to make it more
Packit Service 3880ab
understandable (assume hardcore anti\-schedule settings \- HZ=100, no HR
Packit Service 3880ab
timers, no tickless):
.PP
.nf
tc qdisc add dev eth0 root handle 1:0 hfsc default 1
tc class add dev eth0 parent 1:0 classid 1:1 hfsc rt m2 10Mbit
.fi
.PP
Assuming packets of ~1KB size and HZ=100, one packet per timer tick averages to
~0.8Mbit/s \- anything beyond that (e.g. the above example with a specified
rate over 10x larger) will require appropriate queuing and cause bursts every
~10 ms. As you can imagine, any HFSC RT guarantees will be seriously
invalidated by that. The aforementioned example is mainly important if you deal
with old hardware \- as is particularly popular for home server chores. Even
then, you can easily set HZ=1000 and have very accurate scheduling for typical
ADSL speeds.
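.PP
The ~0.8Mbit figure can be reproduced with shell arithmetic. This is only a
rough sketch under a simplifying assumption: exactly one packet of a fixed size
leaves per timer tick. The function name max_rate_per_tick is made up for
illustration:
.PP
.nf
```shell
# Upper bound on throughput when one packet leaves per timer tick.
# Arguments: tick rate in Hz, packet size in bytes. Prints bits per second.
max_rate_per_tick() {
    echo $(( $1 * $2 * 8 ))
}

max_rate_per_tick 100 1000     # HZ=100
max_rate_per_tick 1000 1000    # HZ=1000
```
.fi
.PP
With 1000 byte packets this prints 800000 for HZ=100 and 8000000 for HZ=1000,
i.e. ~0.8Mbit/s and ~8Mbit/s.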
.PP
Anything modern (APIC or even HPET MSI based timers + \&'tickless system')
will provide enough accuracy for superb 1Gbit scheduling. For example, on one
of my cheap dual\-core AMD boards I have the following settings:
.PP
.nf
tc qdisc add dev eth0 parent root handle 1:0 hfsc default 1
tc class add dev eth0 parent 1:0 classid 1:1 hfsc rt m2 300mbit
.fi
.PP
And a simple:
.PP
.nf
nc \-u dst.host.com 54321 </dev/zero
nc \-l \-p 54321 >/dev/null
.fi
.PP
\&...will yield the following effects over a period of ~10 seconds (taken from
/proc/interrupts):
.PP
.nf
319: 42124229   0  HPET_MSI\-edge  hpet2 (before)
319: 42436214   0  HPET_MSI\-edge  hpet2 (after 10s.)
.fi
.PP
That's roughly 31000/s. Now compare it with the HZ=1000 setting. The obvious
drawback is that CPU load can be rather high when servicing that
many timer interrupts. The example with a 300Mbit RT service curve on a 1Gbit
link is particularly ugly, as it requires a lot of throttling with minuscule
delays.
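.PP
The interrupt rate quoted above follows directly from the two counter readings.
A small sketch (irq_rate is a hypothetical helper, not an existing tool):
.PP
.nf
```shell
# Average interrupt rate from two counter readings taken some seconds apart.
# Arguments: counter before, counter after, interval in seconds.
irq_rate() {
    echo $(( ($2 - $1) / $3 ))
}

irq_rate 42124229 42436214 10
```
.fi
.PP
This prints 31198, which is the "roughly 31000/s" figure.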
.PP
Also note that it's just an example showing the capabilities of current hardware.
The above example (essentially a 300Mbit TBF emulator) is pointless on an internal
interface to begin with: you will pretty much always want a regular LS service
curve there, and in such a scenario HFSC simply doesn't throttle at all.
.PP
300Mbit RT service curve (selected columns from mpstat \-P ALL 1):
.PP
.nf
10:56:43 PM  CPU  %sys     %irq   %soft   %idle
10:56:44 PM  all  20.10    6.53   34.67   37.19
10:56:44 PM    0  35.00    0.00   63.00    0.00
10:56:44 PM    1   4.95   12.87    6.93   73.27
.fi
.PP
So, in the rare case you need those speeds with only an RT service curve, or
with a UL service curve: remember the drawbacks.
.
.SH "CAVEAT: RANDOM ONLINE EXAMPLES"
.
For reasons unknown (though well guessed), many examples you can find online
love to overuse the UL criterion and stuff it in every node possible. This makes
no sense and works against what HFSC tries to do (and does pretty damn well).
Use UL where it makes sense: on the uppermost node to match the upstream
router's uplink capacity. Or in special cases, such as testing (limiting a
certain subtree to some speed), or customers that must never get more than a
certain speed. In the last case you can usually achieve the same by just using
an RT criterion without LS+UL on leaf nodes.
.PP
As for the router case \- remember it's good to differentiate between "traffic
to router" (remote console, web config, etc.) and "outgoing traffic", so for
example:
.PP
.nf
tc qdisc add dev eth0 root handle 1:0 hfsc default 0x8002
tc class add dev eth0 parent 1:0 classid 1:999 hfsc rt m2 50Mbit
tc class add dev eth0 parent 1:0 classid 1:1 hfsc ls m2 2Mbit ul m2 2Mbit
.fi
.PP
\&...so the "internet" tree goes under 1:1 and the "router itself" is 1:999.
.
.SH "LAYER2 ADAPTATION"
.
Please refer to \fBtc\-stab\fR(8)
.
.SH "SEE ALSO"
.
\fBtc\fR(8), \fBtc\-hfsc\fR(8), \fBtc\-stab\fR(8)
.PP
Please direct bug reports and patches to: <netdev@vger.kernel.org>
.
.SH "AUTHOR"
.
Manpage created by Michal Soltys (soltys@ziu.info)