.TH CBQ 8 "16 December 2001" "iproute2" "Linux"
.SH NAME
CBQ \- Class Based Queueing
.SH SYNOPSIS
.B tc qdisc ... dev
dev
.B ( parent
classid
.B | root) [ handle
major:
.B ] cbq [ allot
bytes
.B ] avpkt
bytes
.B bandwidth
rate
.B [ cell
bytes
.B ] [ ewma
log
.B ] [ mpu
bytes
.B ]

.B tc class ... dev
dev
.B parent
major:[minor]
.B [ classid
major:minor
.B ] cbq allot
bytes
.B [ bandwidth
rate
.B ] [ rate
rate
.B ] prio
priority
.B [ weight
weight
.B ] [ minburst
packets
.B ] [ maxburst
packets
.B ] [ ewma
log
.B ] [ cell
bytes
.B ] avpkt
bytes
.B [ mpu
bytes
.B ] [ bounded isolated ] [ split
handle
.B & defmap
defmap
.B ] [ estimator
interval timeconstant
.B ]

.SH DESCRIPTION
Class Based Queueing is a classful qdisc that implements a rich
linksharing hierarchy of classes. It contains shaping elements as
well as prioritizing capabilities. Shaping is performed using link
idle time calculations based on the timing of dequeue events and
underlying link bandwidth.

.SH SHAPING ALGORITHM
When shaping a 10mbit/s connection to 1mbit/s, the link will
be idle 90% of the time. If it isn't, it needs to be throttled so that it
IS idle 90% of the time.

During operation, the effective idle time is measured using an
exponentially weighted moving average (EWMA), which considers recent
packets to be exponentially more important than past ones. The Unix
load average is calculated in the same way.

The calculated idle time is subtracted from the EWMA measured one;
the resulting number is called 'avgidle'. A perfectly loaded link has
an avgidle of zero: packets arrive exactly at the calculated
interval.

An overloaded link has a negative avgidle, and if it gets too negative,
CBQ throttles and is then 'overlimit'.

Conversely, an idle link might amass a huge avgidle, which would then
allow infinite bandwidths after a few hours of silence. To prevent
this, avgidle is capped at
.BR maxidle .

When overlimit, CBQ could in theory throttle itself for exactly the
amount of time that was calculated to pass between packets, then
pass one packet, and throttle again. Due to timer resolution constraints,
this may not be feasible; see the
.B minburst
parameter below.

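As a rough illustration of the bookkeeping described above, the following
Python sketch keeps an EWMA of the measured packet interval, subtracts the
calculated interval, and caps the result at maxidle. It is not the kernel
code: the function names, the microsecond units and the sample figures are
assumptions made for the example, and the smoothing weight of 2^-ewma_log
is an assumed reading of the
.B ewma
parameter.
.nf
```python
def ewma(previous, sample, ewma_log=5):
    """Blend one new sample into a running average."""
    weight = 1.0 / (1 << ewma_log)  # assumed weight: 2 ** -ewma_log
    return (1.0 - weight) * previous + weight * sample


def avgidle(measured_ewma, calculated, maxidle):
    """Measured average minus the calculated interval, capped at maxidle."""
    return min(measured_ewma - calculated, maxidle)


# Hypothetical figures, in microseconds.
calculated = 12000.0   # interval between packets at the shaped rate
measured = calculated  # start from a perfectly loaded link (avgidle 0)
for _ in range(200):
    # Packets keep arriving every 10000 us, faster than allowed.
    measured = ewma(measured, 10000.0)

overloaded = avgidle(measured, calculated, maxidle=50000.0)
print(overloaded < 0)  # a negative avgidle marks an overloaded link
```
.fi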
.SH CLASSIFICATION
Within a single CBQ instance many classes may exist. Each of these classes
contains another qdisc, by default
.BR tc-pfifo (8).

When enqueueing a packet, CBQ starts at the root and uses various methods to
determine which class should receive the data.

In the absence of uncommon configuration options, the process is rather easy.
At each node we look for an instruction, and then go to the class the
instruction refers us to. If the class found is a barren leaf node (without
children), we enqueue the packet there. If it is not yet a leaf node, we do
the whole thing over again starting from that node.

The following actions are performed, in order, at each node we visit, until one
sends us to another node, or terminates the process.
.TP
(i)
Consult filters attached to the class. If sent to a leaf node, we are done.
Otherwise, restart.
.TP
(ii)
Consult the defmap for the priority assigned to this packet, which depends
on the TOS bits. If the class it refers to is a leaf node, we are done.
Otherwise, restart.
.TP
(iii)
Ask the defmap for instructions for the 'best effort' priority. If the class
it refers to is a leaf node, we are done. Otherwise, restart.
.TP
(iv)
If none of the above returned with an instruction, enqueue at this node.
.P
This algorithm makes sure that a packet always ends up somewhere, even while
you are busy building your configuration.

For more details, see
.BR tc-cbq-details (8).

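The walk described in steps (i) to (iv) can be sketched in Python as follows.
This is a simplified model, not the kernel implementation: the Node class, its
fields and the packet dictionary are hypothetical, filters are modelled as
plain functions, and priority 0 is assumed to be the best effort priority.
.nf
```python
BEST_EFFORT = 0  # assumed number of the best effort priority


class Node:
    def __init__(self, name):
        self.name = name
        self.children = []  # a node without children is a leaf
        self.filters = []   # functions: packet -> Node or None
        self.defmap = {}    # priority -> Node

    def is_leaf(self):
        return not self.children


def classify(root, packet):
    """Return the node at which the packet is enqueued."""
    node = root
    while True:
        target = None
        # (i) consult the filters attached to this class
        for match in node.filters:
            target = match(packet)
            if target is not None:
                break
        # (ii) consult the defmap for the packet's priority
        if target is None:
            target = node.defmap.get(packet["priority"])
        # (iii) fall back to the best effort entry of the defmap
        if target is None:
            target = node.defmap.get(BEST_EFFORT)
        # (iv) no instruction at all: enqueue at this node
        if target is None or target is node:
            return node
        if target.is_leaf():
            return target
        node = target  # not a leaf yet: restart from the new node


root, voice, bulk = Node("1:"), Node("1:10"), Node("1:20")
root.children = [voice, bulk]
root.defmap = {6: voice, BEST_EFFORT: bulk}
print(classify(root, {"priority": 6}).name)
```
.fi
With this hypothetical tree, a packet whose priority has no defmap entry of its
own falls through to the best effort entry and is enqueued at 1:20.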
.SH LINK SHARING ALGORITHM
When dequeuing for sending to the network device, CBQ decides which of its
classes will be allowed to send. It does so with a Weighted Round Robin process
in which each class with packets gets a chance to send in turn. The WRR process
starts by asking the highest priority classes (lowest numerically -
highest semantically) for packets, and will continue to do so until they
have no more data to offer, in which case the process repeats for lower
priorities.

Classes by default borrow bandwidth from their siblings. A class can be
prevented from doing so by declaring it 'bounded'. A class can also indicate
its unwillingness to lend out bandwidth by being 'isolated'.

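The order in which classes get to send can be illustrated with a small Python
model. It is only a sketch of the priority and round-robin behaviour described
above: shaping, borrowing, 'bounded' and 'isolated' are ignored, and the
dictionary keys, including the per-turn byte budget called quantum, are
hypothetical names chosen for the example.
.nf
```python
from collections import deque


def dequeue_all(classes):
    """Return (class name, packet size) pairs in the order they are sent."""
    sent = []
    # Lower prio numbers are served first.
    for prio in sorted({c["prio"] for c in classes}):
        band = [c for c in classes if c["prio"] == prio]
        # Keep cycling through this priority until it has no data left.
        while any(c["queue"] for c in band):
            for c in band:
                budget = c["quantum"]  # bytes this class may send per turn
                while c["queue"] and budget > 0:
                    size = c["queue"].popleft()
                    budget -= size
                    sent.append((c["name"], size))
    return sent


classes = [
    {"name": "A", "prio": 1, "quantum": 3000, "queue": deque([1500] * 3)},
    {"name": "B", "prio": 1, "quantum": 1500, "queue": deque([1500] * 2)},
    {"name": "C", "prio": 2, "quantum": 1500, "queue": deque([1500])},
]
order = [name for name, _ in dequeue_all(classes)]
print(order)  # ['A', 'A', 'B', 'A', 'B', 'C']
```
.fi
Class A has twice the budget of B and therefore sends two packets per turn,
while C is served only after both priority 1 classes have run dry.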
.SH QDISC
The root of a CBQ qdisc class tree has the following parameters:

.TP
parent major:minor | root
This mandatory parameter determines the place of the CBQ instance, either at the
.B root
of an interface or within an existing class.
.TP
handle major:
Like all other qdiscs, the CBQ can be assigned a handle. It should consist only
of a major number, followed by a colon. Optional, but very useful if classes
will be generated within this qdisc.
.TP
allot bytes
This allotment is the 'chunkiness' of link sharing and is used for determining packet
transmission time tables. The qdisc allot differs slightly from the class allot discussed
below. Optional. Defaults to a reasonable value, related to avpkt.
.TP
avpkt bytes
The average size of a packet is needed for calculating maxidle, and is also used
for making sure 'allot' has a safe value. Mandatory.
.TP
bandwidth rate
To determine the idle time, CBQ must know the bandwidth of your underlying
physical interface, or parent qdisc. This is a vital parameter, more about it
later. Mandatory.
.TP
cell
The cell size determines the granularity of packet transmission time calculations. Has a sensible default.
.TP
mpu
A zero sized packet may still take time to transmit. This value is the lower
cap for packet transmission time calculations - packets smaller than this value
are still deemed to have this size. Defaults to zero.
.TP
ewma log
When CBQ needs to measure the average idle time, it does so using an
Exponentially Weighted Moving Average which smooths out measurements into
a moving average. The EWMA LOG determines how much smoothing occurs. Lower
values imply greater sensitivity. Must be between 0 and 31. Defaults
to 5.
.P
A CBQ qdisc does not shape of its own accord. It only needs to know certain
parameters about the underlying link. Actual shaping is done in classes.

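How cell and mpu might enter a transmission time calculation can be sketched
as follows. This is an illustration under stated assumptions rather than the
algorithm tc uses: it assumes that packet sizes are rounded up to a whole
number of cells, the default values in the signature are placeholders, and the
function name is made up for the example.
.nf
```python
def transmission_time_us(size, bandwidth_bps, mpu=0, cell=8):
    """Estimated time in microseconds to send one packet of `size` bytes."""
    size = max(size, mpu)                    # small packets count as mpu bytes
    size = (size + cell - 1) // cell * cell  # assumed: round up to whole cells
    return size * 8 * 1000000 / bandwidth_bps


# A 1500 byte packet on a 10mbit/s link, and a 40 byte packet with mpu 64.
print(transmission_time_us(1500, 10000000))
print(transmission_time_us(40, 10000000, mpu=64))
```
.fi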
.SH CLASSES
Classes have a host of parameters to configure their operation.

.TP
parent major:minor
Place of this class within the hierarchy. If attached directly to a qdisc
and not to another class, minor can be omitted. Mandatory.
.TP
classid major:minor
Like qdiscs, classes can be named. The major number must be equal to the
major number of the qdisc to which it belongs. Optional, but needed if this
class is going to have children.
.TP
weight weight
When dequeuing to the interface, classes are tried for traffic in a
round-robin fashion. Classes with a higher configured rate will generally
have more traffic to offer during each round, so it makes sense to allow
them to dequeue more traffic. All weights under a class are normalized, so
only the ratios matter. Defaults to the configured rate, unless the priority
of this class is maximal, in which case it is set to 1.
.TP
allot bytes
Allot specifies how many bytes a qdisc can dequeue
during each round of the process. This parameter is weighted using the
renormalized class weight described above. Silently capped at a minimum of
3/2 avpkt. Mandatory.

.TP
prio priority
In the round-robin process, classes with the lowest priority field are tried
for packets first. Mandatory.

.TP
avpkt
See the QDISC section.

.TP
rate rate
Maximum rate this class and all its children combined can send at. Mandatory.

.TP
bandwidth rate
This is different from the bandwidth specified when creating a CBQ qdisc! It is
only used to determine maxidle and offtime, which are only calculated when
specifying maxburst or minburst. Mandatory if specifying maxburst or minburst.

.TP
maxburst
This number of packets is used to calculate maxidle so that when
avgidle is at maxidle, this number of average packets can be burst
before avgidle drops to 0. Set it higher to be more tolerant of
bursts. You can't set maxidle directly, only via this parameter.

.TP
minburst
As mentioned before, CBQ needs to throttle in case of
overlimit. The ideal solution is to do so for exactly the calculated
idle time, and pass 1 packet. However, Unix kernels generally have a
hard time scheduling events shorter than 10ms, so it is better to
throttle for a longer period, and then pass minburst packets in one
go, and then sleep minburst times longer.

The time to wait is called the offtime. Higher values of minburst lead
to more accurate shaping in the long term, but to bigger bursts at
millisecond timescales. Optional.

.TP
minidle
If avgidle is below 0, we are overlimit and need to wait until
avgidle is big enough to send one packet. To prevent a sudden
burst from shutting down the link for a prolonged period of time,
avgidle is reset to minidle if it gets too low.

Minidle is specified in negative microseconds, so 10 means that
avgidle is capped at -10us. Optional.

.TP
bounded
Signifies that this class will not borrow bandwidth from its siblings.
.TP
isolated
Means that this class will not lend bandwidth to its siblings.

.TP
split major:minor & defmap bitmap[/bitmap]
If consulting filters attached to a class did not give a verdict,
CBQ can also classify based on the packet's priority. There are 16
priorities available, numbered from 0 to 15.

The defmap specifies which priorities this class wants to receive,
specified as a bitmap. The Least Significant Bit corresponds to priority
zero. The
.B split
parameter tells CBQ at which class the decision must be made, which should
be a (grand)parent of the class you are adding.

As an example, 'tc class add ... classid 10:1 cbq ... split 10:0 defmap c0'
configures class 10:0 to send packets with priorities 6 and 7 to 10:1.

The complementary configuration would then
be: 'tc class add ... classid 10:2 cbq ... split 10:0 defmap 3f',
which would send all packets with priorities 0, 1, 2, 3, 4 and 5 to 10:2.
.TP
estimator interval timeconstant
CBQ can measure how much bandwidth each class is using, and tc filters
can use this measurement to classify packets. In order to determine the bandwidth
it uses a very simple estimator that measures once every
.B interval
microseconds how much traffic has passed. This again is an EWMA, for which
the time constant can be specified, also in microseconds. The
.B time constant
corresponds to the sluggishness of the measurement or, conversely, to the
sensitivity of the average to short bursts. Higher values mean less
sensitivity.
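.P
The defmap values used in the split example above can be decoded with a few
lines of Python. This sketch assumes the bitmap is written in hexadecimal, as
in that example, and looks only at the first bitmap, not at the optional
second one; the function name is hypothetical.
.nf
```python
def defmap_priorities(defmap):
    """List the priorities (0 to 15) whose bits are set in a hex defmap."""
    bitmap = int(defmap, 16)
    # Bit 0, the least significant bit, stands for priority 0.
    return [prio for prio in range(16) if bitmap & (1 << prio)]


print(defmap_priorities("c0"))  # [6, 7]
print(defmap_priorities("3f"))  # [0, 1, 2, 3, 4, 5]
```
.fi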
.SH BUGS
The actual bandwidth of the underlying link may not be known, for example
in the case of PPPoE or PPTP connections which in fact may send over a
pipe, instead of over a physical device. CBQ is quite resilient to major
errors in the configured bandwidth, probably at the cost of coarser shaping.

Default kernels rely on coarse timing information for making decisions. This
may make shaping precise in the long term, but inaccurate on second-long scales.

See
.BR tc-cbq-details (8)
for hints on how to improve this.

.SH SOURCES
.TP
o
Sally Floyd and Van Jacobson, "Link-sharing and Resource
Management Models for Packet Networks",
IEEE/ACM Transactions on Networking, Vol.3, No.4, 1995

.TP
o
Sally Floyd, "Notes on CBQ and Guaranteed Service", 1995

.TP
o
Sally Floyd, "Notes on Class-Based Queueing: Setting
Parameters", 1996

.TP
o
Sally Floyd and Michael Speer, "Experimental Results
for Class-Based Queueing", 1998, not published.

.SH SEE ALSO
.BR tc (8)

.SH AUTHOR
Alexey N. Kuznetsov, <kuznet@ms2.inr.ac.ru>. This manpage maintained by
bert hubert <ahu@ds9a.nl>