Message-ID: <20230220161809.t2vj6daixio7uzbw@skbuf>
Date: Mon, 20 Feb 2023 18:18:09 +0200
From: Vladimir Oltean <vladimir.oltean@....com>
To: Péter Antal <peti.antal99@...il.com>
Cc: netdev@...r.kernel.org, John Fastabend <john.fastabend@...il.com>,
Stephen Hemminger <stephen@...workplumber.org>,
Ferenc Fejes <fejes@....elte.hu>,
Ferenc Fejes <ferenc.fejes@...csson.com>,
Vinicius Costa Gomes <vinicius.gomes@...el.com>,
Péter Antal <antal.peti99@...il.com>
Subject: Re: [PATCH iproute2] man: tc-mqprio: extend prio-tc-queue mapping
with examples
On Mon, Feb 20, 2023 at 05:01:57PM +0100, Péter Antal wrote:
> Hi Vladimir,
>
> Vladimir Oltean <vladimir.oltean@....com> wrote (on Mon, Feb 20, 2023,
> 16:29):
> >
> > Hi Péter,
> >
> > On Mon, Feb 20, 2023 at 04:05:48PM +0100, Péter Antal wrote:
> > > The current mqprio manual is not detailed about queue mapping
> > > and priorities, this patch adds some examples to it.
> > >
> > > Suggested-by: Ferenc Fejes <fejes@....elte.hu>
> > > Signed-off-by: Péter Antal <peti.antal99@...il.com>
> > > ---
> >
> > I think it's great that you are doing this. However, with all due respect,
> > this conflicts with the man page restructuring I am already doing for the
> > frame preemption work. Do you mind if I fix up some things and I pick your
> > patch up, and submit it as part of my series? I have some comments below.
>
> That's all right, thank you for doing this, just please carry my
> signoff as co-developer if possible.
Absolutely, this is implied.
> I agree with most of your suggestions.
I've applied your changes on top of mine. Can you and Ferenc please
review the end result? I'll take a small break of a couple of hours,
and continue working on this when I come back.
$ cat man/man8/tc-mqprio.8
.TH MQPRIO 8 "24 Sept 2013" "iproute2" "Linux"
.SH NAME
MQPRIO \- Multiqueue Priority Qdisc (Offloaded Hardware QOS)
.SH SYNOPSIS
.B tc qdisc ... dev
dev (
.B parent
classid | root) [
.B handle
major: ]
.B mqprio
.ti +8
[
.B num_tc
tcs ] [
.B map
P0 P1 P2... ] [
.B queues
count1@offset1 count2@offset2 ... ]
.ti +8
[
.B hw
1|0 ] [
.B mode
dcb|channel ] [
.B shaper
dcb|bw_rlimit ]
.ti +8
[
.B min_rate
min_rate1 min_rate2 ... ] [
.B max_rate
max_rate1 max_rate2 ... ]
.ti +8
[
.B fp
FP0 FP1 FP2 ... ]
.SH DESCRIPTION
The MQPRIO qdisc is a simple queuing discipline that allows mapping
traffic flows to hardware queue ranges using priorities and a configurable
priority to traffic class mapping. A traffic class in this context is
a set of contiguous qdisc classes which map 1:1 to a set of hardware
exposed queues.
By default the qdisc allocates a pfifo qdisc (packet limited first in, first
out queue) per TX queue exposed by the lower layer device. Other queuing
disciplines may be added subsequently. Packets are enqueued using the
.B map
parameter and hashed across the indicated queues in the
.B offset
and
.B count.
By default these parameters are configured by the hardware
driver to match the hardware QOS structures.
.B Channel
mode supports full offload of the mqprio options, the traffic classes, the queue
configurations and QOS attributes to the hardware. Enabled hardware can provide
hardware QOS with the ability to steer traffic flows to designated traffic
classes provided by this qdisc. Hardware based QOS is configured using the
.B shaper
parameter.
.B bw_rlimit
with minimum and maximum bandwidth rates can be used for setting
transmission rates on each traffic class. Also further qdiscs may be added
to the classes of MQPRIO to create more complex configurations.
.SH ALGORITHM
On creation with 'tc qdisc add', eight traffic classes are created, mapping
priorities 0..7 to traffic classes 0..7 and priorities 8..15 as listed in the
default 'map' table under QDISC PARAMETERS. This requires base driver support,
and the creation will
fail on devices that do not support hardware QOS schemes.
These defaults can be overridden using the qdisc parameters. Providing
the 'hw 0' flag allows software to run without hardware coordination.
If hardware coordination is being used and arguments are provided that
the hardware cannot support, an error is returned. For many users,
hardware defaults should work reasonably well.
As one specific example, numerous Ethernet cards support the 802.1Q
link strict priority transmission selection algorithm (TSA). MQPRIO
enabled hardware in conjunction with the classification methods below
can provide hardware offloaded support for this TSA.
.SH CLASSIFICATION
Multiple methods are available to set the SKB priority, which MQPRIO
uses to select the traffic class in which to enqueue the packet.
.TP
From user space
A process with sufficient privileges can encode the destination class
directly with SO_PRIORITY, see
.BR socket(7).
.TP
with iptables/nftables
An iptables/nftables rule can be created to match traffic flows and
set the priority, see
.BR iptables(8).
.TP
with net_prio cgroups
The net_prio cgroup can be used to set the priority of all sockets
belonging to an application. See kernel and cgroup documentation for details.
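For the user space method listed above, a minimal sketch of setting the priority on a socket might look like the following. It assumes a Linux host, where Python exposes the SO_PRIORITY constant; the helper name and the priority value are hypothetical and chosen only for illustration.

```python
import socket

# Hypothetical priority value, chosen only for illustration.
PRIORITY = 5


def open_socket_with_priority(priority: int) -> socket.socket:
    """Create a UDP socket whose outgoing packets carry the given priority."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # SO_PRIORITY is Linux-specific. Per socket(7), values outside 0..6
    # require the CAP_NET_ADMIN capability.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_PRIORITY, priority)
    return sock


if __name__ == "__main__":
    sock = open_socket_with_priority(PRIORITY)
    # Read the option back to confirm the kernel accepted it.
    print(sock.getsockopt(socket.SOL_SOCKET, socket.SO_PRIORITY))
    sock.close()
```

Packets sent through such a socket would then be classified by the 'map' argument of the qdisc attached to the egress device.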
.SH QDISC PARAMETERS
.TP
num_tc
Number of traffic classes to use. Up to 16 classes supported.
There cannot be more traffic classes than TX queues.
.TP
map
The priority to traffic class map. Maps priorities 0..15 to a specified
traffic class. The default value for this argument is:
┌────┬────┐
│Prio│ tc │
├────┼────┤
│  0 │  0 │
│  1 │  1 │
│  2 │  2 │
│  3 │  3 │
│  4 │  4 │
│  5 │  5 │
│  6 │  6 │
│  7 │  7 │
│  8 │  0 │
│  9 │  1 │
│ 10 │  1 │
│ 11 │  1 │
│ 12 │  3 │
│ 13 │  3 │
│ 14 │  3 │
│ 15 │  3 │
└────┴────┘
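The default map above can be read as a plain 16-entry lookup table. The sketch below is illustrative only: the function name is hypothetical, and masking the priority to its low 4 bits is an assumption about how priorities above 15 are handled, not something this page states.

```python
# Default priority-to-TC map, copied from the table above.
DEFAULT_MAP = [0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 1, 1, 3, 3, 3, 3]


def prio_to_tc(priority: int, prio_tc_map=DEFAULT_MAP) -> int:
    """Return the traffic class for a packet priority."""
    # Assumption: only the low 4 bits of the priority index the map,
    # so priority 16 is treated like priority 0.
    return prio_tc_map[priority & 0xF]


if __name__ == "__main__":
    for prio in (0, 7, 9, 12):
        print(prio, "->", prio_to_tc(prio))
```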
.TP
queues
Provide count and offset of queue range for each traffic class. In the
format,
.B count@offset.
Without hardware coordination, queue ranges for each traffic class cannot
overlap and must be a contiguous range of queues. With hardware coordination,
the device driver may apply a different queue configuration than requested,
and the requested queue configuration may overlap (but the one which is applied
may not). The default value for this argument is:
┌────┬───────┬────────┐
│ tc │ count │ offset │
├────┼───────┼────────┤
│  0 │     0 │      0 │
│  1 │     0 │      0 │
│  2 │     0 │      0 │
│  3 │     0 │      0 │
│  4 │     0 │      0 │
│  5 │     0 │      0 │
│  6 │     0 │      0 │
│  7 │     0 │      0 │
│  8 │     0 │      0 │
│  9 │     0 │      0 │
│ 10 │     0 │      0 │
│ 11 │     0 │      0 │
│ 12 │     0 │      0 │
│ 13 │     0 │      0 │
│ 14 │     0 │      0 │
│ 15 │     0 │      0 │
└────┴───────┴────────┘
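The rule that queue ranges must not overlap without hardware coordination can be checked before invoking tc. The sketch below is an illustration of that rule only, not the validation the kernel performs; both helper names are hypothetical.

```python
def parse_queues(specs):
    """Turn ['1@0', '2@1'] into [(count, offset), ...], one pair per TC."""
    ranges = []
    for spec in specs:
        count, offset = spec.split("@")
        ranges.append((int(count), int(offset)))
    return ranges


def ranges_overlap(ranges):
    """Return True if any two traffic classes share a TX queue."""
    seen = set()
    for count, offset in ranges:
        # Each range is contiguous by construction: offset .. offset+count-1.
        queues = set(range(offset, offset + count))
        if seen & queues:
            return True
        seen |= queues
    return False


if __name__ == "__main__":
    print(ranges_overlap(parse_queues("1@0 1@1 2@2".split())))
    print(ranges_overlap(parse_queues("2@0 1@1".split())))
```

The second call describes two classes that both claim queue 1, which is the kind of request the text above says is rejected when 'hw 0' is used.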
.TP
hw
Set to
.B 1
to support hardware offload. Set to
.B 0
to configure user specified values in software only.
Without hardware coordination, the device driver is not notified of the number
of traffic classes and their mapping to TXQs. The device is not expected to
prioritize between traffic classes without hardware coordination.
The default value of this parameter is
.B 1.
.TP
mode
Set to
.B channel
for full use of the mqprio options. Use
.B dcb
to offload only TC values and use hardware QOS defaults. Supported with 'hw'
set to 1 only.
.TP
shaper
Use
.B bw_rlimit
to set bandwidth rate limits for a traffic class. Use
.B dcb
for hardware QOS defaults. Supported with 'hw' set to 1 only.
.TP
min_rate
Minimum value of bandwidth rate limit for a traffic class. Supported only when
the
.B 'shaper'
argument is set to
.B 'bw_rlimit'.
.TP
max_rate
Maximum value of bandwidth rate limit for a traffic class. Supported only when
the
.B 'shaper'
argument is set to
.B 'bw_rlimit'.
.TP
fp
Selects whether traffic classes are express (deliver packets via the eMAC) or
preemptible (deliver packets via the pMAC), according to IEEE 802.1Q-2018
clause 6.7.2 Frame preemption. Takes the form of an array (one element per
traffic class) with values being
.B 'E'
(for express) or
.B 'P'
(for preemptible).
Multiple priorities which map to the same traffic class, as well as multiple
TXQs which map to the same traffic class, must have the same FP attributes.
To interpret the FP as an attribute per priority, the
.B 'map'
argument can be used for translation. To interpret FP as an attribute per TXQ,
the
.B 'queues'
argument can be used for translation.
Traffic classes are express by default. The argument is supported only with
.B 'hw'
set to 1. Preemptible traffic classes are accepted only if the device has a MAC
Merge layer configurable through
.BR ethtool(8).
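The translation of FP from a per traffic class attribute to a per priority or per TXQ attribute, as described above, can be sketched as follows. The function names and the sample FP values are hypothetical; the map and queue layout are borrowed from a 4 traffic class setup purely for illustration.

```python
def fp_per_priority(fp_per_tc, prio_tc_map):
    """Expand per-TC FP values into one value per priority via 'map'."""
    return [fp_per_tc[tc] for tc in prio_tc_map]


def fp_per_txq(fp_per_tc, queues):
    """Expand per-TC FP values into one value per TXQ via 'queues'.

    queues is a list of (count, offset) pairs, one per traffic class.
    """
    result = {}
    for tc, (count, offset) in enumerate(queues):
        for txq in range(offset, offset + count):
            result[txq] = fp_per_tc[tc]
    return result


if __name__ == "__main__":
    # Hypothetical configuration: TC0 and TC1 express, TC2 and TC3 preemptible.
    fp = ["E", "E", "P", "P"]
    prio_tc_map = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
    queues = [(1, 0), (1, 1), (1, 2), (1, 3)]
    print(fp_per_priority(fp, prio_tc_map))
    print(fp_per_txq(fp, queues))
```

Because the FP value is looked up through the traffic class, all priorities and all TXQs that share a class necessarily end up with the same attribute.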
.SH SEE ALSO
.BR ethtool(8)
.SH EXAMPLE
The following example shows how to attach priorities to 4 traffic classes
('num_tc 4'), and how to pair these traffic classes with 4 hardware queues,
with hardware coordination ('hw 1'), according to the following configuration:
┌────┬────┬───────┐
│Prio│ tc │ queue │
├────┼────┼───────┤
│  0 │  0 │     0 │
│  1 │  0 │     0 │
│  2 │  0 │     0 │
│  3 │  0 │     0 │
│  4 │  1 │     1 │
│  5 │  1 │     1 │
│  6 │  1 │     1 │
│  7 │  1 │     1 │
│  8 │  2 │     2 │
│  9 │  2 │     2 │
│ 10 │  2 │     2 │
│ 11 │  2 │     2 │
│ 12 │  3 │     3 │
│ 13 │  3 │     3 │
│ 14 │  3 │     3 │
│ 15 │  3 │     3 │
└────┴────┴───────┘
Traffic class 0 (TC0) is mapped to hardware queue 0 (TXQ0), TC1 is mapped to
TXQ1, TC2 is mapped to TXQ2, and TC3 to TXQ3.
.EX
# tc qdisc add dev eth0 root mqprio \\
num_tc 4 \\
map 0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3 \\
queues 1@0 1@1 1@2 1@3 \\
hw 1
.EE
The following example shows how to attach priorities to 3 traffic classes
('num_tc 3'), and how to pair these traffic classes with 4 queues, without
hardware coordination ('hw 0'), according to the following configuration:
┌────┬────┬────────┐
│Prio│ tc │ queue  │
├────┼────┼────────┤
│  0 │  0 │      0 │
│  1 │  0 │      0 │
│  2 │  0 │      0 │
│  3 │  0 │      0 │
│  4 │  1 │      1 │
│  5 │  1 │      1 │
│  6 │  1 │      1 │
│  7 │  1 │      1 │
│  8 │  2 │ 2 or 3 │
│  9 │  2 │ 2 or 3 │
│ 10 │  2 │ 2 or 3 │
│ 11 │  2 │ 2 or 3 │
│ 12 │  2 │ 2 or 3 │
│ 13 │  2 │ 2 or 3 │
│ 14 │  2 │ 2 or 3 │
│ 15 │  2 │ 2 or 3 │
└────┴────┴────────┘
TC0 is mapped to hardware TXQ0, TC1 to TXQ1, and TC2 is mapped to TXQ2 and
TXQ3, where the queue selection between these two queues is arbitrary.
.EX
# tc qdisc add dev eth0 root mqprio \\
num_tc 3 \\
map 0 0 0 0 1 1 1 1 2 2 2 2 2 2 2 2 \\
queues 1@0 1@1 2@2 \\
hw 0
.EE
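To make the "2 or 3" entries concrete, the sketch below walks one packet through the 'hw 0' configuration above: priority to traffic class via 'map', then a queue inside the class's range. The flow_hash parameter and the modulo step are assumptions standing in for whatever hash the stack actually uses; they only illustrate that the choice within a range does not depend on priority.

```python
# 'map' and 'queues' from the 3 traffic class example above.
PRIO_TC_MAP = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2]
QUEUES = [(1, 0), (1, 1), (2, 2)]  # (count, offset) per traffic class


def pick_txq(priority: int, flow_hash: int) -> int:
    """Return the TX queue a packet would land on in this sketch."""
    # Assumption: the priority is masked to 4 bits before the lookup.
    tc = PRIO_TC_MAP[priority & 0xF]
    count, offset = QUEUES[tc]
    # Assumption: a per-flow hash spreads packets across the range.
    return offset + flow_hash % count


if __name__ == "__main__":
    for prio, flow_hash in ((0, 12345), (5, 12345), (8, 0), (8, 1)):
        print(prio, flow_hash, "->", pick_txq(prio, flow_hash))
```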
In the following example, there are 8 hardware queues mapped to 5 traffic
classes according to the configuration below:
        ┌───────┐
 tc0────┤Queue 0│◄────1@0
        ├───────┤
      ┌─┤Queue 1│◄────2@1
 tc1──┤ ├───────┤
      └─┤Queue 2│
        ├───────┤
 tc2────┤Queue 3│◄────1@3
        ├───────┤
 tc3────┤Queue 4│◄────1@4
        ├───────┤
      ┌─┤Queue 5│◄────3@5
      │ ├───────┤
 tc4──┼─┤Queue 6│
      │ ├───────┤
      └─┤Queue 7│
        └───────┘
.EX
# tc qdisc add dev eth0 root mqprio \\
num_tc 5 \\
map 0 0 0 1 1 1 1 2 2 3 3 4 4 4 4 4 \\
queues 1@0 2@1 1@3 1@4 3@5
.EE
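The 'map' and 'queues' arguments in the command above can also be derived from a higher-level description: which priorities belong to each traffic class, and how many queues each class gets. The helper below is a hypothetical convenience, not part of tc; it assigns offsets back to back so the resulting ranges are contiguous and do not overlap.

```python
def build_mqprio_args(prios_per_tc, queue_counts):
    """Build the 'num_tc ... map ... queues ...' argument string."""
    # Priorities not listed for any class fall back to traffic class 0.
    prio_tc_map = [0] * 16
    for tc, prios in enumerate(prios_per_tc):
        for prio in prios:
            prio_tc_map[prio] = tc
    # Lay the queue ranges out one after another, starting at queue 0.
    specs = []
    offset = 0
    for count in queue_counts:
        specs.append(f"{count}@{offset}")
        offset += count
    return "num_tc {} map {} queues {}".format(
        len(queue_counts),
        " ".join(str(tc) for tc in prio_tc_map),
        " ".join(specs),
    )


if __name__ == "__main__":
    prios = [range(0, 3), range(3, 7), range(7, 9), range(9, 11), range(11, 16)]
    print(build_mqprio_args(prios, [1, 2, 1, 1, 3]))
```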
.SH AUTHORS
John Fastabend, <john.r.fastabend@...el.com>