Message-ID: <f1053a1c3baa7de5d0ca246981b66076bf7102f3.camel@inf.elte.hu>
Date: Mon, 20 Feb 2023 17:28:10 +0100
From: Ferenc Fejes <fejes@....elte.hu>
To: Vladimir Oltean <vladimir.oltean@....com>,
Péter Antal <peti.antal99@...il.com>
Cc: netdev@...r.kernel.org, John Fastabend <john.fastabend@...il.com>,
Stephen Hemminger <stephen@...workplumber.org>,
Vinicius Costa Gomes <vinicius.gomes@...el.com>,
Péter Antal <antal.peti99@...il.com>
Subject: Re: [PATCH iproute2] man: tc-mqprio: extend prio-tc-queue mapping
with examples
Hi Vladimir!
On Mon, 2023-02-20 at 18:18 +0200, Vladimir Oltean wrote:
> On Mon, Feb 20, 2023 at 05:01:57PM +0100, Péter Antal wrote:
> > Hi Vladimir,
> >
> > Vladimir Oltean <vladimir.oltean@....com> wrote (on Mon, Feb 20, 2023,
> > 16:29):
> > >
> > > Hi Péter,
> > >
> > > On Mon, Feb 20, 2023 at 04:05:48PM +0100, Péter Antal wrote:
> > > > The current mqprio manual is not detailed about queue mapping
> > > > and priorities, this patch adds some examples to it.
> > > >
> > > > Suggested-by: Ferenc Fejes <fejes@....elte.hu>
> > > > Signed-off-by: Péter Antal <peti.antal99@...il.com>
> > > > ---
> > >
> > > I think it's great that you are doing this. However, with all due
> > > respect,
> > > this conflicts with the man page restructuring I am already doing
> > > for the
> > > frame preemption work. Do you mind if I fix up some things and I
> > > pick your
> > > patch up, and submit it as part of my series? I have some
> > > comments below.
> >
> > That's all right, thank you for doing this, just please carry my
> > signoff as co-developer if possible.
>
> Absolutely, this is implied.
>
> > I agree with most of your suggestions.
>
> I've applied your changes on top of mine. Can you and Ferenc please
> review the end result? I'll take a small break of a couple of hours,
> and continue working on this when I come back.
>
> $ cat man/man8/tc-mqprio.8
> .TH MQPRIO 8 "24 Sept 2013" "iproute2" "Linux"
> .SH NAME
> MQPRIO \- Multiqueue Priority Qdisc (Offloaded Hardware QOS)
> .SH SYNOPSIS
> .B tc qdisc ... dev
> dev (
> .B parent
> classid | root) [
> .B handle
> major: ]
> .B mqprio
> .ti +8
> [
> .B num_tc
> tcs ] [
> .B map
> P0 P1 P2... ] [
> .B queues
> count1@offset1 count2@offset2 ... ]
> .ti +8
> [
> .B hw
> 1|0 ] [
> .B mode
> dcb|channel ] [
> .B shaper
> dcb|bw_rlimit ]
> .ti +8
> [
> .B min_rate
> min_rate1 min_rate2 ... ] [
> .B max_rate
> max_rate1 max_rate2 ... ]
> .ti +8
> [
> .B fp
> FP0 FP1 FP2 ... ]
>
> .SH DESCRIPTION
> The MQPRIO qdisc is a simple queuing discipline that allows mapping
> traffic flows to hardware queue ranges using priorities and a
> configurable
> priority to traffic class mapping. A traffic class in this context is
> a set of contiguous qdisc classes which map 1:1 to a set of hardware
> exposed queues.
>
> By default the qdisc allocates a pfifo qdisc (packet limited first
> in, first
> out queue) per TX queue exposed by the lower layer device. Other
> queuing
> disciplines may be added subsequently. Packets are enqueued using the
> .B map
> parameter and hashed across the indicated queues in the
> .B offset
> and
> .B count.
> By default these parameters are configured by the hardware
> driver to match the hardware QOS structures.
>
> .B Channel
> mode supports full offload of the mqprio options, the traffic
> classes, the queue
> configurations and QOS attributes to the hardware. Enabled hardware
> can provide
> hardware QOS with the ability to steer traffic flows to designated
> traffic
> classes provided by this qdisc. Hardware based QOS is configured
> using the
> .B shaper
> parameter.
> .B bw_rlimit
> with minimum and maximum bandwidth rates can be used for setting
> transmission rates on each traffic class. Also further qdiscs may be
> added
> to the classes of MQPRIO to create more complex configurations.
>
> .SH ALGORITHM
> On creation with 'tc qdisc add', eight traffic classes are created
> mapping
> priorities 0..7 to traffic classes 0..7 and priorities greater than 7
> to
> traffic class 0. This requires base driver support and the creation
> will
> fail on devices that do not support hardware QOS schemes.
>
> These defaults can be overridden using the qdisc parameters.
> Providing
> the 'hw 0' flag allows software to run without hardware coordination.
>
> If hardware coordination is being used and arguments are provided that
> the hardware cannot support, an error is returned. For many users,
> hardware defaults should work reasonably well.
>
> As one specific example numerous Ethernet cards support the 802.1Q
> link strict priority transmission selection algorithm (TSA). MQPRIO
> enabled hardware in conjunction with the classification methods below
> can provide hardware offloaded support for this TSA.
>
> .SH CLASSIFICATION
> Multiple methods are available to set the SKB priority which MQPRIO
> uses to select which traffic class to enqueue the packet.
> .TP
> From user space
> A process with sufficient privileges can encode the destination class
> directly with SO_PRIORITY, see
> .BR socket(7).
> .TP
> with iptables/nftables
> An iptables/nftables rule can be created to match traffic flows and
> set the priority. See
> .BR iptables(8).
> .TP
> with net_prio cgroups
> The net_prio cgroup can be used to set the priority of all sockets
> belonging to an application. See kernel and cgroup documentation for
> details.
>
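For anyone following the thread, here is a minimal sketch of the "From user
space" method above. It assumes Linux and a Python build that exposes
socket.SO_PRIORITY; the helper name is made up for illustration. Priorities
0..6 can be set without extra privileges, higher values need CAP_NET_ADMIN.

```python
import socket


def open_udp_socket_with_priority(priority: int) -> socket.socket:
    """Return a UDP socket whose outgoing packets carry the given priority."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # SO_PRIORITY sets the SKB priority of packets sent on this socket;
    # mqprio then looks that priority up in its 'map' to pick a traffic class.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_PRIORITY, priority)
    return sock


sock = open_udp_socket_with_priority(5)
configured = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PRIORITY)
sock.close()
```

Reading the option back with getsockopt() is only a sanity check here; which
traffic class priority 5 lands in depends on the 'map' argument of the qdisc.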
> .SH QDISC PARAMETERS
> .TP
> num_tc
> Number of traffic classes to use. Up to 16 classes supported.
> There cannot be more traffic classes than TX queues.
>
> .TP
> map
> The priority to traffic class map. Maps priorities 0..15 to a
> specified
> traffic class. The default value for this argument is
>
> ┌────┬────┐
> │Prio│ tc │
> ├────┼────┤
> │ 0 │ 0 │
> │ 1 │ 1 │
> │ 2 │ 2 │
> │ 3 │ 3 │
> │ 4 │ 4 │
> │ 5 │ 5 │
> │ 6 │ 6 │
> │ 7 │ 7 │
> │ 8 │ 0 │
> │ 9 │ 1 │
> │ 10 │ 1 │
> │ 11 │ 1 │
> │ 12 │ 3 │
> │ 13 │ 3 │
> │ 14 │ 3 │
> │ 15 │ 3 │
> └────┴────┘
>
> .TP
> queues
> Provide count and offset of queue range for each traffic class. In
> the
> format,
> .B count@offset.
> Without hardware coordination, queue ranges for each traffic class cannot
> overlap and must be a contiguous range of queues. With hardware
> coordination, the device driver may apply a different queue configuration
> than requested, and the requested queue configuration may overlap (but the
> one which is applied may not). The default value for this argument is:
>
> ┌────┬───────┬────────┐
> │ tc │ count │ offset │
> ├────┼───────┼────────┤
> │ 0 │ 0 │ 0 │
> │ 1 │ 0 │ 0 │
> │ 2 │ 0 │ 0 │
> │ 3 │ 0 │ 0 │
> │ 4 │ 0 │ 0 │
> │ 5 │ 0 │ 0 │
> │ 6 │ 0 │ 0 │
> │ 7 │ 0 │ 0 │
> │ 8 │ 0 │ 0 │
> │ 9 │ 0 │ 0 │
> │ 10 │ 0 │ 0 │
> │ 11 │ 0 │ 0 │
> │ 12 │ 0 │ 0 │
> │ 13 │ 0 │ 0 │
> │ 14 │ 0 │ 0 │
> │ 15 │ 0 │ 0 │
> └────┴───────┴────────┘
>
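The non-overlap rule for 'queues' can be illustrated with a small sketch. This
is not how tc or the kernel validates the argument; parse_queues() and
ranges_overlap() are hypothetical helpers that only model the count/offset
notation described above.

```python
from typing import List, Tuple


def parse_queues(spec: str) -> List[Tuple[int, int]]:
    """Parse a 'queues' argument into (count, offset) pairs, one per TC."""
    ranges = []
    for item in spec.split():
        count, offset = item.split("@")
        ranges.append((int(count), int(offset)))
    return ranges


def ranges_overlap(ranges: List[Tuple[int, int]]) -> bool:
    """Return True if any TXQ index belongs to more than one traffic class."""
    seen = set()
    for count, offset in ranges:
        # A traffic class covers the contiguous TXQs offset..offset+count-1.
        txqs = set(range(offset, offset + count))
        if seen & txqs:
            return True
        seen |= txqs
    return False


valid = parse_queues("1@0 1@1 2@2")
overlapping = parse_queues("2@0 2@1")
```

With 'hw 0', a specification such as "2@0 2@1" would be rejected according to
the text above, because TXQ1 would belong to two traffic classes.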
> .TP
> hw
> Set to
> .B 1
> to support hardware offload. Set to
> .B 0
> to configure user specified values in software only.
> Without hardware coordination, the device driver is not notified of
> the number
> of traffic classes and their mapping to TXQs. The device is not
> expected to
> prioritize between traffic classes without hardware coordination.
> The default value of this parameter is
> .B 1.
>
> .TP
> mode
> Set to
> .B channel
> for full use of the mqprio options. Use
> .B dcb
> to offload only TC values and use hardware QOS defaults. Supported
> with 'hw'
> set to 1 only.
>
> .TP
> shaper
> Use
> .B bw_rlimit
> to set bandwidth rate limits for a traffic class. Use
> .B dcb
> for hardware QOS defaults. Supported with 'hw' set to 1 only.
>
> .TP
> min_rate
> Minimum value of bandwidth rate limit for a traffic class. Supported
> only when
> the
> .B 'shaper'
> argument is set to
> .B 'bw_rlimit'.
>
> .TP
> max_rate
> Maximum value of bandwidth rate limit for a traffic class. Supported
> only when
> the
> .B 'shaper'
> argument is set to
> .B 'bw_rlimit'.
>
> .TP
> fp
> Selects whether traffic classes are express (deliver packets via the eMAC)
> or preemptible (deliver packets via the pMAC), according to IEEE
> 802.1Q-2018 clause 6.7.2 Frame preemption. Takes the form of an array (one
> element per traffic class) with values being
> .B 'E'
> (for express) or
> .B 'P'
> (for preemptible).
>
> Multiple priorities which map to the same traffic class, as well as
> multiple
> TXQs which map to the same traffic class, must have the same FP
> attributes.
> To interpret the FP as an attribute per priority, the
> .B 'map'
> argument can be used for translation. To interpret FP as an attribute
> per TXQ,
> the
> .B 'queues'
> argument can be used for translation.
>
> Traffic classes are express by default. The argument is supported
> only with
> .B 'hw'
> set to 1. Preemptible traffic classes are accepted only if the device
> has a MAC
> Merge layer configurable through
> .BR ethtool(8).
>
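To make the "FP as an attribute per priority" translation concrete, here is a
rough sketch. The 'fp' values below are an assumed example (they do not come
from the man page), and fp_per_priority() is a hypothetical helper.

```python
from typing import List


def fp_per_priority(prio_tc_map: List[int], fp_per_tc: List[str]) -> List[str]:
    """Translate per-traffic-class FP values into per-priority FP values."""
    # Each priority inherits the FP attribute of the traffic class that
    # the 'map' argument assigns it to.
    return [fp_per_tc[tc] for tc in prio_tc_map]


# 'map' taken from the first example in the EXAMPLE section (4 traffic classes).
PRIO_TC_MAP = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
# Assumed 'fp' argument: TC0 and TC1 express, TC2 and TC3 preemptible.
FP_PER_TC = ["E", "E", "P", "P"]

per_priority = fp_per_priority(PRIO_TC_MAP, FP_PER_TC)
```

Under these assumptions priorities 0..7 are express and 8..15 are preemptible,
which also shows why priorities sharing a traffic class cannot differ in FP.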
> .SH SEE ALSO
> .BR ethtool(8)
>
> .SH EXAMPLE
>
> The following example shows how to attach priorities to 4 traffic
> classes
> ('num_tc 4'), and how to pair these traffic classes with 4 hardware
> queues,
> with hardware coordination ('hw 1'), according to the following
> configuration.
>
> ┌────┬────┬───────┐
> │Prio│ tc │ queue │
> ├────┼────┼───────┤
> │ 0 │ 0 │ 0 │
> │ 1 │ 0 │ 0 │
> │ 2 │ 0 │ 0 │
> │ 3 │ 0 │ 0 │
> │ 4 │ 1 │ 1 │
> │ 5 │ 1 │ 1 │
> │ 6 │ 1 │ 1 │
> │ 7 │ 1 │ 1 │
> │ 8 │ 2 │ 2 │
> │ 9 │ 2 │ 2 │
> │ 10 │ 2 │ 2 │
> │ 11 │ 2 │ 2 │
> │ 12 │ 3 │ 3 │
> │ 13 │ 3 │ 3 │
> │ 14 │ 3 │ 3 │
> │ 15 │ 3 │ 3 │
> └────┴────┴───────┘
>
> Traffic class 0 (TC0) is mapped to hardware queue 0 (TXQ0), TC1 is
> mapped to
> TXQ1, TC2 is mapped to TXQ2, and TC3 to TXQ3.
>
> .EX
> # tc qdisc add dev eth0 root mqprio \\
> num_tc 4 \\
> map 0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3 \\
> queues 1@0 1@1 1@2 1@3 \\
> hw 1
> .EE
>
> The following example shows how to attach priorities to 3 traffic
> classes
> ('num_tc 3'), and how to pair these traffic classes with 4 queues,
> without
> hardware coordination ('hw 0'), according to the following
> configuration:
>
> ┌────┬────┬────────┐
> │Prio│ tc │ queue │
> ├────┼────┼────────┤
> │ 0 │ 0 │ 0 │
> │ 1 │ 0 │ 0 │
> │ 2 │ 0 │ 0 │
> │ 3 │ 0 │ 0 │
> │ 4 │ 1 │ 1 │
> │ 5 │ 1 │ 1 │
> │ 6 │ 1 │ 1 │
> │ 7 │ 1 │ 1 │
> │ 8 │ 2 │ 2 or 3 │
> │ 9 │ 2 │ 2 or 3 │
> │ 10 │ 2 │ 2 or 3 │
> │ 11 │ 2 │ 2 or 3 │
> │ 12 │ 2 │ 2 or 3 │
> │ 13 │ 2 │ 2 or 3 │
> │ 14 │ 2 │ 2 or 3 │
> │ 15 │ 2 │ 2 or 3 │
> └────┴────┴────────┘
>
> TC0 is mapped to hardware TXQ0, TC1 to TXQ1, and TC2 is mapped to
> TXQ2 and
> TXQ3, where the queue selection between these two queues is
> arbitrary.
>
> .EX
> # tc qdisc add dev eth0 root mqprio \\
> num_tc 3 \\
> map 0 0 0 0 1 1 1 1 2 2 2 2 2 2 2 2 \\
> queues 1@0 1@1 2@2 \\
> hw 0
> .EE
>
> In the following example, there are 8 hardware queues mapped to 5
> traffic
> classes according to the configuration below:
>
> ┌───────┐
> tc0────┤Queue 0│◄────1@0
> ├───────┤
> ┌─┤Queue 1│◄────2@1
> tc1──┤ ├───────┤
> └─┤Queue 2│
> ├───────┤
> tc2────┤Queue 3│◄────1@3
> ├───────┤
> tc3────┤Queue 4│◄────1@4
> ├───────┤
> ┌─┤Queue 5│◄────3@5
> │ ├───────┤
> tc4──┼─┤Queue 6│
> │ ├───────┤
> └─┤Queue 7│
> └───────┘
>
> .EX
> # tc qdisc add dev eth0 root mqprio \\
> num_tc 5 \\
> map 0 0 0 1 1 1 1 2 2 3 3 4 4 4 4 4 \\
> queues 1@0 2@1 1@3 1@4 3@5
> .EE
>
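The last example can be read as a two-step lookup: priority to traffic class
via 'map', then traffic class to TXQ range via 'queues'. A minimal sketch of
that reading, with hypothetical names and valid only for priorities 0..15:

```python
from typing import List, Tuple

# 'map' and 'queues' arguments from the 5 traffic class example above.
PRIO_TC_MAP = [0, 0, 0, 1, 1, 1, 1, 2, 2, 3, 3, 4, 4, 4, 4, 4]
QUEUES: List[Tuple[int, int]] = [(1, 0), (2, 1), (1, 3), (1, 4), (3, 5)]


def txqs_for_priority(priority: int) -> List[int]:
    """Return the TXQs a packet with this priority may be enqueued to."""
    if not 0 <= priority < len(PRIO_TC_MAP):
        raise ValueError("this sketch only covers priorities 0..15")
    tc = PRIO_TC_MAP[priority]
    count, offset = QUEUES[tc]
    # The traffic class owns the contiguous range offset..offset+count-1.
    return list(range(offset, offset + count))


queues_prio0 = txqs_for_priority(0)
queues_prio3 = txqs_for_priority(3)
queues_prio11 = txqs_for_priority(11)
```

For instance, priority 3 maps to tc1 and therefore to Queue 1 or Queue 2, and
priority 11 maps to tc4 and therefore to Queues 5 through 7, matching the
diagram.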
>
> .SH AUTHORS
> John Fastabend, <john.r.fastabend@...el.com>
LGTM!
Acked-by: Ferenc Fejes <fejes@....elte.hu>
Thank you!
Ferenc