Date:   Fri, 19 Aug 2022 16:12:53 +0000
From:   Vladimir Oltean <vladimir.oltean@....com>
To:     Vinicius Costa Gomes <vinicius.gomes@...el.com>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Michal Kubecek <mkubecek@...e.cz>,
        Claudiu Manoil <claudiu.manoil@....com>,
        Xiaoliang Yang <xiaoliang.yang_1@....com>,
        Kurt Kanzenbach <kurt@...utronix.de>,
        Rui Sousa <rui.sousa@....com>,
        Ferenc Fejes <ferenc.fejes@...csson.com>
Subject: Re: [RFC PATCH net-next 2/7] net: ethtool: add support for Frame
 Preemption and MAC Merge layer

On Wed, Aug 17, 2022 at 04:15:11PM -0700, Vinicius Costa Gomes wrote:
> I liked that in the API sense, using this "prio" concept we gain more
> flexibility, and we can express better what the hardware you work with
> can do, i.e. priority (for frame preemption purposes?) and queues are
> orthogonal.
> 
> The problem I have is that the hardware I work with is more limited (as
> are some stmmac-based NICs that I am aware of): frame preemption
> capability and "priority" are attached to queues.

Don't get me wrong, enetc also has "priority" attached to TX rings
(enetc_set_bdr_prio). Then there is another register which maps a TX
priority to a traffic class. This could be altered by the "map" property
of tc-mqprio, but in practice it isn't. The only supported prio->tc map
is "map 0 1 2 3 4 5 6 7".

This is to say, enetc does not have a per-packet priority that gets
passed via BD metadata, but rather, packets inherit the configured
priority of the ring (Linux TX queue).

> From the API perspective, it seems that I could say that "fp-prio" 0 is
> associated with queue 0, fp-prio 1 to queue 1, and so on, and everything
> will work.

You have the tc-mqprio "queues" and "map" to juggle with, to end up
configuring FP per queue based on the provided per-priority settings.
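
As a rough illustration of that juggling, here is a minimal Python sketch
(not kernel code) of deriving a per-queue FP setting from per-priority
settings. It assumes the tc-mqprio "map" is given as a list indexed by
priority and "queues" as (count, offset) pairs per traffic class. The
function name and the "express"/"preemptible" strings are made up for the
example.

```python
from typing import Dict, List, Tuple


def fp_status_per_queue(
    prio_tc_map: List[int],
    tc_queues: List[Tuple[int, int]],
    admin_status: List[str],
) -> Dict[int, str]:
    """Resolve per-priority FP admin status to a per-queue status.

    prio_tc_map[prio] is the traffic class of a priority (mqprio "map").
    tc_queues[tc] is (count, offset) for a traffic class (mqprio "queues").
    admin_status[prio] is "express" or "preemptible" (hypothetical encoding).
    """
    per_queue: Dict[int, str] = {}
    for prio, tc in enumerate(prio_tc_map):
        count, offset = tc_queues[tc]
        for queue in range(offset, offset + count):
            # The first priority that reaches a queue decides its status;
            # any later priority reaching the same queue must agree.
            previous = per_queue.setdefault(queue, admin_status[prio])
            if previous != admin_status[prio]:
                raise ValueError(
                    f"queue {queue}: priority {prio} wants "
                    f"{admin_status[prio]}, queue is already {previous}"
                )
    return per_queue


# The "map 0 1 2 3 4 5 6 7" case with one queue per traffic class.
identity_map = list(range(8))
one_queue_per_tc = [(1, tc) for tc in range(8)]
status = ["preemptible"] * 4 + ["express"] * 4
queue_status = fp_status_per_queue(identity_map, one_queue_per_tc, status)
```

With a 1:1 map, each queue simply inherits the status of its priority. The
ValueError path is where a driver with per-queue FP would have to reject a
configuration it cannot express.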

> The only thing that I am not happy at all is that there are exactly 8
> fp-prios.
> 
> The Linux network stack is more flexible than what 802.1Q defines, think
> skb->priority, number of TCs, as you said earlier, I would hate to
> impose some almost artificial limits here. And in future we are going to
> see TSN enabled devices with lots more queues.

~artificial limits~

IEEE 802.1Q says:

| 12.30.1.1 framePreemptionStatusTable structure and data types
| 
| The framePreemptionStatusTable (6.7.2) consists of 8 framePreemptionAdminStatus
| values (12.30.1.1.1), one per priority.

I'm more concerned about setting things straight than about the limits
right now. I don't yet think everything is quite ok in that regard.
The netlink format proposed here is in principle extensible for
priorities > 7, if that will ever make sense.

> 
> In short:
>  - Comment: this section of the RFC is hardware independent, this
>  behavior of queues and priorities being orthogonal is only valid for
>  some implementations;

Yes, and IMO it can only be that way, if I were to be truthful to my
interpretation of the intention of the 802.1Q spec (please contradict me
if you have a different interpretation of it!).

Note that I do see contradictions in 802.1Q, and I don't know how to
reconcile them. I've suppressed some of them for lack of a logical
explanation. I'm mentioning this for transparency; I don't know
everything either, but I need to make something out of what I do know.

So 802.1Q says this:

| 12.30.1.1.1 framePreemptionAdminStatus
|
| This parameter is the administrative value of the preemption status for
| the priority. It takes value express if frames queued for the priority
| are to be transmitted using the express service for the Port, or
| preemptible if frames queued for the priority are to be transmitted
| using the preemptible service for the Port and preemption is enabled for
| the Port.

So far so good. In 802.1Q definitions, priority is attached to a packet
rather than to a traffic class / queue. But then the same clause continues:

| Priorities that all map to the same traffic class should be constrained
| to use the same value of preemption status.

Which seems to throw a wrench into everything.
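
To make the constraint concrete, here is a small Python sketch that lists
the traffic classes whose priorities disagree on preemption status. The
function name and the string encoding of the status values are assumptions
for illustration only, not anything defined by 802.1Q or by the proposed
netlink API.

```python
from collections import defaultdict
from typing import Dict, List, Set


def mixed_status_traffic_classes(
    prio_tc_map: List[int], admin_status: List[str]
) -> List[int]:
    """Return traffic classes whose priorities use different FP statuses."""
    seen: Dict[int, Set[str]] = defaultdict(set)
    for prio, tc in enumerate(prio_tc_map):
        seen[tc].add(admin_status[prio])
    return sorted(tc for tc, values in seen.items() if len(values) > 1)


status = ["preemptible"] * 4 + ["express"] * 4

# One traffic class per priority: the constraint holds trivially.
identity_violations = mixed_status_traffic_classes(list(range(8)), status)

# A single traffic class for all priorities: any mix of express and
# preemptible priorities goes against the "should" in the clause.
single_tc_violations = mixed_status_traffic_classes([0] * 8, status)
```

Under this reading, a device with a single traffic class could only
satisfy the sentence by making every priority express or every priority
preemptible.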

It raises two questions:

(A) why is AdminStatus not per traffic class then?
(B) why is the constraint there, what's it trying to protect against?

I've asked around, and I got unsatisfactory answers to both questions.

A seemingly competent answer given to (A) by Rui (CCed) is that the
eMAC/pMAC selection on TX actually takes place in the MAC layer, in what
is called by 802.1AC-2016 "MA_UNITDATA.request" (and what everybody else
calls "MAC client xmit"). The parameters of this MAC service primitive
are:

MA_UNITDATA.request(destination_address,
		    source_address,
		    mac_service_data_unit,
		    priority)

So since only "priority" is passed to the MAC service (and traffic class isn't),
this means that the MAC service can only steer packets to eMAC/pMAC
based on what it's given (i.e. priority).

But upon closer investigation, this explanation doesn't appear to hold
water very well. This is because clause 6.7.1 Support of the ISS by IEEE
Std 802.3 (Ethernet) from 802.1Q says:

| If frame preemption (6.7.2) is supported on a Port, then the IEEE 802.3
| MAC provides the following two MAC service interfaces (99.4 of IEEE Std
| 802.3br™-2016 [B21]):
|
| a) A preemptible MAC (pMAC) service interface
| b) An express MAC (eMAC) service interface
|
| For priority values that are identified in the frame preemption status
| table (6.7.2) as preemptible, frames that are selected for transmission
| shall be transmitted using the pMAC service instance, and for priority
| values that are identified in the frame preemption status table as
| express, frames that are selected for transmission shall be transmitted
| using the eMAC service instance.
| In all other respects, the Port behaves as if it is supported by a
| single MAC service interface. In particular, all frames received by the
| Port are treated as if they were received on a single MAC service
| interface regardless of whether they were received on the eMAC service
| interface or the pMAC service interface, except with respect to frame
| preemption.

So there you go, the MAC has to provide *two* service interfaces, so the
802.1Q upper layer (client of both) can just decide based on an internal
criterion (like, say traffic class) into which service it sends a frame.
So this can't be the reason.
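
A toy Python model of the two selection criteria, purely to illustrate the
argument: one function picks the service interface from the packet
priority, the other from the traffic class the priority maps to. All
names, the example map and the per-traffic-class status table are
hypothetical; nothing here claims that any device works this way.

```python
from typing import List


def select_mac_by_priority(priority: int, admin_status: List[str]) -> str:
    """Pick the service interface from the per-priority status table."""
    return "pMAC" if admin_status[priority] == "preemptible" else "eMAC"


def select_mac_by_traffic_class(
    priority: int, prio_tc_map: List[int], tc_status: List[str]
) -> str:
    """Pick the service interface from a (hypothetical) per-TC status."""
    tc = prio_tc_map[priority]
    return "pMAC" if tc_status[tc] == "preemptible" else "eMAC"


# Example: 8 priorities folded onto 4 traffic classes.
prio_tc_map = [0, 0, 1, 1, 2, 2, 3, 3]
tc_status = ["preemptible", "preemptible", "express", "express"]

# A per-priority table that obeys "same status within a traffic class".
admin_status = [tc_status[tc] for tc in prio_tc_map]

choices_agree = all(
    select_mac_by_priority(p, admin_status)
    == select_mac_by_traffic_class(p, prio_tc_map, tc_status)
    for p in range(8)
)
```

When the per-priority table respects the constraint, the two criteria
select the same interface for every priority, so the choice between them
is not forced by what the MAC service is given.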

As for (B), it was suggested to me that 802.1Q doesn't allow out of
order transmission within the same queue/traffic class. After all,
what's at play here is whether a single TX queue device can support FP
or not. Supporting FP would mean reordering PT frames relative to ET
frames.

I think this explanation is unsatisfactory too. Here's the only
reference I saw in 802.1Q to ordering guarantees. Basically those
guarantees are all *per priority*, and since PT/ET is also per priority,
I don't see why reordering there would violate this:

| 6.5.3 Frame misordering
| The MAC Service (IEEE Std 802.1AC) permits a negligible rate of
| reordering of frames with a given priority for a given combination of
| destination address, source address, and flow hash, if present,
| transmitted on a given VLAN.
| MA_UNITDATA.indication service primitives corresponding to
| MA_UNITDATA.request primitives, with the same requested priority and for
| the same combination of VLAN classification, destination address, source
| address, and flow hash, if present, are received in the same order as
| the request primitives were processed.

So for anything to make sense for me at all, I simply had to block out
that phrase from my mind. I'm posting the concern here, publicly, in
case someone can enlighten me.

>  - Question: would it be against the intention of the API to have a 1:1
>  map of priorities to queues?

No; as mentioned, going through mqprio's "map" and "queues" to resolve
the priority to a queue is something that I intended to be possible.
Sure, it's not handy either. It would have been handier if the
"admin-status" array was part of the tc-mqprio config, like you did in
your RFC.

But right now I'm trying to not close the possibility for single queue
devices (which won't implement mqprio) to support FP, since I haven't
seen anything convincing that would disprove such a hw design as
infeasible. If someone could share some compelling insight into this it
would be really appreciated.

>  - Deal breaker: fixed 8 prios;

idk, I don't think that's our biggest problem right now honestly.
