Message-ID: <YwKeVQWtVM9WC9Za@DEN-LT-70577>
Date: Sun, 21 Aug 2022 20:58:05 +0000
From: <Daniel.Machon@...rochip.com>
To: <petrm@...dia.com>
CC: <netdev@...r.kernel.org>, <kuba@...nel.org>,
<vinicius.gomes@...el.com>, <vladimir.oltean@....com>,
<thomas.petazzoni@...tlin.com>, <Allan.Nielsen@...rochip.com>,
<maxime.chevallier@...tlin.com>, <nikolay@...dia.com>,
<roopa@...dia.com>
Subject: Re: Basic PCP/DEI-based queue classification

Hi Petr,

Thank you for your answer.

> > Hi netdev,
> >
> > I am posting this thread in continuation of:
> >
> > https://lore.kernel.org/netdev/20220415173718.494f5fdb@fedora/
> >
> > and as a new starting point for further discussion of offloading PCP-based
> > queue classification into the classification tables of a switch.
> >
> > Today, we use a proprietary tool to configure the internal switch tables for
> > PCP/DEI and DSCP based queue classification [1]. We are, however, looking for
> > an upstream solution.
> >
> > More specifically we want an upstream solution which allows projects like DENT
> > and others with similar purpose to implement the ieee802-dot1q-bridge.yang [2].
> > As a first step we would like to focus on the priority maps of the "Priority
> > Code Point Decoding Table" and "Priority Code Point Encoding Table" of the
> > 802.1Q-2018 standard. These tables are well defined and map well to the
> > hardware.
> >
> > The purpose is not to create a new kernel interface which looks like what IEEE
> > defines - but rather to do the needed plumbing to allow user-space tools to
> > implement an interface like this.
> >
> > In essence we need an upstream solution that initially supports:
> >
> > - Per-port mapping of PCP/DEI to QoS class. For both ingress and egress.
> >
> > - Per-port default priority for frames which are not VLAN tagged.
>
> This exists in DCB APP. Rules with selector 1 (EtherType) and PID 0
> assign a default priority. iproute2's dcb tool supports this.
>
> > - Per-port configuration of "trust" to signal if the VLAN-prio shall be used,
> > or if port default priority shall be used.
>
> This would be nice. Currently mlxsw ports are in trust PCP mode until
> the user configures any DSCP rules. Then it switches to trust DSCP.
> There's no way to express "trust both", or to configure the particular
> PCP mapping for trust PCP (it's just hardcoded as 1:1).
Right, so this could be of use to you guys as well.
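
To make the "trust" idea a bit more concrete, here is a minimal sketch of how a per-port trust setting could be modelled, with "trust both" expressed as a bitmask. All names (port_trust, port_qos, frame_info, port_classify) are made up for illustration and are not an existing kernel interface. Checking DSCP before PCP when both are trusted is also only an assumption; real hardware may order this differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical trust flags; not an existing kernel API. */
enum port_trust {
	TRUST_PCP  = 1 << 0,
	TRUST_DSCP = 1 << 1,
};

/* The parts of a frame that matter for basic classification. */
struct frame_info {
	bool    has_vlan_tag;	/* frame carries an 802.1Q header */
	bool    is_ip;		/* frame carries an IPv4/IPv6 header */
	uint8_t pcp;		/* 0..7, valid if has_vlan_tag */
	uint8_t dscp;		/* 0..63, valid if is_ip */
};

/* Hypothetical per-port QoS state. */
struct port_qos {
	unsigned int trust;	/* bitmask of enum port_trust */
	uint8_t default_prio;	/* used when nothing trusted applies */
	uint8_t pcp_prio[8];	/* PCP -> priority */
	uint8_t dscp_prio[64];	/* DSCP -> priority */
};

/*
 * Pick a priority for a frame. The default priority is always the
 * fallback, e.g. for an untagged frame on a port in trust PCP mode.
 * Assumption: DSCP wins over PCP when both are trusted.
 */
static uint8_t port_classify(const struct port_qos *p,
			     const struct frame_info *f)
{
	if ((p->trust & TRUST_DSCP) && f->is_ip)
		return p->dscp_prio[f->dscp & 0x3f];
	if ((p->trust & TRUST_PCP) && f->has_vlan_tag)
		return p->pcp_prio[f->pcp & 0x7];
	return p->default_prio;
}
```

With trust set to TRUST_PCP only, a tagged frame gets its priority from pcp_prio[] and an untagged frame falls back to default_prio.
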
>
> Re this "VLAN or default", note it's not (always) either-or. In Spectrum
> switches, the default priority is always applicable. E.g. for a port in
> trust PCP mode, if a packet has no 802.1q header, it gets port-default
> priority. 802.1q describes the default priority as "for use when
> application priority is not otherwise specified", so I think this
> behavior actually matches the standard.
>
> > In the old thread, Maxime has compiled a list of ways we can possibly offload
> > the queue classification. However none of them is a good match for our purpose,
> > for the following reasons:
> >
> > - tc-flower / tc-skbedit: The filter and action scheme maps poorly to hardware
> > and would require one hardware-table entry per rule. Even less of a match
> > when DEI is also considered. These tools are well suited for advanced
> > classification, and not so much for basic per-port classification.
>
> Yeah.
>
> Offloading this is a pain. You need to parse out the particular shape of
> rules (which is not a big deal honestly), and make sure the ordering of
> the rules is correct and matches what the HW is doing. And tolerate any
> ACL-/TCAM-like rules as well. And there's a mismatch between how a
> missing rule behaves in SW (fall-through) and HW (likely priority 0 gets
> assigned).
>
> And configuration is a pain as well, because a) it's a whole bunch of
> rules to configure, and b) you need to be aware of all the limitations
> from the previous paragraph and manage the coexistence with ACL/TCAM
> rules.
>
> It's just not a great story for this functionality.
>
> I wonder if a specialized filter or action would make things easier to
> work with. Something like "matchall action dcb dscp foo bar priority 7".
>
I really think that PCP mapping should not go into tc. It is just not
user-friendly at all, and I believe better alternatives exist.

> > - ip-link: The ingress and egress maps of ip-link are per Linux VLAN
> > interface; we need per-port mapping. Not possible to map both PCP and DEI.
> >
> > - dcb-app: Not possible to map PCP/DEI (only DSCP).
> >
> > We have been looking around the kernel to see what other switch driver
> > developers do to configure basic per-port PCP/DEI-based queue classification,
> > and have not been able to find anything useful in the standard kernel
> > interfaces. It seems like people use their own out-of-tree tools to configure
> > this (like mlnx_qos from Mellanox [3]).
> >
> > Finally, we would appreciate any input to this, as we are looking for an
> > upstream solution that can be accepted by the community. Hopefully we can
> > arrive at some consensus on whether this is a feature that can be of general
> > use by developers, and furthermore, in which part of the kernel it should
> > reside:
> >
> > - ethtool: add new setting to configure the pcp tables (seems like a good
> > candidate to us).
> >
> > - ip-link: add support for per-port-interface ingress and egress mapping of
> > pcp/dei
> >
> > - dcb-*: as an extension or new command to the dcb utilities. The pcp tables
> > seem to be in line with what dcb-app does with the application priority
> > table.
>
> I'm not a fan of DCB, but the TC story is so unconvincing that this
> looks good in comparison.
>
Agree.
> But note that DCB as such is standardized. I think the dcb-maxrate
> interfaces are not, and the DCB subsystem has a whole bunch of weird
> pre-standard stuff that's not exposed. But what's in iproute2 dcb is
> largely standard. So maybe this should be hidden under some extension
> attribute.
>
So a PCP mapping functionality could very well go into dcb as an extension,
for the following reasons:

- dcb already contains a non-standard extension (dcb-maxrate).
- Adding an extension (dcb-pcp?) for configuring the PCP tables of IEEE 802.1Q
  seems to be in line with what dcb-app is doing with the app table. Now, the
  app table and the PCP tables are different, but they are both intended to map
  priority to queue (DSCP or PCP/DEI).
- Default prio for non-tagged frames is already possible in dcb-app.
- DSCP priority mapping is also possible in dcb-app.
- dcb already has the necessary data structures for mapping priority to queue
  (array parameter).
- It seems convenient to have the priority mapping in one place (DSCP and PCP/DEI).
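
As a rough illustration of the "array parameter" point, a per-port PCP/DEI decoding table could be as small as 16 entries, indexed by the top four bits of the VLAN TCI (PCP in bits 15..13, DEI in bit 12). This is only a sketch under that assumption; the names pcp_dei_map, pcp_dei_index, pcp_dei_map_init_identity and pcp_dei_decode are hypothetical and do not exist in the kernel or in dcb today.

```c
#include <stdint.h>

#define PCP_DEI_ENTRIES 16	/* 8 PCP values x 2 DEI values */

/* Hypothetical per-port decoding table: index = (pcp << 1) | dei. */
struct pcp_dei_map {
	uint8_t prio[PCP_DEI_ENTRIES];
};

/* Build the table index from a host-order 802.1Q TCI. */
static unsigned int pcp_dei_index(uint16_t tci)
{
	unsigned int pcp = (tci >> 13) & 0x7;	/* bits 15..13 */
	unsigned int dei = (tci >> 12) & 0x1;	/* bit 12 */

	return (pcp << 1) | dei;
}

/* Default 1:1 mapping: priority equals PCP, DEI is ignored. */
static void pcp_dei_map_init_identity(struct pcp_dei_map *m)
{
	unsigned int i;

	for (i = 0; i < PCP_DEI_ENTRIES; i++)
		m->prio[i] = (uint8_t)(i >> 1);
}

/* Look up the priority for a tagged frame. */
static uint8_t pcp_dei_decode(const struct pcp_dei_map *m, uint16_t tci)
{
	return m->prio[pcp_dei_index(tci)];
}
```

Overriding a single entry, for example m.prio[(5 << 1) | 1] = 3, would then demote PCP 5 frames with DEI set without touching the other fifteen mappings.
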
Any thoughts?
> > - somewhere else
> >
> > In summary:
> >
> > - We would like feedback from the community on the suggested implementation of
> > the IEEE 802.1Q Priority Code Point encoding and decoding tables.
> >
> > - And if we can agree that such a solution could and should be implemented;
> > where should the implementation go?
> >
> > - Also, should the solution be supported in the sw-bridge as well?
>
> That would be ideal, yeah.