lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Date:   Mon, 7 Nov 2022 20:07:41 +0000
From:   Vladimir Oltean <vladimir.oltean@....com>
To:     Jakub Kicinski <kuba@...nel.org>
CC:     Maxime Chevallier <maxime.chevallier@...tlin.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        Rob Herring <robh+dt@...nel.org>,
        Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
        Eric Dumazet <edumazet@...gle.com>,
        Paolo Abeni <pabeni@...hat.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "devicetree@...r.kernel.org" <devicetree@...r.kernel.org>,
        "thomas.petazzoni@...tlin.com" <thomas.petazzoni@...tlin.com>,
        Andrew Lunn <andrew@...n.ch>,
        Florian Fainelli <f.fainelli@...il.com>,
        Heiner Kallweit <hkallweit1@...il.com>,
        Russell King <linux@...linux.org.uk>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        Luka Perkov <luka.perkov@...tura.hr>,
        Robert Marko <robert.marko@...tura.hr>,
        Andy Gross <agross@...nel.org>,
        Bjorn Andersson <andersson@...nel.org>,
        Konrad Dybcio <konrad.dybcio@...ainline.org>
Subject: Re: [PATCH net-next v8 3/5] net: dsa: add out-of-band tagging
 protocol

On Mon, Nov 07, 2022 at 10:24:40AM -0800, Jakub Kicinski wrote:
> IIRC the skb extensions were initially proposed as a way to handle rare
> exception packets (e.g. first packet of a connection-tracked flow in OvS
> offloads). Also MPTCP but that's also edge / slow path (sorry MPTCP).
>
> Now the usage is spreading and I have to keep fighting to keep them out
> of the datacenter production kernel I co-maintain.
>
> So, yeah, I hate them :)

To be fair, most people see CPU-terminated traffic on a DSA switch
(i.e. what you see in net/dsa/tag_*.c) as slow path which needs no
optimization. It's not that I condone this, but it's factually true.
If it wasn't the case, then out of the drivers I maintain, a control
packet wouldn't be delivered via SPI on SJA1105, and flow control
wouldn't be broken on the CPU port of switches from the Ocelot family.
And more importantly, software bridging between a switchdev and a
non-switchdev port wouldn't be such an oversight for more than 3/4 of
all switch drivers.

Also, I didn't really get *why* you hate them, just that you do.
Seems circular: slow => hate; hate => slow?

I don't think that skb->_skb_refdst is the hallmark of clean or simple
designs either, a pointer and a refcount bit squashed into a single
sk_buff field that is also in a union with 2 other things, and which is
reused in other network layers for purposes that have nothing to do with
L3 routing. Nope, sorry, this is highly optimized design at its finest,
true, but I have no interest in doing mental gymnastics in order to
maintain such a thing, just because some hardware manufacturer thought
that it would be a smart idea to split up device ownership in this way,
and neither build a 'switch with rings' nor a 'switch with tags', but
rather 'a switch with somebody else's rings'. The people who built this
monstrosity should step in and maintain the software architecture that's
a direct consequence of their design choices. Otherwise I'm going to opt
for the simplest thing to maintain that works. It's unfair to not care
about software support for your own hardware enough to study frameworks
beforehand, *and* to complain about performance.

> > > > The latter are more general; AFAIK they can be used between any layer
> > > > and any other layer, like for example between RX and TX in the
> > > > forwarding path.
> > >
> > > You can't be using lower-dev / upper-dev metadata across forwarding,
> > > how would that ever work?
> >
> > What makes metadata dst's preferable to skb extensions?
> >            ~~~~~~~~~~~~                 ~~~~~~~~~~~~~~
> >            former                       latter
> >
> > I said: "The latter [aka skb extensions, not metadata dst's] are more general".
> > I did not say that you can use metadata dst's across forwarding, quite
> > the opposite.
>
> No, no, I'm asking how you'd use either. I'm questioning the entire
> flow, not whether either mechanism can be used to fulfill it.

Well, we are probably talking about different things. I said that skb
extensions are a more general concept which *allows* you to pass metadata
from the iif to the oif. Metadata dst's don't. So what is the need for
metadata dst's, if skb extensions can do what those can do, and more.
Not that this use case is particularly relevant to DSA OOB. Just that I
think a reasonable expectation would have been to make skb extensions
more performant, than to introduce a parallel mechanism.

> TBH I mostly have experience on the Tx side, given that on the Rx side
> there is no queuing so the entire abstraction of tag implementation
> being separate is not strictly necessary. But if you find that the Rx
> doesn't work, and you really want the skb extensions - then, well,
> I acquiesce. And hope the Meta prod kernel never needs OOB DSA :)

I would never enable this feature, either. I would love not having to
see it.

> > > > More importantly, what happens if a DSA switch is used together with a
> > > > SRIOV-capable DSA master which already uses METADATA_HW_PORT_MUX for
> > > > PF-VF communication? (if I understood the commit message of 3fcece12bc1b
> > > > ("net: store port/representator id in metadata_dst") correctly)
> > >
> > > Let's be clear that the OOB metadata model only works if both upper and
> > > lower are aware of the metadata. In fact they are pretty tightly bound.
> > > So chances of a mismatch are extremely low and theorizing about them is
> > > academic.
> >
> > Legally I'm not allowed to say too much, but let's say I've heard about
> > something which makes the above not theoretical. Anyway, let's assume
> > it's not a concern.
>
> But in that case the same vendor designs both ends, right?

Yes.

> So there should be no conflict between the metadata assigned for reprs
> vs dsa ports.

Can't say more, sorry.

> > > In general, I'm not sure if pretending this is DSA is not an unnecessary
> > > complication which will end up hurting both ends of the equation.
> >
> > This is a valid point. We've refused wacky "not DSA, not switchdev"
> > hardware before:
> > https://lore.kernel.org/netdev/20201125232459.378-1-lukma@denx.de/
> > There's also the option of doing what I did with ocelot/felix: a common
> > switch lib and 2 distinct front-ends, one switchdev and one DSA.
>
> Exactly.
>
> > Not a lot of people seem to be willing to put that effort in, though.
> > The imx28 patch set was eventually abandoned. I though I'd try a
> > different approach this time. Idk, maybe it's a waste of time.
>
> Yeah, it's a balancing act. Please explore the metadata option, I think
> most people jump to the skb extension because they don't know about
> metadata. If you still want skb extension after, I'll look away.

Well, I guess I'm still not really convinced about metadata_dst, you're
still not really convinced about skb extensions, but what we have in
common is that "one switch lib, two front ends" is an alternative worth
exploring as a design that's both clean and efficient? :)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ