netdev - Re: [RFC PATCH 4/4] net: dsa: tag_edsa: support reception of packets from lag devices

Message-Id: <C6OVPVXHQ5OA.21IJYAHUW1SW4@wkz-x280>
Date:   Wed, 28 Oct 2020 23:31:58 +0100
From:   "Tobias Waldekranz" <tobias@...dekranz.com>
To:     "Vladimir Oltean" <olteanv@...il.com>
Cc:     <andrew@...n.ch>, <vivien.didelot@...il.com>,
        <f.fainelli@...il.com>, <netdev@...r.kernel.org>,
        "Ido Schimmel" <idosch@...sch.org>
Subject: Re: [RFC PATCH 4/4] net: dsa: tag_edsa: support reception of
 packets from lag devices

On Wed Oct 28, 2020 at 9:18 PM CET, Vladimir Oltean wrote:
> Let's say you receive a packet on the standalone swp0, and you need to
> perform IP routing towards the bridged domain br0. Some switchdev/DSA
> ports are bridged and some aren't.
>
> The switchdev/DSA switch will attempt to do the IP routing step first,
> and it _can_ do that because it is aware of the br0 interface, so it
> will decrement the TTL and replace the L2 header.
>
> At this stage we have a modified IP packet, which corresponds with what
> should be injected into the hardware's view of the br0 interface. The
> packet is still in the switchdev/DSA hardware data path.
>
> But then, the switchdev/DSA hardware will look up the FDB in the name of
> br0, in an attempt to find the destination port for the packet. But
> the packet should be delivered to a station connected to eth0 (e1000,
> foreign interface). So that's part of the exception path, the packet
> should be delivered to the CPU.
>
> But the packet was already modified by the hardware data path (IP
> forwarding has already taken place)! So how should the DSA/switchdev
> hardware deliver the packet to the CPU? It has 2 options:
>
> (a) unwind the entire packet modification, cancel the IP forwarding and
> deliver the unmodified packet to the CPU on behalf of swp0, the
> ingress port. Then let software IP forwarding plus software bridging
> deal with it, so that it can reach the e1000.
> (b) deliver the packet to the CPU in the middle of the hardware
> forwarding data path, where the exception/miss occurred, aka deliver
> it on behalf of br0. Modified by IP forwarding. This is where we'd
> have to manually inject skb->dev into br0 somehow.

The thing is, unlike L2 where the hardware will add new neighbors to
its FDB autonomously, every entry in the hardware FIB is under the
strict control of the CPU. So I think you can avoid much of this
headache simply by determining if a given L3 nexthop/neighbor is
"foreign" to the switch or not, and then just skip offloading for
those entries.
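
To make that concrete, here is a rough sketch of the check I have in
mind. It is a standalone model, not kernel code: the sketch_* types and
functions are hypothetical names made up for illustration, and treating
an unresolved neighbor as foreign is my own assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a bridge port; not a kernel structure. */
struct sketch_port {
	const char *name;
	bool is_switch_port;	/* true for swpN, false for e.g. an e1000 */
};

/* Hypothetical model of an L3 nexthop/neighbor entry. */
struct sketch_nexthop {
	const char *addr;			/* label only */
	const struct sketch_port *egress;	/* port the neighbor sits behind, NULL if unresolved */
};

/* A nexthop is "foreign" if the switch cannot reach it on its own. */
static bool sketch_nexthop_is_foreign(const struct sketch_nexthop *nh)
{
	assert(nh != NULL);

	/* Assumption: an unresolved neighbor stays in software. */
	if (nh->egress == NULL)
		return true;

	return !nh->egress->is_switch_port;
}

/* Only entries the hardware can complete by itself get offloaded. */
static bool sketch_should_offload(const struct sketch_nexthop *nh)
{
	return !sketch_nexthop_is_foreign(nh);
}
```

With something like this, a neighbor behind swp1 would be offloaded,
while one behind eth0 (or one that is not resolved yet) would stay in
the software FIB, so the hardware never starts a routing operation it
cannot finish.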

You miss out on the hardware acceleration of replacing the L2 header
of course. But my guess would be that once you have paid the tax of
receiving the buffer via the NIC driver, allocated an skb, and called
netif_rx() etc. the routing operation will be a rounding error. At
least on smaller devices where the FIB is typically quite small.

> Maybe this sounds a bit crazy, considering that we don't have IP
> forwarding hardware with DSA today, and I am not exactly sure how other
> switchdev drivers deal with this exception path today. But nonetheless,
> it's almost impossible for DSA switches with IP forwarding abilities to
> never come up some day, so we ought to have our mind set about how the
> RX data path should look like, and whether injecting directly into an upper
> is icky or a fact of life.

Not crazy at all. In fact the Amethyst (6393X), for which there is a
patchset available on netdev, is capable of doing this (the hardware
is - the posted patches do not implement it).

> Things get even more interesting when this is a cascaded DSA setup, and
> the bridging and routing are cross-chip. There, the FIB/FDB of 2 there
> isn't really any working around the problem that the packet might need
> to be delivered to the CPU somewhere in the middle of the data path, and
> it would need to be injected into the RX path of an upper interface in
> that case.
>
> What do you think?