netdev - Re: [PATCH net 1/2] net: dsa: tag_dsa: send packets with TX fwd offload from VLAN-unaware bridges using VID 0

Message-ID: <87y278lx80.fsf@waldekranz.com>
Date:   Mon, 04 Oct 2021 15:45:51 +0200
From:   Tobias Waldekranz <tobias@...dekranz.com>
To:     Vladimir Oltean <vladimir.oltean@....com>
Cc:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        Jakub Kicinski <kuba@...nel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Florian Fainelli <f.fainelli@...il.com>,
        Andrew Lunn <andrew@...n.ch>,
        Vivien Didelot <vivien.didelot@...il.com>
Subject: Re: [PATCH net 1/2] net: dsa: tag_dsa: send packets with TX fwd
 offload from VLAN-unaware bridges using VID 0

On Mon, Oct 04, 2021 at 11:16, Vladimir Oltean <vladimir.oltean@....com> wrote:
> On Mon, Oct 04, 2021 at 12:55:27PM +0200, Tobias Waldekranz wrote:
>> On Mon, Oct 04, 2021 at 01:23, Vladimir Oltean <vladimir.oltean@....com> wrote:
>> > The present code is structured this way due to an incomplete thought
>> > process. In Documentation/networking/switchdev.rst we document that if a
>> > bridge is VLAN-unaware, then the presence or lack of a pvid on a bridge
>> > port (or on the bridge itself, for that matter) should not affect the
>> > ability to receive and transmit tagged or untagged packets.
>> >
>> > If the bridge on behalf of which we are sending this packet is
>> > VLAN-aware, then the TX forwarding offload API ensures that the skb will
>> > be VLAN-tagged (if the packet was sent by user space as untagged, it
>> > will get transmitted down to the driver as tagged with the bridge
>> > device's pvid). But if the bridge is VLAN-unaware, it may or may not be
>> > VLAN-tagged. In fact the logic to insert the bridge's PVID came from the
>> > idea that we should emulate what is being done in the VLAN-aware case.
>> > But we shouldn't.
>> 
>> IMO, the problem here stems from a discrepancy between LinkStreet
>> devices and the bridge, in how PVID is interpreted. For the bridge, when
>> VLAN filtering is disabled, ingressing traffic will be assigned to VID
>> 0. This is true even if the port's PVID is set. A mv88e6xxx port whose
>> QMode bits are set to 00 (802.1Q disabled) OTOH, will assign ingressing
>> traffic to its PVID.
>> 
>> So, in order to match the bridge's behavior, I think we need to rethink
>> how mv88e6xxx deals with non-filtering bridges. At first, one might be
>> tempted to simply leave the hardware PVID at 0. The PVT can then be used
>> to create isolation barriers between different bridges. ATU isolation is
>> really what kills this approach. Since there is no VLAN information in
>> the tag, there is no way to separate flows from different bridges into
>> different FIDs. This is the issue I discovered with the forward
>> offloading series.
>> 
>> > It appears that injecting packets using a VLAN ID of 0 serves the
>> > purpose of forwarding the packets to the egress port with no VLAN tag
>> > added or stripped by the hardware, and no filtering being performed.
>> > So we can simply remove the superfluous logic.
>> 
>> The problem with this patch is that return traffic from the CPU is sent
>> asymmetrically over a different VLAN, which in turn means that it will
>> perform the DA lookup in a different FID (0). The result is that traffic
>> does flow, but for the wrong reason. CPU -> port traffic is now flooded
>> as unknown unicast. An example:
>> 
>> (:aa / 10.1)
>>     br0
>>    /   \
>> sw0p1 sw0p2
>> \         /
>>  \       /
>>   \     /
>>     CPU
>>      |
>>   .--0--.
>>   | sw0 |
>>   '-1-2-'
>>     | '-- sniffer
>>     '---- host (:bb / 10.2)
>> 
>> br0 is created using the default settings. sw0 will have (among others)
>> static entries for the CPU:
>> 
>>     fid:0 addr:aa type:static port:0
>>     fid:1 addr:aa type:static port:0
>> 
>> 1. host sends an ARP for 10.1.
>> 
>> 2. sw0 will add this entry (since vlan_default_pvid is 1):
>> 
>>     fid:1 addr:bb type:age-7 port:1
>
> Well, that's precisely mv88e6xxx's problem, it should not make its
> ports' pvid inherit that of the bridge if the bridge is not VLAN aware.
> Other drivers inherit the bridge pvid only when VLAN filtering is turned
> on. See sja1105, ocelot, mt7530 at the very least. So the entry should
> have been learned in FID 0 here.
>
>> 3. CPU replies with a FORWARD (VID 0).
>> 
>> 4. sw0 will perform a DA lookup in FID 0, missing the entry learned in
>>    step 2.
>> 
>> 5. sw0 floods the frame as unknown unicast to both host and sniffer.
>> 
>> Conversely, if flooding of unknown unicast is disabled on sw0p1:
>> 
>>     $ bridge link set dev sw0p1 flood off
>> 
>> host can no longer communicate with the CPU.
>> 
>> As I alluded to in the forward offloading thread, I think we need to
>> move to a scheme where:
>> 
>> 1. mv88e6xxx clears ds->configure_vlan_while_not_filtering.
>
> No, that's the wrong answer, nobody should clear ds->configure_vlan_while_not_filtering.
> mv88e6xxx should leave the pvid at zero* when joining a bridge that is
> not VLAN-aware. It should inherit the bridge pvid when that bridge
> becomes VLAN-aware, and it should reset the pvid to zero* when that
> bridge becomes VLAN-unaware.

Fair enough, even better!
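
To make the suggested policy concrete, here is a minimal sketch in
Python. It is a toy model, not mv88e6xxx or DSA code: ToyPort,
join_bridge() and set_vlan_filtering() are hypothetical names, and the
only behavior modeled is "pvid 0 while the bridge is VLAN-unaware,
bridge pvid while it is VLAN-aware". As noted further down, this only
addresses the xmit/rcv VLAN mismatch, not FDB isolation.

```python
# Hypothetical sketch of the PVID policy described above; not driver code.
UNAWARE_PVID = 0


class ToyPort:
    def __init__(self):
        self.hw_pvid = UNAWARE_PVID  # standalone port: nothing inherited
        self.bridge_pvid = None

    def join_bridge(self, bridge_pvid, vlan_filtering):
        # Remember the bridge pvid, but only program it if filtering is on.
        self.bridge_pvid = bridge_pvid
        self.set_vlan_filtering(vlan_filtering)

    def set_vlan_filtering(self, enabled):
        # Inherit the bridge pvid only while the bridge is VLAN-aware.
        self.hw_pvid = self.bridge_pvid if enabled else UNAWARE_PVID


port = ToyPort()
port.join_bridge(bridge_pvid=1, vlan_filtering=False)
print(port.hw_pvid)
port.set_vlan_filtering(True)
print(port.hw_pvid)
port.set_vlan_filtering(False)
print(port.hw_pvid)
```

The three prints correspond to joining a VLAN-unaware bridge, the
bridge becoming VLAN-aware, and the bridge becoming VLAN-unaware again.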

>> 2. Assigns a free VID (and by extension a FID) in the VTU to each
>>    non-filtering bridge.
>
> *with the mention that the pvid of zero will only solve the first half
> of the problem, the discrepancy between the VLAN classified on xmit and
> the VLAN classified on rcv.
>
> It will not solve the ATU (FDB) isolation problem. But to solve the FDB
> isolation problem you need this:
> https://patchwork.kernel.org/project/netdevbpf/cover/20210818120150.892647-1-vladimir.oltean@nxp.com/
>
>> With this in place, the tagger could use the VID associated with the
>> egressing port's bridge in the tag.
>
> So the patch is not incorrect, it is incomplete. And there's nothing
> further I can add to the tagger logic to make it more complete, at least
> not now.
>
> That's one of the reasons why this is merely a "part 1".

Understood. But perhaps you could add the PVID-wrangling-patch you
suggested above to this series? That way we don't surprise any users on
stable by suddenly flooding traffic that used to be forwarded.