netdev - Re: [PATCH net-next 3/5 v2] net: dsa: rtl8366rb: Support disabling learning

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210908210939.cwwnwgj3p67qvsrh@skbuf>
Date:   Thu, 9 Sep 2021 00:09:39 +0300
From:   Vladimir Oltean <olteanv@...il.com>
To:     Linus Walleij <linus.walleij@...aro.org>
Cc:     Andrew Lunn <andrew@...n.ch>,
        Vivien Didelot <vivien.didelot@...il.com>,
        Florian Fainelli <f.fainelli@...il.com>,
        "David S . Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        netdev <netdev@...r.kernel.org>,
        Alvin Šipraga <alsi@...g-olufsen.dk>,
        Mauri Sandberg <sandberg@...lfence.com>,
        DENG Qingfang <dqfext@...il.com>
Subject: Re: [PATCH net-next 3/5 v2] net: dsa: rtl8366rb: Support disabling
 learning

On Tue, Sep 07, 2021 at 05:52:01PM +0200, Linus Walleij wrote:
> On Tue, Aug 31, 2021 at 12:40 AM Vladimir Oltean <olteanv@...il.com> wrote:
> 
> > >       /* Enable learning for all ports */
> > > -     ret = regmap_write(smi->map, RTL8366RB_SSCR0, 0);
> > > +     ret = regmap_write(smi->map, RTL8366RB_PORT_LEARNDIS_CTRL, 0);
> >
> > So the expected behavior for standalone ports would be to _disable_
> > learning. In rtl8366rb_setup, they are standalone.
> 
> OK I altered the code such that I disable learning on all ports instead,
> the new callback can be used to enable it.
> 
> What about the CPU port here? Sorry if it is a dumb question...
> 
> I just disabled learning on all ports including the CPU port, but
> should I just leave learning on on the CPU port?

Not a dumb question. DSA does not change the learning state of the CPU
port and lets drivers choose among two solutions:

(a) enable hardware address learning manually on it. This was the
    traditional approach, and the exact meaning of what this actually
    does will vary from one switch vendor to another, so you need to
    know what you get and if it complies with the network stack's expectations.
    At one end, you have ASICs where despite this setting, the {MAC SA,
    VLAN ID} will not be learned for packets injected towards a precise
    destination port specified in the DSA tag (so-called "control
    packets"). These ASICs would only learn the {MAC SA, VLAN ID} from
    packets sent from the CPU that are not injected precisely into a
    port, but forwarded freely according to the FDB (in the case of your
    Realtek tagging protocol, think of it as an all-zeroes destination
    port mask).  These are so-called "data plane packets", and DSA has
    traditionally not had any means for building/sending one, hence the
    complete lack of address learning for this type of switches,
    practically.
    On the other end you will have switches which will learn the {MAC
    SA, VLAN ID} from both control and data packets, with no way of
    controlling which one gets learned and which one doesn't.  This will
    further cause problems when you consider that some packets might
    need to be partially forwarded in software between switch ports that
    are members of a hardware bridge. In general, address learning must
    be recognized as a bridge layer function, and only packets
    pertaining to the data plane of a bridge must be subject to
    learning. Ergo: if you inject a packet into a standalone port, its
    MAC SA should not be learned (for reasons that have real life
    implications, but are a bit beside the point to detail here).
    There might also exist in-between hardware implementations, where
    there might be a global learning setting for the CPU port, with an
    override per packet (a bit in the DSA tag).

(b) implement .port_fdb_add and .port_fdb_del and then set
    ds->assisted_learning_on_cpu_port = true. This was created exactly
    for the category of switches that won't learn from CPU-injected
    traffic even when asked to do so, but has since been extended to
    cover use cases even for switches where that is a possibility.
    For example, in a setup with multiple CPU ports (and this includes
    obscure setups where every switch has a single CPU port, but is part
    of a multi-chip topology), hardware address learning on any
    individual CPU port does not make sense, because that learned
    address would keep bouncing back and forth between one CPU port and
    the other, when it really should have stayed fixed on both.
    The way this option works is that it tells DSA to listen for
    switchdev events emitted by the bridge for FDB entries that were
    learned by the software path and point towards the general direction
    of the bridge (towards the bridge itself, or towards a non-DSA
    interface in the same bridge as a DSA port, like a Wi-Fi AP).
    These software FDB entries will then be installed through .port_fdb_add
    as static FDB entries on the CPU port(s), and removed when they are no
    longer needed. Drivers which support multiple CPU ports should then
    program these FDB entries to hardware in a way in which installing
    the entry towards CPU port B will not override a pre-existing FDB
    entry that was pointing towards CPU port A, but instead just append
    to the destination port mask of any pre-existing FDB entry. This is
    also the pattern used when programming multicast (MDB) entries to
    hardware.
    This solution also addresses the observation that packets injected
    towards standalone ports should not result in the {MAC SA, VLAN ID}
    getting learned, due to a combination of two factors:
    - When you set ds->assisted_learning_on_cpu_port = true, you must
      disable hardware learning on the CPU port(s)
    - Since the only FDB entries installed on the CPU port are ones
      originating from switchdev events emitted by a bridge, learning is
      limited to the bridging layer

It is understandable if you do not want to opt into ds->assisted_learning_on_cpu_port,
and if the way in which your DSA tagging protocol injects packets into
the hardware will always remain as "control packets", this might not
even show up any problem. Typically, "control packets" will be sent to
the destination port mask from the DSA tag with no questions asked. That
is to say, if you want to xmit a packet with a given {MAC DA, VLAN ID}
towards port 0, but the same {MAC DA, VLAN ID} was already learned on
the CPU port, the switch will be happy to send it towards port 0
precisely because it's a "control packet" and it will not look up the
FDB for it. With "data packets" it's an entirely different story, the
FDB will be looked up, and the switch will say "hey, I received a packet
with this MAC DA from the CPU port, but my FDB says the destination is
the CPU port. I am told to not reflect packets back from where they came
from, so just drop it". If you are sure that neither you nor anyone else
will ever implement support for data plane packets (bridge tx_fwd_offload)
for this hardware, then statically enabling hardware learning on the CPU
port might be enough if it actually works.

There is a third option where you can still learn the {MAC SA, VLAN ID}
in-band from bridging layer packets (and not from packets sent to standalone ports),
without the overhead of extra SPI/MDIO/I2C transactions that the assisted
learning feature requires.
This third option is available if you can tell the switch to learn on a
per-packet basis, and then to implement the tx_fwd_offload bridge feature.
The idea is to tell the hardware, through the DSA tag on xmit, to only
learn the MAC SA from packets which are sent on behalf of a bridge
(these will have skb->offload_fwd_mark == true).

So in the end, you have quite a bit of choice. First of all I would
start by figuring out what the hardware is capable of doing.

Enable the hardware learning bit on the CPU port, then build this setup:

           192.168.100.3
                br0
               /   \
              /     \
             /       \
         swp0         swp1
          |            |
       Station A    Station B
    192.168.100.1   192.168.100.2


And ping br0 from Station A, attached to the swp0 of your board.

Then tcpdump on Station B, to see all packets.
If you see the ICMP requests sent by Station A towards br0, it means
that the switch has not learned the MAC SA from br0's ICMP replies, and
it basically doesn't know where the MAC address of br0 is. So it will
always flood the packets targeting br0, towards all ports in br0's
forwarding domain. "All ports" includes swp1 in this case, the curious
neighbor.

With address learning functioning properly on the CPU port, you will
only see an initial pair of packets being flooded, that being during a
time when the switch does not know what the intended destination is,
because it has not seen that MAC DA in the MAC SA field of any prior
packets on any ports.