Message-ID: <3b25fcd8-ec7e-40ba-8432-e1d489b12875@blackwall.org>
Date: Sat, 6 Sep 2025 21:16:03 +0300
From: Nikolay Aleksandrov <razor@...ckwall.org>
To: Petr Machata <petrm@...dia.com>, "David S. Miller" <davem@...emloft.net>,
 Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
 Paolo Abeni <pabeni@...hat.com>, netdev@...r.kernel.org
Cc: Simon Horman <horms@...nel.org>, Ido Schimmel <idosch@...dia.com>,
 bridge@...ts.linux.dev, mlxsw@...dia.com
Subject: Re: [PATCH net-next 00/10] bridge: Allow keeping local FDB entries
 only on VLAN 0

On 9/4/25 20:07, Petr Machata wrote:
> The bridge FDB contains one local entry per port per VLAN, for the MAC of
> the port in question, and likewise for the bridge itself. This allows
> bridge to locally receive and punt "up" any packets whose destination MAC
> address matches that of one of the bridge interfaces or of the bridge
> itself.
> 
> The number of these local "service" FDB entries grows linearly with number
> of bridge-global VLAN memberships, but that in turn will tend to grow
> quadratically with number of ports and per-port VLAN memberships. While
> that does not cause issues during forwarding lookups, it does make dumps
> impractically slow.
> 
> As an example, with 100 interfaces, each on 4K VLANs, a full dump of FDB
> that just contains these 400K local entries, takes 6.5s. That's _without_
> considering iproute2 formatting overhead, this is just how long it takes to
> walk the FDB (repeatedly), serialize it into netlink messages, and parse
> the messages back in userspace.
> 
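As a back-of-the-envelope check of the figures quoted above, here is a minimal sketch of how the number of local entries scales. The function names are hypothetical and not from the patchset. It assumes every port is a member of the same number of VLANs, takes "4K" as 4000, and ignores the bridge device's own entries for simplicity.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative arithmetic only; these helpers are hypothetical and not part
 * of the kernel. Uniform VLAN membership across ports is assumed.
 */

/* Current behavior: one local FDB entry per port per VLAN membership. */
uint64_t toy_local_entries_per_vlan(uint64_t ports, uint64_t vlans_per_port)
{
	return ports * vlans_per_port;
}

/* With local entries kept only on VLAN 0: one local entry per port. */
uint64_t toy_local_entries_vlan0_only(uint64_t ports)
{
	return ports;
}

/* Checks the 100-port, 4K-VLAN example from the cover letter. */
int toy_count_demo(void)
{
	assert(toy_local_entries_per_vlan(100, 4000) == 400000);
	assert(toy_local_entries_vlan0_only(100) == 100);
	return 0;
}
```

Under these assumptions the per-port local entries drop from 400000 to 100, which is the reduction in dump size the series is after.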
> This is to illustrate that with growing number of ports and VLANs, the time
> required to dump this repetitive information blows up. Arguably 4K VLANs
> per interface is not a very realistic configuration, but then modern
> switches can instead have several hundred interfaces, and we have fielded
> requests for >1K VLAN memberships per port among customers.
> 
[snip]
> All this FDB duplication is there merely to make things snappy during
> forwarding. But high-radix switches with thousands of VLANs typically do
> not process much traffic in the SW datapath at all, but rather offload the
> vast majority of it. So we could exchange some of the runtime performance
> for a neater FDB.
> 
> To that end, in this patchset, introduce a new bridge option,
> BR_BOOLOPT_FDB_LOCAL_VLAN_0, which when enabled, has local FDB entries
> installed only on VLAN 0, instead of duplicating them across all VLANs.
> Then to maintain the local termination behavior, on FDB miss, the bridge
> does a second lookup on VLAN 0.
> 
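The second lookup on FDB miss described above can be sketched roughly as follows. This is a toy model, not the code from the series: the real bridge FDB is a hash table and none of the `toy_*` names exist in the kernel. The check that only local entries may match on the VLAN 0 pass is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy FDB entry keyed by {MAC, VLAN}; hypothetical, for illustration only. */
struct toy_fdb_entry {
	uint8_t  mac[6];
	uint16_t vid;
	bool     is_local;
};

/* Exact-match lookup on {mac, vid}; returns NULL on miss. */
static const struct toy_fdb_entry *
toy_fdb_find(const struct toy_fdb_entry *tbl, size_t n,
	     const uint8_t *mac, uint16_t vid)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].vid == vid && memcmp(tbl[i].mac, mac, 6) == 0)
			return &tbl[i];
	return NULL;
}

/* Lookup that, when the option is on, retries on VLAN 0 after a miss. */
static const struct toy_fdb_entry *
toy_fdb_lookup(const struct toy_fdb_entry *tbl, size_t n,
	       const uint8_t *mac, uint16_t vid, bool local_vlan_0)
{
	const struct toy_fdb_entry *e = toy_fdb_find(tbl, n, mac, vid);

	if (!e && local_vlan_0 && vid != 0) {
		e = toy_fdb_find(tbl, n, mac, 0);
		/* Assumption: only local entries terminate traffic this way. */
		if (e && !e->is_local)
			e = NULL;
	}
	return e;
}

/* Small self-check: a port MAC installed on VLAN 0 only, queried on VLAN 10. */
int toy_demo(void)
{
	static const struct toy_fdb_entry tbl[] = {
		{ { 0x02, 0, 0, 0, 0, 0x01 }, 0,  true  }, /* local, VLAN 0 */
		{ { 0x02, 0, 0, 0, 0, 0xaa }, 10, false }, /* learned, VLAN 10 */
	};
	static const uint8_t port_mac[6] = { 0x02, 0, 0, 0, 0, 0x01 };
	const size_t n = sizeof(tbl) / sizeof(tbl[0]);

	/* Option off: no entry for the port MAC on VLAN 10, so a miss. */
	assert(toy_fdb_lookup(tbl, n, port_mac, 10, false) == NULL);
	/* Option on: the miss falls back to the VLAN 0 local entry. */
	assert(toy_fdb_lookup(tbl, n, port_mac, 10, true) == &tbl[0]);
	return 0;
}
```

The extra cost is one additional lookup, and only on a miss, which is the runtime-for-neatness trade described in the cover letter.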
> Enabling this option changes the bridge behavior in expected ways. Since
> the entries are only kept on VLAN 0, FDB get, flush and dump will not
> perceive them on non-0 VLANs. And deleting the VLAN 0 entry affects
> forwarding on all VLANs.
> 
> This patchset is loosely based on a privately circulated patch by Nikolay
> Aleksandrov.
> 

I knew this sounded familiar. I actually tried to upstream the original patch[1] way back
in 2015 and it was rejected; at the time, that led to the vlan rhashtable code. :-)

By the way, the original idea and change predate me and were by Wilson Kok; I just polished
them and took over the patch while at Cumulus.

Now this is presented in a much shinier manner, as a new option with selftests, which is great.
I think we can take the new option this time around; it will be very helpful for some
setups, as explained.

The code looks good to me, I appreciate how well split it is.
For the series:

Acked-by: Nikolay Aleksandrov <razor@...ckwall.org>

Thanks,
  Nik

[1] https://lore.kernel.org/netdev/1440549295-3979-1-git-send-email-razor@blackwall.org/

> The patchset progresses as follows:
> 
> - Patch #1 introduces a bridge option to enable the above feature. Then
>    patches #2 to #5 gradually patch the bridge to do the right thing when
>    the option is enabled. Finally patch #6 adds the UAPI knob and the code
>    for when the feature is enabled or disabled.
> - Patches #7, #8 and #9 contain fixes and improvements to selftest
>    libraries
> - Patch #10 contains a new selftest
> 
> The corresponding iproute2 support is at:
> https://github.com/pmachata/iproute2/commits/fdb_local_vlan_0/
> 
> Petr Machata (10):
>    net: bridge: Introduce BROPT_FDB_LOCAL_VLAN_0
>    net: bridge: BROPT_FDB_LOCAL_VLAN_0: Look up FDB on VLAN 0 on miss
>    net: bridge: BROPT_FDB_LOCAL_VLAN_0: On port changeaddr, skip per-VLAN
>      FDBs
>    net: bridge: BROPT_FDB_LOCAL_VLAN_0: On bridge changeaddr, skip
>      per-VLAN FDBs
>    net: bridge: BROPT_FDB_LOCAL_VLAN_0: Skip local FDBs on VLAN creation
>    net: bridge: Introduce UAPI for BR_BOOLOPT_FDB_LOCAL_VLAN_0
>    selftests: defer: Allow spaces in arguments of deferred commands
>    selftests: defer: Introduce DEFER_PAUSE_ON_FAIL
>    selftests: net: lib.sh: Don't defer failed commands
>    selftests: forwarding: Add test for BR_BOOLOPT_FDB_LOCAL_VLAN_0
> 
>   include/uapi/linux/if_bridge.h                |   3 +
>   net/bridge/br.c                               |  22 ++
>   net/bridge/br_fdb.c                           | 114 +++++-
>   net/bridge/br_input.c                         |   8 +
>   net/bridge/br_private.h                       |   3 +
>   net/bridge/br_vlan.c                          |  10 +-
>   .../testing/selftests/net/forwarding/Makefile |   1 +
>   .../net/forwarding/bridge_fdb_local_vlan_0.sh | 374 ++++++++++++++++++
>   tools/testing/selftests/net/lib.sh            |  32 +-
>   tools/testing/selftests/net/lib/sh/defer.sh   |  20 +-
>   10 files changed, 559 insertions(+), 28 deletions(-)
>   create mode 100755 tools/testing/selftests/net/forwarding/bridge_fdb_local_vlan_0.sh
> 

