lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240214164559.njyaoscx2e22esep@skbuf>
Date: Wed, 14 Feb 2024 18:45:59 +0200
From: Vladimir Oltean <olteanv@...il.com>
To: Tobias Waldekranz <tobias@...dekranz.com>
Cc: davem@...emloft.net, kuba@...nel.org, atenart@...nel.org,
	roopa@...dia.com, razor@...ckwall.org, bridge@...ts.linux.dev,
	netdev@...r.kernel.org, jiri@...nulli.us, ivecera@...hat.com
Subject: Re: [PATCH v4 net 1/2] net: bridge: switchdev: Skip MDB replays of
 deferred events on offload

On Mon, Feb 12, 2024 at 08:18:43PM +0100, Tobias Waldekranz wrote:
> Before this change, generation of the list of MDB events to replay
> would race against the creation of new group memberships, either from
> the IGMP/MLD snooping logic or from user configuration.
> 
> While new memberships are immediately visible to walkers of
> br->mdb_list, the notification of their existence to switchdev event
> subscribers is deferred until a later point in time. So if a replay
> list was generated during a time that overlapped with such a window,
> it would also contain a replay of the not-yet-delivered event.
> 
> The driver would thus receive two copies of what the bridge internally
> considered to be one single event. On destruction of the bridge, only
> a single membership deletion event was therefore sent. As a
> consequence of this, drivers which reference count memberships (at
> least DSA), would be left with orphan groups in their hardware
> database when the bridge was destroyed.
> 
> This is only an issue when replaying additions. While deletion events
> may still be pending on the deferred queue, they will already have
> been removed from br->mdb_list, so no duplicates can be generated in
> that scenario.
> 
> To a user this meant that old group memberships, from a bridge in
> which a port was previously attached, could be reanimated (in
> hardware) when the port joined a new bridge, without the new bridge's
> knowledge.
> 
> For example, on an mv88e6xxx system, create a snooping bridge and
> immediately add a port to it:
> 
>     root@...ix-06-0b-00:~$ ip link add dev br0 up type bridge mcast_snooping 1 && \
>     > ip link set dev x3 up master br0
> 
> And then destroy the bridge:
> 
>     root@...ix-06-0b-00:~$ ip link del dev br0
>     root@...ix-06-0b-00:~$ mvls atu
>     ADDRESS             FID  STATE      Q  F  0  1  2  3  4  5  6  7  8  9  a
>     DEV:0 Marvell 88E6393X
>     33:33:00:00:00:6a     1  static     -  -  0  .  .  .  .  .  .  .  .  .  .
>     33:33:ff:87:e4:3f     1  static     -  -  0  .  .  .  .  .  .  .  .  .  .
>     ff:ff:ff:ff:ff:ff     1  static     -  -  0  1  2  3  4  5  6  7  8  9  a
>     root@...ix-06-0b-00:~$
> 
> The two IPv6 groups remain in the hardware database because the
> port (x3) is notified of the host's membership twice: once via the
> original event and once via a replay. Since only a single delete
> notification is sent, the count remains at 1 when the bridge is
> destroyed.
> 
> Then add the same port (or another port belonging to the same hardware
> domain) to a new bridge, this time with snooping disabled:
> 
>     root@...ix-06-0b-00:~$ ip link add dev br1 up type bridge mcast_snooping 0 && \
>     > ip link set dev x3 up master br1
> 
> All multicast, including the two IPv6 groups from br0, should now be
> flooded, according to the policy of br1. But instead the old
> memberships are still active in the hardware database, causing the
> switch to only forward traffic to those groups towards the CPU (port
> 0).
> 
> Eliminate the race in two steps:
> 
> 1. Grab the write-side lock of the MDB while generating the replay
>    list.
> 
> This prevents new memberships from showing up while we are generating
> the replay list. But it leaves the scenario in which a deferred event
> was already generated, but not delivered, before we grabbed the
> lock. Therefore:
> 
> 2. Make sure that no deferred version of a replay event is already
>    enqueued to the switchdev deferred queue, before adding it to the
>    replay list, when replaying additions.
> 
> Fixes: 4f2673b3a2b6 ("net: bridge: add helper to replay port and host-joined mdb entries")
> Signed-off-by: Tobias Waldekranz <tobias@...dekranz.com>
> ---

Excellent from my side, thank you!

Reviewed-by: Vladimir Oltean <olteanv@...il.com>

> @@ -307,6 +336,50 @@ int switchdev_port_obj_del(struct net_device *dev,
>  }
>  EXPORT_SYMBOL_GPL(switchdev_port_obj_del);
>  
> +/**
> + *	switchdev_port_obj_act_is_deferred - Is object action pending?
> + *
> + *	@dev: port device
> + *	@nt: type of action; add or delete
> + *	@obj: object to test
> + *
> + *	Returns true if a deferred item is exists, which is equivalent
> + *	to the action @nt of an object @obj.

nitpick: replace "is exists" with something else like "is pending" or
"exists".

Also "action of an object" or "on an object"?

> + *
> + *	rtnl_lock must be held.
> + */

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ