lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 19 Oct 2020 19:11:47 -0700
From:   Florian Fainelli <f.fainelli@...il.com>
To:     Vladimir Oltean <vladimir.oltean@....com>, netdev@...r.kernel.org
Cc:     kuba@...nel.org, andrew@...n.ch, vivien.didelot@...il.com
Subject: Re: [PATCH net] net: dsa: reference count the host mdb addresses



On 10/15/2020 2:27 PM, Vladimir Oltean wrote:
> Currently any DSA switch that implements the multicast ops (properly,
> that is) gets these errors after just sitting for a while, with at least
> 2 ports bridged:
> 
> [  286.013814] mscc_felix 0000:00:00.5 swp3: failed (err=-2) to del object (id=3)
> 
> The reason has to do with this piece of code:
> 
> 	netdev_for_each_lower_dev(dev, lower_dev, iter)
> 		br_mdb_switchdev_host_port(dev, lower_dev, mp, type);
> 
> called from:
> 
> br_multicast_group_expired
> -> br_multicast_host_leave
>     -> br_mdb_notify
>        -> br_mdb_switchdev_host
> 
> Basically, that code is correct. It tells each switchdev port that the
> host can leave that multicast group. But in the case of DSA, all user
> ports are connected to the host through the same pipe. So, because DSA
> translates a host MDB to a normal MDB on the CPU port, this means that
> when all user ports leave a multicast group, DSA tries to remove it N
> times from the CPU port.
> 
> We should be reference-counting these addresses.
> 
> Fixes: 5f4dbc50ce4d ("net: dsa: slave: Handle switchdev host mdb add/del")
> Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>

This looks good to me, just one possible problem below:

[snip]

> +
> +	a = kzalloc(sizeof(*a), GFP_KERNEL);
> +	if (!a)
> +		return -ENOMEM;

I believe this needs to be GFP_ATOMIC if we are to follow 
net/bridge/br_mdb.c::br_mdb_notify which does its netlink messages 
allocations using GFP_ATOMIC. On a side note, it woul dbe very helpful 
if we could annotate all bridge functions accordingly with their context 
(BH, process, etc.).
-- 
Florian

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ