lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 27 Nov 2020 21:41:03 +0000
From:   Vladimir Oltean <vladimir.oltean@....com>
To:     Florian Fainelli <f.fainelli@...il.com>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "kuba@...nel.org" <kuba@...nel.org>,
        "andrew@...n.ch" <andrew@...n.ch>,
        "vivien.didelot@...il.com" <vivien.didelot@...il.com>
Subject: Re: [PATCH net] net: dsa: reference count the host mdb addresses

Hi Florian,

On Mon, Oct 19, 2020 at 07:11:47PM -0700, Florian Fainelli wrote:
> On 10/15/2020 2:27 PM, Vladimir Oltean wrote:
> > Currently any DSA switch that implements the multicast ops (properly,
> > that is) gets these errors after just sitting for a while, with at least
> > 2 ports bridged:
> >
> > [  286.013814] mscc_felix 0000:00:00.5 swp3: failed (err=-2) to del object (id=3)
> >
> > The reason has to do with this piece of code:
> >
> > 	netdev_for_each_lower_dev(dev, lower_dev, iter)
> > 		br_mdb_switchdev_host_port(dev, lower_dev, mp, type);
> >
> > called from:
> >
> > br_multicast_group_expired
> > -> br_multicast_host_leave
> >     -> br_mdb_notify
> >        -> br_mdb_switchdev_host
> >
> > Basically, that code is correct. It tells each switchdev port that the
> > host can leave that multicast group. But in the case of DSA, all user
> > ports are connected to the host through the same pipe. So, because DSA
> > translates a host MDB to a normal MDB on the CPU port, this means that
> > when all user ports leave a multicast group, DSA tries to remove it N
> > times from the CPU port.
> >
> > We should be reference-counting these addresses.
> >
> > Fixes: 5f4dbc50ce4d ("net: dsa: slave: Handle switchdev host mdb add/del")
> > Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
>
> This looks good to me, just one possible problem below:
>
> [snip]
>
> > +
> > +	a = kzalloc(sizeof(*a), GFP_KERNEL);
> > +	if (!a)
> > +		return -ENOMEM;
>
> I believe this needs to be GFP_ATOMIC if we are to follow
> net/bridge/br_mdb.c::br_mdb_notify which does its netlink messages
> allocations using GFP_ATOMIC. On a side note, it woul dbe very helpful if we
> could annotate all bridge functions accordingly with their context (BH,
> process, etc.).

We are not in atomic context here, since Andrew took care to use
SWITCHDEV_F_DEFER in br_mdb_switchdev_host_port(). Honestly, if he
wouldn't have done that, we would have been pretty toast anyway, since
we end up calling dsa_port_mdb_add(), which assumes it can sleep. And
deferring in DSA is super complicated, due to the fact that
SWITCHDEV_OBJ_ID_HOST_MDB is transactional.

So it's ok to use GFP_KERNEL, I'll leave this as-is when I respin.

Powered by blists - more mailing lists