lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sun, 25 Apr 2021 19:45:27 +0300
From:   Nikolay Aleksandrov <nikolay@...dia.com>
To:     Taehee Yoo <ap420073@...il.com>, davem@...emloft.net,
        kuba@...nel.org, dsahern@...nel.org, yoshfuji@...ux-ipv6.org,
        netdev@...r.kernel.org, j.vosburgh@...il.com, vfalico@...il.com,
        andy@...yhouse.net, roopa@...dia.com, ast@...nel.org,
        andriin@...com, daniel@...earbox.net, weiwan@...gle.com,
        cong.wang@...edance.com, bjorn@...nel.org,
        herbert@...dor.apana.org.au, bridge@...ts.linux-foundation.org
Subject: Re: [PATCH net 2/2] net: bridge: fix lockdep multicast_lock false
 positive splat

On 25/04/2021 18:57, Taehee Yoo wrote:
> multicast_lock is a per-interface(bridge) lock.
> This lock can be used recursively because interfaces can be used
> recursively. So, it should use spin_lock_nested() but it doesn't.
> So lockdep false positive splat occurred.
> 
> Some inline helper functions are added.
> These functions internally get 'subclass' variable, which is used as
> parameter of spin_lock_nested() and use spin_lock_nested() with a
> subclass parameter.
> 
> Test commands:
>     ip link add br0 type bridge
>     ip link add bond0 type bond
>     ip link add br1 type bridge
>     ip link set br0 master bond0
>     ip link set bond0 up
>     ip link set bond0 master br1
>     ip link set br0 up
>     ip link set br1 up
>     ip link set br0 type bridge mcast_router 1 mcast_querier 1
>     ip link set br1 type bridge mcast_querier 1 mcast_router 1
> 
> Splat looks like:
> ============================================
> WARNING: possible recursive locking detected
> 5.12.0-rc7+ #855 Not tainted
> --------------------------------------------
> kworker/5:1/56 is trying to acquire lock:
> ffff88810f833000 (&br->multicast_lock){+.-.}-{2:2}, at:
> br_multicast_rcv+0x1484/0x5280 [bridge]
> 
[snip]
> 
> Fixes: eb1d16414339 ("bridge: Add core IGMP snooping support")
> Signed-off-by: Taehee Yoo <ap420073@...il.com>
> ---
> 
> v2:
>  - No change
> 
>  net/bridge/br_mdb.c           |  12 +--
>  net/bridge/br_multicast.c     | 146 +++++++++++++++++++++-------------
>  net/bridge/br_multicast_eht.c |  18 +++--
>  net/bridge/br_private.h       |  48 +++++++++++
>  4 files changed, 158 insertions(+), 66 deletions(-)
> 

Hi Taehee,
Ugh.. that's just very ugly. :) The setup you've described above is by all means invalid, but
possible unfortunately. The bridge already checks if it's being added as a port to another
bridge, but not through multiple levels of indirection. These locks are completely unrelated
as they're in very different contexts (different devices).

At the very least please push the rcu_read_lock() calls in br_multicast_lock_rcu/_bh() 
as they're needed only to get the nest level for netdev_get_nest_level_rcu(), we don't need
them for the whole code paths (right ?), we could save a few lines in the process and
avoid confusion about the locking rules for those code paths.

I wish there was a better solution.

Thanks,
 Nik

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ