Date:   Tue, 8 Sep 2020 19:46:05 +0200
From:   Eric Dumazet <edumazet@...gle.com>
To:     David Ahern <dsahern@...il.com>
Cc:     "David S . Miller" <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>,
        Eric Dumazet <eric.dumazet@...il.com>,
        Ben Greear <greearb@...delatech.com>
Subject: Re: [PATCH net] ipv6: avoid lockdep issue in fib6_del()

On Tue, Sep 8, 2020 at 7:41 PM David Ahern <dsahern@...il.com> wrote:
>
> On 9/8/20 11:06 AM, Eric Dumazet wrote:
> > On Tue, Sep 8, 2020 at 6:50 PM David Ahern <dsahern@...il.com> wrote:
> >>
> >> On 9/8/20 2:20 AM, Eric Dumazet wrote:
> >>> syzbot reported twice a lockdep issue in fib6_del() [1]
> >>> which I think is caused by net->ipv6.fib6_null_entry
> >>> having a NULL fib6_table pointer.
> >>>
> >>> fib6_del() already checks for fib6_null_entry special
> >>> case, we only need to return earlier.
> >>>
> >>> Bug seems to occur very rarely, I have thus chosen
> >>> a 'bug origin' that makes backports not too complex.
> >>>
> >>> [1]
> >>> WARNING: suspicious RCU usage
> >>> 5.9.0-rc4-syzkaller #0 Not tainted
> >>> -----------------------------
> >>> net/ipv6/ip6_fib.c:1996 suspicious rcu_dereference_protected() usage!
> >>>
> >>> other info that might help us debug this:
> >>>
> >>> rcu_scheduler_active = 2, debug_locks = 1
> >>> 4 locks held by syz-executor.5/8095:
> >>>  #0: ffffffff8a7ea708 (rtnl_mutex){+.+.}-{3:3}, at: ppp_release+0x178/0x240 drivers/net/ppp/ppp_generic.c:401
> >>>  #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: spin_trylock_bh include/linux/spinlock.h:414 [inline]
> >>>  #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: fib6_run_gc+0x21b/0x2d0 net/ipv6/ip6_fib.c:2312
> >>>  #2: ffffffff89bd6a40 (rcu_read_lock){....}-{1:2}, at: __fib6_clean_all+0x0/0x290 net/ipv6/ip6_fib.c:2613
> >>>  #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:359 [inline]
> >>>  #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: __fib6_clean_all+0x107/0x290 net/ipv6/ip6_fib.c:2245
> >>>
> >>> stack backtrace:
> >>> CPU: 1 PID: 8095 Comm: syz-executor.5 Not tainted 5.9.0-rc4-syzkaller #0
> >>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> >>> Call Trace:
> >>>  __dump_stack lib/dump_stack.c:77 [inline]
> >>>  dump_stack+0x198/0x1fd lib/dump_stack.c:118
> >>>  fib6_del+0x12b4/0x1630 net/ipv6/ip6_fib.c:1996
> >>>  fib6_clean_node+0x39b/0x570 net/ipv6/ip6_fib.c:2180
> >>>  fib6_walk_continue+0x4aa/0x8e0 net/ipv6/ip6_fib.c:2102
> >>>  fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2150
> >>>  fib6_clean_tree+0xdb/0x120 net/ipv6/ip6_fib.c:2230
> >>>  __fib6_clean_all+0x120/0x290 net/ipv6/ip6_fib.c:2246
> >>
> >> This is walking a table and __fib6_clean_all takes the lock for the
> >> table (and you can see that above), so puzzling how fib6_del can be
> >> called for an entry with NULL fib6_table.
> >
> > So you think the test for (rt == net->ipv6.fib6_null_entry)
> > should be replaced by
> >
> > BUG_ON(rt == net->ipv6.fib6_null_entry); ?
> >
>
> BUG_ON does not seem right.

Yes, currently we return -ENOENT, which looks reasonable.


>
> Backing out to the callers, why does fib6_clean_node not catch that it
> is the root of the table and abort the walk or at least not try to
> remove the root? This might be related to the problem Ben has complained
> about many times.
>
> If syzbot has only triggered it a few times then I presume no reproducer.

No repro, only occurred twice...
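
For readers following the thread, the "return earlier" idea can be sketched in standalone C. This is only an illustration of the ordering being discussed, not the kernel code and not the actual patch: every `sketch_*` name and field below is a hypothetical stand-in, and the `assert()` merely imitates the lockdep-checked dereference that produced the warning.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sketch_table {
	bool lock_held;              /* models tb6_lock being held by the caller */
};

struct sketch_rt {
	struct sketch_table *table;  /* NULL for the sentinel entry */
	bool linked;
};

struct sketch_net {
	struct sketch_rt *null_entry; /* models net->ipv6.fib6_null_entry */
};

/*
 * Assumed ordering: test for the sentinel first, and only then touch
 * rt->table. The sentinel has no table, so a lock check that goes
 * through rt->table cannot succeed for it.
 */
static int sketch_fib6_del(struct sketch_net *net, struct sketch_rt *rt)
{
	if (rt == net->null_entry)
		return -ENOENT;      /* same error the thread says is returned today */

	/* Stand-in for the lockdep-checked dereference. */
	assert(rt->table != NULL && rt->table->lock_held);

	rt->linked = false;          /* placeholder for the real unlink work */
	return 0;
}
```

With this ordering the sentinel is rejected before anything depends on its table pointer, while an ordinary entry still goes through the lock check.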
