Message-ID: <5f95d746-27d0-a7c9-9cff-0cc60b7c1c73@gmail.com>
Date:   Tue, 8 Sep 2020 11:41:13 -0600
From:   David Ahern <dsahern@...il.com>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     "David S . Miller" <davem@...emloft.net>,
        netdev <netdev@...r.kernel.org>,
        Eric Dumazet <eric.dumazet@...il.com>,
        Ben Greear <greearb@...delatech.com>
Subject: Re: [PATCH net] ipv6: avoid lockdep issue in fib6_del()

On 9/8/20 11:06 AM, Eric Dumazet wrote:
> On Tue, Sep 8, 2020 at 6:50 PM David Ahern <dsahern@...il.com> wrote:
>>
>> On 9/8/20 2:20 AM, Eric Dumazet wrote:
>>> syzbot reported twice a lockdep issue in fib6_del() [1]
>>> which I think is caused by net->ipv6.fib6_null_entry
>>> having a NULL fib6_table pointer.
>>>
>>> fib6_del() already checks for fib6_null_entry special
>>> case, we only need to return earlier.
>>>
>>> Bug seems to occur very rarely, I have thus chosen
>>> a 'bug origin' that makes backports not too complex.
>>>
>>> [1]
>>> WARNING: suspicious RCU usage
>>> 5.9.0-rc4-syzkaller #0 Not tainted
>>> -----------------------------
>>> net/ipv6/ip6_fib.c:1996 suspicious rcu_dereference_protected() usage!
>>>
>>> other info that might help us debug this:
>>>
>>> rcu_scheduler_active = 2, debug_locks = 1
>>> 4 locks held by syz-executor.5/8095:
>>>  #0: ffffffff8a7ea708 (rtnl_mutex){+.+.}-{3:3}, at: ppp_release+0x178/0x240 drivers/net/ppp/ppp_generic.c:401
>>>  #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: spin_trylock_bh include/linux/spinlock.h:414 [inline]
>>>  #1: ffff88804c422dd8 (&net->ipv6.fib6_gc_lock){+.-.}-{2:2}, at: fib6_run_gc+0x21b/0x2d0 net/ipv6/ip6_fib.c:2312
>>>  #2: ffffffff89bd6a40 (rcu_read_lock){....}-{1:2}, at: __fib6_clean_all+0x0/0x290 net/ipv6/ip6_fib.c:2613
>>>  #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:359 [inline]
>>>  #3: ffff8880a82e6430 (&tb->tb6_lock){+.-.}-{2:2}, at: __fib6_clean_all+0x107/0x290 net/ipv6/ip6_fib.c:2245
>>>
>>> stack backtrace:
>>> CPU: 1 PID: 8095 Comm: syz-executor.5 Not tainted 5.9.0-rc4-syzkaller #0
>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>> Call Trace:
>>>  __dump_stack lib/dump_stack.c:77 [inline]
>>>  dump_stack+0x198/0x1fd lib/dump_stack.c:118
>>>  fib6_del+0x12b4/0x1630 net/ipv6/ip6_fib.c:1996
>>>  fib6_clean_node+0x39b/0x570 net/ipv6/ip6_fib.c:2180
>>>  fib6_walk_continue+0x4aa/0x8e0 net/ipv6/ip6_fib.c:2102
>>>  fib6_walk+0x182/0x370 net/ipv6/ip6_fib.c:2150
>>>  fib6_clean_tree+0xdb/0x120 net/ipv6/ip6_fib.c:2230
>>>  __fib6_clean_all+0x120/0x290 net/ipv6/ip6_fib.c:2246
>>
>> This is walking a table and __fib6_clean_all takes the lock for the
>> table (and you can see that above), so puzzling how fib6_del can be
>> called for an entry with NULL fib6_table.
> 
> So you think the test for  (rt == net->ipv6.fib6_null_entry)
> should be replaced by
> 
> BUG_ON(rt == net->ipv6.fib6_null_entry); ?
> 

BUG_ON does not seem right.

Backing out to the callers, why does fib6_clean_node not catch that it
is the root of the table and abort the walk or at least not try to
remove the root? This might be related to the problem Ben has complained
about many times.

If syzbot has only triggered it a few times, then I presume there is no reproducer.
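
For readers following the thread, the "return earlier" idea from the quoted patch description can be sketched as a small userspace model. This is only an illustration under stated assumptions: the struct and function names below (fib6_table_model, fib6_info_model, net_model, fib6_del_model) are hypothetical stand-ins, not the kernel's definitions, and the body is not the actual patch. It shows only the ordering: test for the null entry before anything touches its table pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct fib6_table_model {
	int locked;			/* models "tb6_lock is held" */
};

struct fib6_info_model {
	struct fib6_table_model *table;	/* NULL for the null entry */
};

struct net_model {
	struct fib6_info_model *null_entry;
};

/* Model of a delete path that bails out before looking at rt->table. */
static int fib6_del_model(struct net_model *net, struct fib6_info_model *rt)
{
	assert(net != NULL && rt != NULL);

	/* Early return: the null entry has no table to check a lock on. */
	if (rt == net->null_entry)
		return -ENOENT;

	/* Only now is it safe to dereference the table pointer. */
	if (rt->table == NULL || !rt->table->locked)
		return -EINVAL;

	return 0;
}
```

In this model, passing the null entry yields -ENOENT without reading its NULL table pointer, while an ordinary entry whose table is marked locked yields 0. Whether the walk should reach the null entry at all is the open question raised above.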
