Date:   Sun, 4 Apr 2021 19:24:53 +0100
From:   Matthew Wilcox <willy@...radead.org>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     syzbot <syzbot+dde0cc33951735441301@...kaller.appspotmail.com>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        syzkaller-bugs@...glegroups.com, viro@...iv.linux.org.uk,
        netdev@...r.kernel.org, tglx@...utronix.de, peterz@...radead.org,
        frederic@...nel.org
Subject: Re: Something is leaking RCU holds from interrupt context

On Sun, Apr 04, 2021 at 09:48:08AM -0700, Paul E. McKenney wrote:
> On Sun, Apr 04, 2021 at 11:24:57AM +0100, Matthew Wilcox wrote:
> > On Sat, Apr 03, 2021 at 09:15:17PM -0700, syzbot wrote:
> > > HEAD commit:    2bb25b3a Merge tag 'mips-fixes_5.12_3' of git://git.kernel..
> > > git tree:       upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1284cc31d00000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=78ef1d159159890
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=dde0cc33951735441301
> > > 
> > > Unfortunately, I don't have any reproducer for this issue yet.
> > > 
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+dde0cc33951735441301@...kaller.appspotmail.com
> > > 
> > > WARNING: suspicious RCU usage
> > > 5.12.0-rc5-syzkaller #0 Not tainted
> > > -----------------------------
> > > kernel/sched/core.c:8294 Illegal context switch in RCU-bh read-side critical section!
> > > 
> > > other info that might help us debug this:
> > > 
> > > 
> > > rcu_scheduler_active = 2, debug_locks = 0
> > > no locks held by systemd-udevd/4825.
> > 
> > I think we have something that's taking the RCU read lock in
> > (soft?) interrupt context and not releasing it properly in all
> > situations.  This thread doesn't have any locks recorded, but
> > lock_is_held(&rcu_bh_lock_map) is true.
> > 
> > Is there some debugging code that could find this?  eg should
> > lockdep_softirq_end() check that rcu_bh_lock_map is not held?
> > (if it's taken in process context, then BHs can't run, so if it's
> > held at softirq exit, then there's definitely a problem).
> 
> Something like the (untested) patch below?

Maybe?  Will this tell us who took the lock?  I was really trying to
throw out a suggestion in the hope that somebody who knows this area
better than I do would tell me I was wrong.

> Please note that it does not make sense to also check for
> either rcu_lock_map or rcu_sched_lock_map because either of
> these might be held by the interrupted code.

Yes!  Although if we do it somewhere like tasklet_action_common(),
we could do something like:

+++ b/kernel/softirq.c
@@ -774,6 +774,7 @@ static void tasklet_action_common(struct softirq_action *a,
 
        while (list) {
                struct tasklet_struct *t = list;
+               unsigned long rcu_lockdep = rcu_get_lockdep_state();
 
                list = list->next;
 
@@ -790,6 +791,10 @@ static void tasklet_action_common(struct softirq_action *a,
                        }
                        tasklet_unlock(t);
                }
+               if (rcu_lockdep != rcu_get_lockdep_state()) {
+                       printk(something useful about t);
+                       RCU_LOCKDEP_WARN(... something else useful ...);
+               }
 
                local_irq_disable();

where rcu_get_lockdep_state() returns a bitmap recording which of the
four RCU lockdep maps are currently held.
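
To make the snapshot-and-compare idea concrete, here is a userspace mock-up, not kernel code. rcu_get_lockdep_state() does not exist in the kernel; it is the hypothetical helper from the sketch above. The enum names, the mock_* helpers and run_and_check() are likewise invented for illustration. In a real patch each bit would presumably come from lock_is_held() on the corresponding RCU lockdep map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumption: one bit per RCU lockdep map; the names are illustrative. */
enum rcu_map {
	RCU_MAP,
	RCU_BH_MAP,
	RCU_SCHED_MAP,
	RCU_CALLBACK_MAP,
	RCU_MAP_COUNT
};

/* Stand-in for lockdep's notion of "held"; the kernel would ask lockdep. */
static bool mock_held[RCU_MAP_COUNT];

static void mock_acquire(enum rcu_map m)
{
	assert(m < RCU_MAP_COUNT);
	mock_held[m] = true;
}

static void mock_release(enum rcu_map m)
{
	assert(m < RCU_MAP_COUNT);
	mock_held[m] = false;
}

/* Hypothetical helper: bitmap of which maps are currently held. */
static unsigned long rcu_get_lockdep_state(void)
{
	unsigned long state = 0;

	for (int i = 0; i < RCU_MAP_COUNT; i++)
		if (mock_held[i])
			state |= 1UL << i;
	return state;
}

/* Run one handler; report it by name if it changed the RCU state. */
static bool run_and_check(const char *name, void (*fn)(void))
{
	unsigned long before = rcu_get_lockdep_state();

	fn();
	if (before != rcu_get_lockdep_state()) {
		printf("%s changed RCU lockdep state: %#lx -> %#lx\n",
		       name, before, rcu_get_lockdep_state());
		return true;
	}
	return false;
}

/* A handler that pairs its lock and unlock. */
static void balanced_handler(void)
{
	mock_acquire(RCU_BH_MAP);
	mock_release(RCU_BH_MAP);
}

/* A handler that forgets to unlock on some path. */
static void leaky_handler(void)
{
	mock_acquire(RCU_BH_MAP);
}
```

Comparing the state before and after each callback names the offending handler directly, without needing to know in advance which of the maps was leaked.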

We might also need something similar in __do_softirq(), in case it's
not a tasklet that's the problem.
