Date:   Mon, 12 Apr 2021 11:36:45 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     "tip-bot2 for Paul E. McKenney" <tip-bot2@...utronix.de>,
        linux-tip-commits@...r.kernel.org,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Uladzislau Rezki <urezki@...il.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Frederic Weisbecker <frederic@...nel.org>, x86@...nel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [tip: core/rcu] softirq: Don't try waking ksoftirqd before it
 has been spawned

On Mon, Apr 12, 2021 at 04:16:55PM +0200, Thomas Gleixner wrote:
> On Sun, Apr 11 2021 at 13:43, tip-bot wrote:
> > The following commit has been merged into the core/rcu branch of tip:
> >
> > Commit-ID:     1c0c4bc1ceb580851b2d76fdef9712b3bdae134b
> > Gitweb:        https://git.kernel.org/tip/1c0c4bc1ceb580851b2d76fdef9712b3bdae134b
> > Author:        Paul E. McKenney <paulmck@...nel.org>
> > AuthorDate:    Fri, 12 Feb 2021 16:20:40 -08:00
> > Committer:     Paul E. McKenney <paulmck@...nel.org>
> > CommitterDate: Mon, 15 Mar 2021 13:51:48 -07:00
> >
> > softirq: Don't try waking ksoftirqd before it has been spawned
> >
> > If there is heavy softirq activity, the softirq system will attempt
> > to awaken ksoftirqd and will stop the traditional back-of-interrupt
> > softirq processing.  This is all well and good, but only if the
> > ksoftirqd kthreads already exist, which is not the case during early
> > boot, in which case the system hangs.
> >
> > One reproducer is as follows:
> >
> > tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 2 --configs "TREE03" --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y CONFIG_NO_HZ_IDLE=y CONFIG_HZ_PERIODIC=n" --bootargs "threadirqs=1" --trust-make
> >
> > This commit therefore adds a couple of existence checks for ksoftirqd
> > and forces back-of-interrupt softirq processing when ksoftirqd does not
> > yet exist.  With this change, the above test passes.
> 
> Color me confused. I did not follow the discussion around this
> completely, but wasn't it agreed on that this rcu torture muck can wait
> until the threads are brought up?

Yes, we can cause rcutorture to wait.  But in this case, rcutorture
is just the messenger, and making it wait would simply be ignoring
the message.  The message is that someone could invoke any number of
things that wait on a softirq handler's invocation during the interval
before ksoftirqd has been spawned.

We looked at spawning the ksoftirqd kthreads earlier, but that just
narrows the window -- it doesn't eliminate the problem.

We considered adding a check for this condition in order to yell at
people who invoke things that rely heavily on softirq during this time,
but there are perfectly legitimate use cases where it is OK for the
softirq handlers to just sit there until ksoftirqd is spawned.  The
problem isn't doing a raise_softirq(), but instead waiting on the
corresponding handler to complete.
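
The distinction above can be sketched with a toy user-space model in C.
This is not kernel code: the struct and every model_* name are
hypothetical, and waking ksoftirqd is simplified to running the handler
immediately.  It assumes, per the commit log, that with forced irq
threading the interrupt-exit path defers to ksoftirqd, and that the
patch falls back to inline processing when the per-CPU ksoftirqd does
not yet exist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one CPU's softirq state (hypothetical, not a kernel type). */
struct cpu_model {
	bool force_irqthreads;  /* stands in for booting with threadirqs=1 */
	bool ksoftirqd_spawned; /* stands in for a non-NULL per-CPU ksoftirqd */
	bool softirq_pending;   /* set by model_raise_softirq() */
	int handled;            /* number of handler invocations so far */
};

/* Raising a softirq only marks it pending; this is always harmless. */
static void model_raise_softirq(struct cpu_model *c)
{
	assert(c != NULL);
	c->softirq_pending = true;
}

/* Run the pending handler, if any. */
static void model_run_handlers(struct cpu_model *c)
{
	if (c->softirq_pending) {
		c->softirq_pending = false;
		c->handled++;
	}
}

/* Interrupt exit: run inline, or defer to ksoftirqd (simplified). */
static void model_irq_exit(struct cpu_model *c, bool patched)
{
	bool run_inline = !c->force_irqthreads;

	/* The patched condition also runs inline when no ksoftirqd exists. */
	if (patched && !c->ksoftirqd_spawned)
		run_inline = true;

	if (run_inline || c->ksoftirqd_spawned)
		model_run_handlers(c); /* inline, or "woken" ksoftirqd */
	/* Otherwise nothing runs and the softirq stays pending. */
}

/* Spawning ksoftirqd lets it pick up whatever was left pending. */
static void model_ksoftirqd_spawn(struct cpu_model *c)
{
	c->ksoftirqd_spawned = true;
	model_run_handlers(c);
}

/* Does a handler raised before the spawn run at the next irq exit? */
static bool model_handler_ran_before_spawn(bool force_irqthreads, bool patched)
{
	struct cpu_model c = { force_irqthreads, false, false, 0 };

	model_raise_softirq(&c);
	model_irq_exit(&c, patched);
	return c.handled > 0;
}
```

In this model, with force_irqthreads set and no ksoftirqd, the unpatched
path leaves the softirq pending until model_ksoftirqd_spawn() is called.
That is fine for a caller that merely raised it, but a caller that waits
on the handler before the spawn would never make progress.  The patched
path runs the handler inline instead.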

We didn't see any reasonable false-positive-free way to create a reliable
diagnostic for that case, possibly due to a lack of imagination on
our part.

Ideas?

> > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > index 9908ec4..bad14ca 100644
> > --- a/kernel/softirq.c
> > +++ b/kernel/softirq.c
> > @@ -211,7 +211,7 @@ static inline void invoke_softirq(void)
> >  	if (ksoftirqd_running(local_softirq_pending()))
> >  		return;
> >  
> > -	if (!force_irqthreads) {
> > +	if (!force_irqthreads || !__this_cpu_read(ksoftirqd)) {
> >  #ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
> >  		/*
> >  		 * We can safely execute softirq on the current stack if
> 
> This still breaks RT which forces force_irqthreads to a compile time
> const which makes the compiler optimize out the direct invocation.
> 
> Surely RT can work around that, but how is that rcu torture muck
> supposed to work then? We're back to square one then.

Ah.  So RT relies on softirq handlers never ever being directly invoked,
even during boot time?  I was not aware of that.

OK, I will bite...  What are the RT workarounds for this case?  Maybe
they apply more generally.

							Thanx, Paul
