Message-ID: <20210219183336.GA23049@paulmck-ThinkPad-P72>
Date:   Fri, 19 Feb 2021 10:33:36 -0800
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Uladzislau Rezki <urezki@...il.com>
Cc:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>, RCU <rcu@...r.kernel.org>,
        Michael Ellerman <mpe@...erman.id.au>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Daniel Axtens <dja@...ens.net>,
        Frederic Weisbecker <frederic@...nel.org>,
        Neeraj Upadhyay <neeraju@...eaurora.org>,
        Joel Fernandes <joel@...lfernandes.org>,
        Michal Hocko <mhocko@...e.com>,
        "Theodore Y . Ts'o" <tytso@....edu>,
        Oleksiy Avramchenko <oleksiy.avramchenko@...ymobile.com>
Subject: Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization

On Fri, Feb 19, 2021 at 10:18:11AM -0800, Paul E. McKenney wrote:
> On Fri, Feb 19, 2021 at 12:27:51PM +0100, Uladzislau Rezki wrote:
> > On Fri, Feb 19, 2021 at 12:23:57PM +0100, Uladzislau Rezki wrote:
> > > On Fri, Feb 19, 2021 at 12:17:38PM +0100, Sebastian Andrzej Siewior wrote:
> > > > On 2021-02-19 12:13:01 [+0100], Uladzislau Rezki wrote:
> > > > > I or Paul will ask for a test once it is settled down :) Looks like
> > > > > it is, so we should fix for v5.12.
> > > > 
> > > > Okay. Since Paul asked for a powerpc test on v5.11-rc, I wanted to check
> > > > whether parts of it are also -stable material.
> 
> If Masami's patch works for the PowerPC guys on v5.10-rc7, then it can
> be backported.  The patch making RCU Tasks initialize itself early won't
> have any effect and can be left or reverted, as we choose.  The self-test
> patch will need to be either adjusted or reverted.
> 
> However...
> 
> The root cause of this problem is that softirq only kind-of works
> during a window of time during boot.  It works only if the number and
> duration of softirq handlers during this time is small enough, for some
> ill-defined notion of "small enough".  If there are too many, whatever
> that means exactly, then we get a failed attempt to awaken ksoftirqd, which
> (sometimes!) results in a silent hang.  Which, as you pointed out earlier,
> is a really obnoxious error message.  And any minor change could kick
> us into silent-hang state because of the heuristics used to hand off
> to ksoftirqd.  The straw that broke the camel's back and all that.
> 
> One approach would be to add WARN_ON_ONCE() so that if softirq tries
> to awaken ksoftirqd before it is spawned, we get a nice obvious splat.
> Unfortunately, this gives false positives because there is code that
> needs a softirq handler to run eventually, but is OK with that handler
> being delayed until some random point in the early_initcall() sequence.
> 
> Besides which, if we are going to add a check, why not use that check
> to just make things work by forcing handler execution to remain within the
> softirq back-of-interrupt context instead of awakening a not-yet-spawned
> ksoftirqd?  We can further prevent entry into dyntick-idle state until
> the ksoftirqd kthreads have been spawned, which means that if softirq
> handlers must be deferred, they will be resumed within one jiffy by the
> next scheduler-clock interrupt.
> 
> Yes, this can allow softirq handlers to impose large latencies, but only
> during early boot, long before any latency-sensitive applications can
> possibly have been created.  So this does not seem like a real problem.
> 
> Am I missing something here?
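
To make the failure mode described above concrete, here is a minimal
user-space sketch, not kernel code.  The names model_drain(), struct
drain_result, and MODEL_MAX_RESTART are hypothetical and exist only for
this illustration.  The model assumes that each pass of the loop handles
exactly one work item and it ignores the time limit and need_resched(),
so the restart budget is the only thing that ends inline processing.

```c
#include <stdbool.h>

/* Hypothetical model; mirrors the kernel's MAX_SOFTIRQ_RESTART of 10. */
#define MODEL_MAX_RESTART 10

struct drain_result {
	int handled;   /* items run in the back-of-interrupt loop */
	int stranded;  /* items handed to a ksoftirqd that does not exist yet */
	int deferred;  /* items handed to a ksoftirqd that does exist */
};

/*
 * Simplified model of the restart loop.  When keep_inline_until_spawned
 * is set, a missing ksoftirqd forces processing to stay inline, which is
 * the behavior proposed in the text above.
 */
static struct drain_result model_drain(int pending, bool ksoftirqd_spawned,
				       bool keep_inline_until_spawned)
{
	struct drain_result r = { 0, 0, 0 };
	int max_restart = MODEL_MAX_RESTART;

	while (pending > 0) {
		pending--;		/* one pass handles one item (assumption) */
		r.handled++;
		if (pending == 0)
			break;
		if (keep_inline_until_spawned && !ksoftirqd_spawned)
			continue;	/* nobody to hand off to, keep going */
		if (--max_restart)
			continue;	/* budget left, restart inline */
		/* Budget exhausted: hand the remainder to ksoftirqd. */
		if (ksoftirqd_spawned)
			r.deferred = pending;
		else
			r.stranded = pending;
		break;
	}
	return r;
}
```

In this model, model_drain(25, false, false) leaves 15 items stranded,
which corresponds to the silent hang, while model_drain(25, false, true)
handles all 25 inline at the cost of a longer back-of-interrupt run.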

For definiteness, here is the first part of the change, posted earlier.
The commit log needs to be updated.  I will post the change that keeps
the tick going as a reply to this email.

							Thanx, Paul

------------------------------------------------------------------------

commit 4f659bf04fc4610523544493d6db92fc8670b086
Author: Paul E. McKenney <paulmck@...nel.org>
Date:   Fri Feb 12 16:20:40 2021 -0800

    softirq: Don't try waking ksoftirqd before it has been spawned
    
    If there is heavy softirq activity, the softirq system will attempt
    to awaken ksoftirqd and will stop the traditional back-of-interrupt
    softirq processing.  This is all well and good, but only if the
    ksoftirqd kthreads already exist, which is not the case during early
    boot, in which case the system hangs.
    
    One reproducer is as follows:
    
    tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 2 --configs "TREE03" --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y" --bootargs "threadirqs=1" --trust-make
    
    This commit therefore adds a couple of existence checks for ksoftirqd
    and forces back-of-interrupt softirq processing when ksoftirqd does not
    yet exist.  With this change, the above test passes.
    
    Reported-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
    Reported-by: Uladzislau Rezki <urezki@...il.com>
    Signed-off-by: Paul E. McKenney <paulmck@...nel.org>

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 9d71046..ba78e63 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -209,7 +209,7 @@ static inline void invoke_softirq(void)
 	if (ksoftirqd_running(local_softirq_pending()))
 		return;
 
-	if (!force_irqthreads) {
+	if (!force_irqthreads || !__this_cpu_read(ksoftirqd)) {
 #ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
 		/*
 		 * We can safely execute softirq on the current stack if
@@ -358,8 +358,8 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
 
 	pending = local_softirq_pending();
 	if (pending) {
-		if (time_before(jiffies, end) && !need_resched() &&
-		    --max_restart)
+		if (!__this_cpu_read(ksoftirqd) ||
+		    (time_before(jiffies, end) && !need_resched() && --max_restart))
 			goto restart;
 
 		wakeup_softirqd();
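
The two conditions changed by the hunks above can be sketched as plain
predicates in user-space C.  This is an illustration under stated
assumptions rather than the kernel implementation: the function names
are hypothetical, the boolean parameters stand in for force_irqthreads,
__this_cpu_read(ksoftirqd), time_before(jiffies, end), and
need_resched(), and the earlier ksoftirqd_running() check in
invoke_softirq() is left out.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the patched test in invoke_softirq(): run the
 * handlers at interrupt exit unless interrupt threading is forced AND a
 * ksoftirqd kthread exists to take the work.
 */
static bool model_run_inline_at_irq_exit(bool force_irqthreads,
					 bool ksoftirqd_exists)
{
	return !force_irqthreads || !ksoftirqd_exists;
}

/*
 * Hypothetical model of the patched restart test in __do_softirq().
 * Because || short-circuits, *max_restart is not decremented while
 * ksoftirqd is missing, so the restart budget is untouched until the
 * kthread has been spawned.
 */
static bool model_should_restart(bool ksoftirqd_exists, bool time_left,
				 bool need_resched, int *max_restart)
{
	return !ksoftirqd_exists ||
	       (time_left && !need_resched && --(*max_restart));
}
```

One consequence visible in the sketch is that, with no ksoftirqd, the
restart test always succeeds regardless of elapsed time or a pending
reschedule, which is the latency tradeoff accepted for early boot.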
