lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20230105004501.1771332-13-paulmck@kernel.org>
Date:   Wed,  4 Jan 2023 16:45:01 -0800
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     rcu@...r.kernel.org
Cc:     linux-kernel@...r.kernel.org, kernel-team@...a.com,
        rostedt@...dmis.org,
        "Joel Fernandes (Google)" <joel@...lfernandes.org>,
        Paul McKenney <paulmck@...nel.org>,
        Frederic Weisbecker <fweisbec@...il.com>,
        Zhouyi Zhou <zhouzhouyi@...il.com>, stable@...r.kernel.org,
        Davidlohr Bueso <dave@...olabs.net>
Subject: [PATCH rcu 7/7] torture: Fix hang during kthread shutdown phase

From: "Joel Fernandes (Google)" <joel@...lfernandes.org>

During rcutorture shutdown, the rcu_torture_cleanup() function calls
torture_cleanup_begin(), which sets the fullstop global variable to
FULLSTOP_RMMOD. This causes the rcutorture threads for readers and
fakewriters to exit all of their "while" loops and start shutting down.

They then call torture_kthread_stopping(), which in turn waits for
kthread_stop() to be called.  However, rcu_torture_cleanup() has
not yet called kthread_stop() on those threads, and before it gets a
chance to do so, multiple instances of torture_kthread_stopping() invoke
schedule_timeout_interruptible(1) in a tight loop.  Tracing confirms that
TIMER_SOFTIRQ can then continuously execute timer callbacks.  If that
TIMER_SOFTIRQ preempts the task executing rcu_torture_cleanup(), that
task might never invoke kthread_stop().

This commit improves this situation by increasing the timeout passed to
schedule_timeout_interruptible() from one jiffy to 1/20th of a second.
This change prevents TIMER_SOFTIRQ from monopolizing its CPU, thus
allowing rcu_torture_cleanup() to carry out the needed kthread_stop()
invocations.  Testing has shown 100 runs of TREE07 passing reliably,
as oppose to the tens-of-percent failure rates seen beforehand.

Cc: Paul McKenney <paulmck@...nel.org>
Cc: Frederic Weisbecker <fweisbec@...il.com>
Cc: Zhouyi Zhou <zhouzhouyi@...il.com>
Cc: <stable@...r.kernel.org> # 6.0.x
Signed-off-by: Joel Fernandes (Google) <joel@...lfernandes.org>
Tested-by: Zhouyi Zhou <zhouzhouyi@...il.com>
Reviewed-by: Davidlohr Bueso <dave@...olabs.net>
Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
---
 kernel/torture.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/torture.c b/kernel/torture.c
index 29afc62f2bfec..1a0519b836ac9 100644
--- a/kernel/torture.c
+++ b/kernel/torture.c
@@ -915,7 +915,7 @@ void torture_kthread_stopping(char *title)
 	VERBOSE_TOROUT_STRING(buf);
 	while (!kthread_should_stop()) {
 		torture_shutdown_absorb(title);
-		schedule_timeout_uninterruptible(1);
+		schedule_timeout_uninterruptible(HZ / 20);
 	}
 }
 EXPORT_SYMBOL_GPL(torture_kthread_stopping);
-- 
2.31.1.189.g2e36527f23

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ