linux-kernel - Re: amd iommu: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.

Message-ID: <de21f62b-c246-4ff7-9825-600fe19af28a@paulmck-laptop>
Date: Sun, 7 Dec 2025 20:57:02 -0800
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Borislav Petkov <bp@...en8.de>
Cc: iommu@...ts.linux.dev, Joerg Roedel <joro@...tes.org>,
	Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
	Will Deacon <will@...nel.org>, Robin Murphy <robin.murphy@....com>,
	linux-kernel@...r.kernel.org
Subject: Re: amd iommu: rcu: INFO: rcu_preempt detected expedited stalls on
 CPUs/tasks: { 0-.... } 8 jiffies s: 113 root: 0x1/.

On Thu, Dec 04, 2025 at 12:42:52PM -0800, Paul E. McKenney wrote:
> On Thu, Dec 04, 2025 at 03:45:05PM +0100, Borislav Petkov wrote:
> > On Wed, Dec 03, 2025 at 09:16:37AM -0800, Paul E. McKenney wrote:
> > > Or to some value that works for you.  But if you are not looking to be
> > > an expedited RCU CPU stall-warning pioneer, yes, setting it to zero is
> > > a good approach.
> > > 
> > > If you would like to be a more sane pioneer, setting it to (say) 11000
> > > (or 11 seconds) could be appropriate.  But what fun is sanity?  ;-)
> > 
> > Oh, I have a lot of excitement even without RCU experiments. :-P
> 
> ;-) ;-) ;-)
> 
> > But if you need me to try things, lemme know.
> > 
> > For now I've simply reset the values to what defconfig sets them to:
> > 
> > CONFIG_RCU_CPU_STALL_TIMEOUT=21
> > CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0
> 
> Sounds good, and thank you!

But this turned out to be too simple to fail to address (famous last
words!).  Please see below.

							Thanx, Paul

------------------------------------------------------------------------

commit 9776d62e236bef99859e9067f540aa6f6683b432
Author: Paul E. McKenney <paulmck@...nel.org>
Date:   Sun Dec 7 20:49:35 2025 -0800

    rcu: Make expedited RCU CPU stall warnings detect stall-end races
    
    If an expedited RCU CPU stall ends just at the stall-warning timeout,
    the current code will print an expedited stall-warning message, but one
    that doesn't identify any CPUs or tasks causing the stall.  This is most
    likely to happen for short-timeout stalls, for example, the 20-millisecond
    timeouts that are sometimes used for small embedded devices.  Needless to
    say, these semi-empty stall-warning messages can be rather confusing.
    
    One option would be to suppress the stall-warning message entirely in
    this case, but the near-miss information can be quite valuable.
    
    This commit therefore detects this race condition and emits an "INFO:
    Expedited stall ended before state dump start" message to clarify matters.
    
    Reported-by: Borislav Petkov <bp@...en8.de>
    Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
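
For readers without the kernel tree at hand, the decision the commit log
describes can be modeled in user space.  What follows is only a sketch
under stated assumptions, not the kernel code: struct node_model,
count_blockers() and stall_report() are hypothetical names invented for
this illustration, and the bitmask merely stands in for the per-node
expedited state that the real scan walks.  The one idea it borrows from
the patch is that a scan finding zero blockers means the stall ended
before the dump could start.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for one node's expedited state (not struct rcu_node). */
struct node_model {
	unsigned long expmask;	/* one bit per CPU still blocking the grace period */
};

/* Count CPUs still blocking; plays the role of "ndetected" in the patch. */
static int count_blockers(const struct node_model *nodes, size_t n)
{
	int ndetected = 0;

	for (size_t i = 0; i < n; i++)
		for (unsigned long m = nodes[i].expmask; m; m &= m - 1)
			ndetected++;	/* each pass clears the lowest set bit */
	return ndetected;
}

/*
 * Pick the follow-up message.  Returns 0 if the stall ended before the
 * scan (the race the commit detects), 1 if blockers remain and a state
 * dump would follow.
 */
static int stall_report(const struct node_model *nodes, size_t n)
{
	if (!count_blockers(nodes, n)) {
		printf("INFO: Expedited stall ended before state dump start\n");
		return 0;
	}
	printf("blocking rcu_node structures (internal RCU debug):\n");
	return 1;
}
```

Calling stall_report() on an array whose masks are all zero takes the
new INFO branch, while any nonzero mask takes the existing dump branch.
Reporting the near miss rather than staying silent mirrors the choice
made in the commit log above.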

diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 6058a734090c1..fd02cd12b7980 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -589,7 +589,12 @@ static void synchronize_rcu_expedited_stall(unsigned long jiffies_start, unsigne
 	pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n",
 		j - jiffies_start, rcu_state.expedited_sequence, data_race(rnp_root->expmask),
 		".T"[!!data_race(rnp_root->exp_tasks)]);
-	if (ndetected) {
+	if (!ndetected) {
+		// This is invoked from the grace-period worker, so
+		// a new grace period cannot have started.  And if this
+		// worker were stalled, we would not get here.  ;-)
+		pr_err("INFO: Expedited stall ended before state dump start\n");
+	} else {
 		pr_err("blocking rcu_node structures (internal RCU debug):");
 		rcu_for_each_node_breadth_first(rnp) {
 			if (rnp == rnp_root)