Message-ID: <20250802154645.52712449@gandalf.local.home>
Date: Sat, 2 Aug 2025 15:46:45 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org, Peter
 Zijlstra <peterz@...radead.org>, Thomas Gleixner <tglx@...utronix.de>, Juri
 Lelli <juri.lelli@...hat.com>, Vincent Guittot
 <vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>,
 Mel Gorman <mgorman@...e.de>, Tejun Heo <tj@...nel.org>, Valentin Schneider
 <vschneid@...hat.com>, Shrikanth Hegde <sshegde@...ux.ibm.com>
Subject: Re: [GIT PULL] Scheduler updates for v6.17

On Sat, 2 Aug 2025 11:43:40 -0700
Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> I'm not seeing why that would matter, since the seq count should
> become even at some point, but it does mean that the seqcount read
> loop looks like it's an endless kernel loop when it triggers. I don't
> see how that would make a difference, since the seqcount should become
> even on the writer side and the writers shouldn't be preempted and get
> some kind of priority inversion with a reader that doesn't go away,
> but *if* there is some bug in this area, maybe that config is why I'm
> seeing it and others aren't?
> 
> Any ideas, people?

You could try enabling the function tracer and applying the patch below,
which stops the trace when the soft lockup is detected, to see where it
happened.

 # echo function > /sys/kernel/tracing/current_tracer
 # echo 1 > /sys/kernel/tracing/tracing_on

After it happens you can take a look at:

 # cat /sys/kernel/tracing/trace

where the trace will have stopped at the soft lockup. Note that the function
tracer fills the buffer quickly, so it may hold only a fraction of a second
worth of data. It will therefore not cover the point where the task locked
up, but it may give you an idea of what is keeping it from getting out of
the seqcount read loop.
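
If the buffer wraps too quickly to be useful, one option is to enlarge the
per-CPU ring buffer before starting the tracer. A minimal sketch, assuming
tracefs is mounted at /sys/kernel/tracing; the TRACEFS and BUF_KB variables
and the setup_trace function are illustrative names, not something from the
patch, and 51200 is an arbitrary example size:

```shell
#!/bin/sh
# Sketch only: grow the ftrace ring buffer, then start the function tracer.
# TRACEFS and BUF_KB are hypothetical knobs introduced for this example.
TRACEFS="${TRACEFS:-/sys/kernel/tracing}"
BUF_KB="${BUF_KB:-51200}"   # size of each CPU's buffer in KiB (example value)

setup_trace() {
    # buffer_size_kb sets the size of every per-CPU buffer
    echo "$BUF_KB" > "$TRACEFS/buffer_size_kb" || return 1
    # same two steps as the commands above
    echo function > "$TRACEFS/current_tracer" || return 1
    echo 1 > "$TRACEFS/tracing_on" || return 1
}
```

Source the file and run setup_trace as root. A larger buffer costs memory on
every CPU, so the size is a trade-off between history and footprint.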

Note that the function tracer has a noticeable impact on performance, but
that may also open up the race window even wider.

-- Steve

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 80b56c002c7f..7ac934efd8af 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -795,6 +795,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
 #ifdef CONFIG_SYSFS
 		++softlockup_count;
 #endif
+		trace_printk("SOFT LOCK UP DETECTED\n");
+		tracing_off();
 
 		/*
 		 * Prevent multiple soft-lockup reports if one cpu is already
