Message-ID: <503bc4c9-a022-4802-bad8-c8fb82f01dc3@quicinc.com>
Date: Tue, 14 Jan 2025 18:01:27 +0800
From: "Aiqun(Maria) Yu" <quic_aiquny@...cinc.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: Ingo Molnar <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann
<dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Ben Segall
<bsegall@...gle.com>,
Mel Gorman <mgorman@...e.de>,
Valentin Schneider
<vschneid@...hat.com>,
Ingo Molnar <mingo@...nel.org>, <linux-kernel@...r.kernel.org>,
<kernel@...cinc.com>
Subject: Re: [PATCH] sched: Use printk_deferred_once() instead of WARN_ONCE()
On 1/13/2025 9:16 PM, Peter Zijlstra wrote:
> On Mon, Jan 13, 2025 at 02:12:43PM +0100, Peter Zijlstra wrote:
>> On Fri, Dec 27, 2024 at 05:27:10PM +0800, Maria Yu wrote:
>>> A deadlock is observed when WARN_ONCE() uses printk() inside the scheduler
>>> logic. printk_deferred_once() is a WARN_ONCE() similar special printk
>>> facility for the scheduler to avoid unnecessary deadlocks.
>>
>> No, problem is with printk. Using delayed will make it so that you'll
>> never see the output if it dies.
printk_deferred() defers only the console output; the message itself is
still stored in the printk log buffer immediately.
One side effect is that, in a corner case, fewer messages reach the
console if the kernel crashes before the first console_unlock() is
called. However, with printk_deferred() the entire sched warning is
still in the printk log buffer, and inspecting the log buffer is a
common way to find the root cause of a crash. Therefore, in my opinion,
in scenarios where printk with direct console output is likely to
crash, printk_deferred() is more useful, since the complete log can
still be captured from the log buffer.
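To illustrate the distinction above, here is a minimal userspace sketch,
not kernel code. It assumes a toy fixed-size log buffer, and every name
in it (model_printk_deferred(), model_console_flush(), log_buf) is
hypothetical and only models the idea that the record is stored at call
time while console output happens later:

```c
#include <assert.h>
#include <stdio.h>

#define LOG_SLOTS   8
#define LOG_MSG_LEN 64

/* Toy stand-in for the printk log buffer (hypothetical, not kernel code). */
static char log_buf[LOG_SLOTS][LOG_MSG_LEN];
static int log_count;      /* records stored in the buffer */
static int console_count;  /* records already written to the console */

/* Store the record right away; do not touch the console. */
static void model_printk_deferred(const char *msg)
{
    if (log_count >= LOG_SLOTS)
        return; /* toy buffer is full; a real ring buffer would wrap */
    snprintf(log_buf[log_count], LOG_MSG_LEN, "%s", msg);
    log_count++;
}

/* Later, from a safe context: write pending records to the console. */
static int model_console_flush(void)
{
    int flushed = 0;

    assert(console_count <= log_count);
    while (console_count < log_count) {
        printf("console: %s\n", log_buf[console_count]);
        console_count++;
        flushed++;
    }
    return flushed; /* number of records printed by this call */
}
```

If the model "crashes" between model_printk_deferred() and
model_console_flush(), log_count is already ahead of console_count, so
the record is missing from the console but still present in log_buf.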
>>
>> Only use delayed for printk()s that are expected and non fatal.
>
> Also, if you trip WARN in scheduler, correct thing to do is fix WARN,
> not make WARN run 'better'. No WARN, no problem.
I agree that the real issue that triggered SCHED_WARN_ON() should be
addressed. The actual sched logic problem is also being handled
separately. However, it is still useful to use the correct method to
print the information needed for debugging. Therefore, this patch aims
to fix the debug method of SCHED_WARN_ON(). We've encountered several
crashes that exhibited the same kind of deadlock, each triggered by a
different SCHED_WARN_ON().
Anyway, we don't want SCHED_WARN_ON() to sometimes be a warning in the
log and at other times behave like a BUG_ON() that crashes the device.
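A rough userspace sketch of that failure mode follows. It rests on an
assumption this message does not spell out: that the immediate console
path can need a lock the scheduler path already holds. All names
(toy_trylock(), warn_immediate(), warn_deferred(), sched_path()) are
hypothetical, and the model returns -1 where a real non-recursive
spinlock would spin forever:

```c
#include <stdbool.h>

static bool toy_lock_held;   /* stand-in for a non-recursive scheduler lock */
static int pending_records;  /* records stored but not yet printed */

/* Returns false where a real spinlock would spin on itself forever. */
static bool toy_trylock(void)
{
    if (toy_lock_held)
        return false;
    toy_lock_held = true;
    return true;
}

static void toy_unlock(void)
{
    toy_lock_held = false;
}

/* Immediate path: assumed to need the same lock while printing. */
static int warn_immediate(void)
{
    if (!toy_trylock())
        return -1; /* would deadlock in the real, blocking case */
    toy_unlock();
    return 0;
}

/* Deferred path: only stores the record and takes no lock. */
static int warn_deferred(void)
{
    pending_records++;
    return 0;
}

/* A scheduler-like path that emits a warning while holding the lock. */
static int sched_path(int (*warn)(void))
{
    int ret;

    if (!toy_trylock())
        return -1;
    ret = warn();
    toy_unlock();
    return ret;
}
```

In this model sched_path(warn_immediate) reports the would-be deadlock,
while sched_path(warn_deferred) completes and leaves one pending record
to be printed later.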
--
Thx and BRs,
Aiqun(Maria) Yu