Message-ID: <54cda6c8-38b5-5a98-5296-df40369889b7@google.com>
Date: Tue, 7 Jan 2025 12:13:10 -0800 (PST)
From: David Rientjes <rientjes@...gle.com>
To: 99090633-b625-ff07-fcf8-500d71f9ae13@...gle.com
cc: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>, 
    Juri Lelli <juri.lelli@...hat.com>, 
    Vincent Guittot <vincent.guittot@...aro.org>, 
    Dietmar Eggemann <dietmar.eggemann@....com>, 
    Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, 
    Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>, 
    linux-kernel@...r.kernel.org, 
    Madadi Vineeth Reddy <vineethr@...ux.ibm.com>
Subject: Re: [patch 1/2] sched/debug: Change need_resched warnings to
 pr_err

On Wed, 8 Jan 2025, Madadi Vineeth Reddy wrote:

> Hi David Rientjes,
> 
> On 07/01/25 02:09, David Rientjes wrote:
> > need_resched warnings, if enabled, are treated as WARNINGs.  If
> > kernel.panic_on_warn is enabled, then this causes a kernel panic.
> > 
> > It's highly unlikely that a panic is desired for these warnings, only a
> > stack trace is normally required to debug and resolve.
> > 
> > Thus, switch need_resched warnings to simply be a printk with an
> > associated stack trace so they are no longer in scope for panic_on_warn.
> > 
> > Signed-off-by: David Rientjes <rientjes@...gle.com>
> > ---
> >  kernel/sched/debug.c | 10 ++++++----
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> > 
> > diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
> > --- a/kernel/sched/debug.c
> > +++ b/kernel/sched/debug.c
> > @@ -1295,8 +1295,10 @@ void resched_latency_warn(int cpu, u64 latency)
> >  {
> >  	static DEFINE_RATELIMIT_STATE(latency_check_ratelimit, 60 * 60 * HZ, 1);
> >  
> > -	WARN(__ratelimit(&latency_check_ratelimit),
> > -	     "sched: CPU %d need_resched set for > %llu ns (%d ticks) "
> > -	     "without schedule\n",
> > -	     cpu, latency, cpu_rq(cpu)->ticks_without_resched);
> > +	if (likely(!__ratelimit(&latency_check_ratelimit)))
> > +		return;
> > +
> > +	pr_err("sched: CPU %d need_resched set for > %llu ns (%d ticks) without schedule\n",
> > +	       cpu, latency, cpu_rq(cpu)->ticks_without_resched);
> 
> LGTM. While this is an issue, it doesn't necessarily indicate a critical failure that would
> require the kernel to panic.
> 
> Nit: Would using pr_warn instead be too lenient in this case?
> 

Thanks!  I pondered the log level here for about five seconds; I'm
indifferent to pr_err() or pr_warn() :)  Since the stack trace is the most
critical element of the output here, imo, and it has its own log level, I
didn't feel strongly about either err or warn.

> Reviewed-by: Madadi Vineeth Reddy <vineethr@...ux.ibm.com>
> 
> Thanks,
> Madadi Vineeth Reddy
> 
> > +	dump_stack();
> >  }
> 
>
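To illustrate the behavioural difference the patch is after, here is a
minimal userspace sketch. It is not kernel code: model_ratelimit,
model_ratelimit_ok(), report_with_warn(), report_with_pr_err() and the
outcome enum are all hypothetical stand-ins, and the rate limiter is a
simplified window counter rather than the kernel's __ratelimit(). It only
models two things from the thread: the report is limited to one message per
interval (60 * 60 * HZ with a burst of 1 in the patch), and a WARN-style
report escalates to a panic when panic_on_warn is set while the
pr_err() + dump_stack() path does not.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
enum outcome { OUTCOME_SILENT, OUTCOME_LOGGED, OUTCOME_PANIC };

struct model_ratelimit {
	long interval;	/* window length, in ticks */
	int burst;	/* messages allowed per window */
	bool started;	/* has a window been opened yet? */
	long begin;	/* tick at which the current window started */
	int printed;	/* messages emitted in the current window */
};

/* Simplified limiter: returns true if a message may be emitted at 'now'. */
static bool model_ratelimit_ok(struct model_ratelimit *rs, long now)
{
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->started = true;
		rs->begin = now;
		rs->printed = 0;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	return false;
}

/* Old behaviour (modelled): a WARN-style report is in scope for panic_on_warn. */
static enum outcome report_with_warn(struct model_ratelimit *rs, long now,
				     bool panic_on_warn)
{
	if (!model_ratelimit_ok(rs, now))
		return OUTCOME_SILENT;

	fprintf(stderr, "WARNING: need_resched set without schedule (model)\n");
	return panic_on_warn ? OUTCOME_PANIC : OUTCOME_LOGGED;
}

/* New behaviour (modelled): error message plus stack dump, never a panic. */
static enum outcome report_with_pr_err(struct model_ratelimit *rs, long now,
				       bool panic_on_warn)
{
	(void)panic_on_warn;	/* deliberately ignored on this path */

	if (!model_ratelimit_ok(rs, now))
		return OUTCOME_SILENT;

	fprintf(stderr, "sched: need_resched set without schedule (model)\n");
	fprintf(stderr, "[stack trace would be dumped here]\n");
	return OUTCOME_LOGGED;
}
```

With panic_on_warn modelled as true, the first call to report_with_warn()
yields OUTCOME_PANIC, whereas report_with_pr_err() yields OUTCOME_LOGGED and
then stays silent until the interval has elapsed.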