Message-ID: <0779b400-b99e-4fa2-8b18-de06fb4e77cc@kernel.dk>
Date: Wed, 21 May 2025 07:05:09 -0600
From: Jens Axboe <axboe@...nel.dk>
To: paulmck@...nel.org
Cc: John Ogness <john.ogness@...utronix.de>,
 LKML <linux-kernel@...r.kernel.org>, Petr Mladek <pmladek@...e.com>,
 Steven Rostedt <rostedt@...dmis.org>,
 "senozhatsky@...omium.org" <senozhatsky@...omium.org>
Subject: Re: printk NMI splat on boot

On 5/21/25 12:06 AM, Paul E. McKenney wrote:
> On Tue, May 20, 2025 at 02:41:40PM -0600, Jens Axboe wrote:
>> On 5/20/25 2:18 PM, Jens Axboe wrote:
>>>> What values are you using for CONFIG_RCU_EXP_CPU_STALL_TIMEOUT and
>>>> CONFIG_RCU_CPU_STALL_TIMEOUT?
>>>
>>> CONFIG_RCU_CPU_STALL_TIMEOUT=21
>>> CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=2
>>
>> This was =20 btw, guess it could cut a bit too much...
> 
> Just confirming that setting CONFIG_RCU_EXP_CPU_STALL_TIMEOUT to two
> milliseconds is more than a bit on the aggressive side.  ;-)

Sorry, I guess I wasn't clear: I had pasted in =2, but the setting in
my config was =20.

> Setting it to 20 milliseconds is OK for smartphone-class devices, but
> to the best of my knowledge, setting it less than 21 seconds (as in
> 21,000 milliseconds) has not been tested on any other platform.
> 
>> Changed them to:
>>
>> CONFIG_RCU_CPU_STALL_TIMEOUT=100
>> CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0
>>
>> and complaining is gone.
> 
> This makes it take the default, which in this case would be the specified
> CONFIG_RCU_CPU_STALL_TIMEOUT value of 100 seconds.  Which is an unusually
> long timeout -- mainline these days is 21 seconds and some distros still
> use the old value of 60 seconds.

IMHO the settings for these are very odd, which I guess is fine for
debugging-type infrastructure, but they're fairly nonsensical in any
case. Not really that important, though. It looks like
RCU_EXP_CPU_STALL_TIMEOUT has a default of '0', so I'm not sure how on
earth I ended up with 20 in that one. Most likely from not reading the
help entry and hence setting it similarly to RCU_CPU_STALL_TIMEOUT.
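
For reference, a minimal sketch of the fallback described in this
thread, assuming (as the quoted text states) that
RCU_CPU_STALL_TIMEOUT is in seconds, RCU_EXP_CPU_STALL_TIMEOUT is in
milliseconds, and an expedited value of 0 means "use the normal
timeout". The helper name is hypothetical and this is not the kernel's
actual code, which may also apply limits not shown here:

```c
#include <assert.h>

/*
 * Hypothetical helper, NOT kernel code: illustrates how the two Kconfig
 * values combine according to the discussion above.
 *
 *   stall_timeout_s      - CONFIG_RCU_CPU_STALL_TIMEOUT, in seconds
 *   exp_stall_timeout_ms - CONFIG_RCU_EXP_CPU_STALL_TIMEOUT, in milliseconds
 *
 * Returns the effective expedited stall timeout in milliseconds.
 */
static long effective_exp_stall_timeout_ms(long stall_timeout_s,
                                           long exp_stall_timeout_ms)
{
    /* This sketch only handles non-negative settings. */
    assert(stall_timeout_s >= 0 && exp_stall_timeout_ms >= 0);

    /* 0 means "take the default": the normal timeout, converted to ms. */
    if (exp_stall_timeout_ms == 0)
        return stall_timeout_s * 1000L;

    /* Otherwise the expedited value is used as given, in milliseconds. */
    return exp_stall_timeout_ms;
}
```

Under those assumptions, =100/=0 gives an expedited timeout of 100000
ms, while =21/=20 gives 20 ms, which is why copying a seconds-style
value into the expedited option ends up so aggressive.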

-- 
Jens Axboe
