Message-ID: <28d0ea70-db7a-40e7-aac9-86808320f252@pengutronix.de>
Date: Wed, 5 Mar 2025 12:39:27 +0100
From: Ahmad Fatoum <a.fatoum@...gutronix.de>
To: George Cherian <gcherian@...vell.com>,
 "linux@...ck-us.net" <linux@...ck-us.net>,
 "wim@...ux-watchdog.org" <wim@...ux-watchdog.org>,
 "jwerner@...omium.org" <jwerner@...omium.org>,
 "evanbenn@...omium.org" <evanbenn@...omium.org>,
 "kabel@...nel.org" <kabel@...nel.org>, "krzk@...nel.org" <krzk@...nel.org>,
 "mazziesaccount@...il.com" <mazziesaccount@...il.com>,
 "thomas.richard@...tlin.com" <thomas.richard@...tlin.com>,
 "lma@...omium.org" <lma@...omium.org>,
 "bleung@...omium.org" <bleung@...omium.org>,
 "support.opensource@...semi.com" <support.opensource@...semi.com>,
 "shawnguo@...nel.org" <shawnguo@...nel.org>,
 "s.hauer@...gutronix.de" <s.hauer@...gutronix.de>,
 "kernel@...gutronix.de" <kernel@...gutronix.de>,
 "festevam@...il.com" <festevam@...il.com>, "andy@...nel.org"
 <andy@...nel.org>, "paul@...pouillou.net" <paul@...pouillou.net>,
 "alexander.usyskin@...el.com" <alexander.usyskin@...el.com>,
 "andreas.werner@....de" <andreas.werner@....de>,
 "daniel@...ngy.jp" <daniel@...ngy.jp>,
 "romain.perier@...il.com" <romain.perier@...il.com>,
 "avifishman70@...il.com" <avifishman70@...il.com>,
 "tmaimon77@...il.com" <tmaimon77@...il.com>,
 "tali.perry1@...il.com" <tali.perry1@...il.com>,
 "venture@...gle.com" <venture@...gle.com>,
 "yuenn@...gle.com" <yuenn@...gle.com>,
 "benjaminfair@...gle.com" <benjaminfair@...gle.com>,
 "maddy@...ux.ibm.com" <maddy@...ux.ibm.com>,
 "mpe@...erman.id.au" <mpe@...erman.id.au>,
 "npiggin@...il.com" <npiggin@...il.com>,
 "christophe.leroy@...roup.eu" <christophe.leroy@...roup.eu>,
 "naveen@...nel.org" <naveen@...nel.org>,
 "mwalle@...nel.org" <mwalle@...nel.org>,
 "xingyu.wu@...rfivetech.com" <xingyu.wu@...rfivetech.com>,
 "ziv.xu@...rfivetech.com" <ziv.xu@...rfivetech.com>,
 "hayashi.kunihiko@...ionext.com" <hayashi.kunihiko@...ionext.com>,
 "mhiramat@...nel.org" <mhiramat@...nel.org>
Cc: "chrome-platform@...ts.linux.dev" <chrome-platform@...ts.linux.dev>,
 "linux-watchdog@...r.kernel.org" <linux-watchdog@...r.kernel.org>,
 "imx@...ts.linux.dev" <imx@...ts.linux.dev>,
 "patches@...nsource.cirrus.com" <patches@...nsource.cirrus.com>,
 "openbmc@...ts.ozlabs.org" <openbmc@...ts.ozlabs.org>,
 "linux-mips@...r.kernel.org" <linux-mips@...r.kernel.org>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 "linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
 "linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org>
Subject: Re: [EXTERNAL] Re: [PATCH v4 0/2] Add stop_on_panic support for
 watchdog

Hi George,

On 05.03.25 12:28, George Cherian wrote:
> Hi Ahmad,
>> Hi George,
>> On 05.03.25 11:10, George Cherian wrote:
>>> This series adds a new kernel command line option to the watchdog core to
>>> stop the watchdog on panic. This is useful on certain systems where a
>>> watchdog reset prevents the kdump kernel from loading successfully.
>>>
>>> The stop function of some watchdog drivers may sleep. For such
>>> drivers stop_on_panic is not valid, as the notifier callback runs
>>> in atomic context. Introduce a WDIOF_STOP_MAYSLEEP flag in the watchdog_info
>>> options to indicate whether the stop function may sleep.
>>
>> Did you consider having a reset_on_panic instead, which sets a user-specified
>> timeout on panic? This would make the mechanism useful also for watchdogs
> 
> /proc/sys/kernel/panic already provides that support. You may echo a non-zero value
> and the system attempts a soft reboot after that many seconds. But this doesn't happen
> when a kdump kernel is loaded after a panic.

The timeout specified to the watchdog reset_on_panic option would be programmed into
the active watchdogs; it would not be used to trigger a software-induced reboot.
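
A minimal user-space sketch of that idea follows. It is a model only, not
kernel code: struct wdt_model and wdt_model_reset_on_panic() are hypothetical
names, and clamping the requested timeout to the hardware maximum is an
assumption about how such an option might behave.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a registered watchdog device. */
struct wdt_model {
	int active;                 /* nonzero if the watchdog is running */
	unsigned int timeout_s;     /* currently programmed timeout, seconds */
	unsigned int max_timeout_s; /* longest timeout the hardware supports */
};

/*
 * Model of a reset_on_panic policy: instead of stopping the watchdogs,
 * program the requested timeout into every active one, clamped to what
 * the hardware supports (the clamping is an assumption). Inactive
 * watchdogs are left untouched.
 * Returns the number of watchdogs that were reprogrammed.
 */
static size_t wdt_model_reset_on_panic(struct wdt_model *wdts, size_t n,
				       unsigned int panic_timeout_s)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		unsigned int t = panic_timeout_s;

		if (!wdts[i].active)
			continue;
		if (t > wdts[i].max_timeout_s)
			t = wdts[i].max_timeout_s;
		wdts[i].timeout_s = t;
		count++;
	}
	return count;
}
```

In this model a watchdog that cannot be stopped is still covered, because
it is only reprogrammed and never disabled. A real implementation would
also have to run from the panic notifier in atomic context.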

>> that can't be disabled and would protect against system lock up: 
>> Consider a memory-corruption bug (perhaps externally via DMA), which partially
>> overwrites both main and kdump kernel. With a disabled watchdog, the system
>> may not be able to recover on its own.
> 
> Yes, that is why the kernel command-line option is optional and defaults to zero,
> so that the watchdog still kicks in if the kdump kernel is corrupted.

The existing option isn't enough for the kdump kernel use case.
If we (i.e. you) are going to do something about it, wouldn't it be
better to have a solution that's applicable to a wider number of
watchdog devices?

>> If you did consider it, what made you decide against it?
> watchdog.stop_on_panic=1 is specifically for systems which can't boot a kdump kernel because
> the kdump kernel gets a watchdog reset while booting, possibly due to a short watchdog timeout.
> For example: a 32-bit watchdog down counter running at 1 GHz.
> reset_on_panic can guarantee only the largest watchdog timeout supported by the HW,
> since there is no one to ping the watchdog.

If you are serious about watchdog use, you'll want to use the watchdog to
monitor kernel startup as well. If the bootloader can set a watchdog timeout
just before starting the kernel and it doesn't expire before the kernel watchdog
driver takes over, why can't we do the same just before starting the dump kernel?
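
For scale, the 32-bit down counter at 1 GHz from the example above runs out
after about 4.29 seconds (2^32 ticks / 10^9 ticks per second). A small sketch
of that arithmetic, where wdt_max_timeout_ms() is a hypothetical helper and
not a watchdog core function:

```c
#include <stdint.h>

/*
 * Longest interval, in whole milliseconds, that a down counter of `bits`
 * width clocked at `freq_hz` can cover before reaching zero.
 * Hypothetical helper for illustration only. Widths above 32 bits are
 * rejected so the multiplication below cannot overflow 64 bits.
 */
static uint64_t wdt_max_timeout_ms(unsigned int bits, uint64_t freq_hz)
{
	uint64_t max_ticks;

	if (bits == 0 || bits > 32 || freq_hz == 0)
		return 0;

	max_ticks = (UINT64_C(1) << bits) - 1;	/* largest loadable value */
	return max_ticks * 1000 / freq_hz;	/* truncates to whole ms */
}
```

With these inputs, wdt_max_timeout_ms(32, 1000000000) yields 4294, which is
the window the dump kernel would have to reach its own watchdog driver if
nothing pings the watchdog in between.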

Thanks,
Ahmad

 
>>
>> Thanks,
>> Ahmad
>>
>>>
>>>
>> Changelog:
>> v1 -> v2
>> - Remove the per driver flag setting option
>> - Take the parameter via kernel command-line parameter to watchdog_core.
>>
>> v2 -> v3
>> - Remove the helper function watchdog_stop_on_panic() from watchdog.h.
>> - There are no users for this. 
>>
>> v3 -> v4
>> - Since the panic notifier is in atomic context, watchdog functions
>>   which sleep can't be called. 
>> - Add an options flag WDIOF_STOP_MAYSLEEP to indicate whether stop
>>   function sleeps.
>> - Simplify the stop_on_panic kernel command line parsing.
>> - Enable the panic notifier only if the watchdog stop function doesn't
>>   sleep
>>
>> George Cherian (2):
>>   watchdog: Add a new flag WDIOF_STOP_MAYSLEEP
>>   drivers: watchdog: Add support for panic notifier callback
> 
> - George


-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |
