Message-ID:
<PH8PR18MB53817EC09B918852B78DF3AAC5CB2@PH8PR18MB5381.namprd18.prod.outlook.com>
Date: Wed, 5 Mar 2025 11:28:40 +0000
From: George Cherian <gcherian@...vell.com>
To: Ahmad Fatoum <a.fatoum@...gutronix.de>,
"linux@...ck-us.net"
<linux@...ck-us.net>,
"wim@...ux-watchdog.org" <wim@...ux-watchdog.org>,
"jwerner@...omium.org" <jwerner@...omium.org>,
"evanbenn@...omium.org"
<evanbenn@...omium.org>,
"kabel@...nel.org" <kabel@...nel.org>,
"krzk@...nel.org" <krzk@...nel.org>,
"mazziesaccount@...il.com"
<mazziesaccount@...il.com>,
"thomas.richard@...tlin.com"
<thomas.richard@...tlin.com>,
"lma@...omium.org" <lma@...omium.org>,
"bleung@...omium.org" <bleung@...omium.org>,
"support.opensource@...semi.com"
<support.opensource@...semi.com>,
"shawnguo@...nel.org"
<shawnguo@...nel.org>,
"s.hauer@...gutronix.de" <s.hauer@...gutronix.de>,
"kernel@...gutronix.de" <kernel@...gutronix.de>,
"festevam@...il.com"
<festevam@...il.com>,
"andy@...nel.org" <andy@...nel.org>,
"paul@...pouillou.net" <paul@...pouillou.net>,
"alexander.usyskin@...el.com"
<alexander.usyskin@...el.com>,
"andreas.werner@....de"
<andreas.werner@....de>,
"daniel@...ngy.jp" <daniel@...ngy.jp>,
"romain.perier@...il.com" <romain.perier@...il.com>,
"avifishman70@...il.com"
<avifishman70@...il.com>,
"tmaimon77@...il.com" <tmaimon77@...il.com>,
"tali.perry1@...il.com" <tali.perry1@...il.com>,
"venture@...gle.com"
<venture@...gle.com>,
"yuenn@...gle.com" <yuenn@...gle.com>,
"benjaminfair@...gle.com" <benjaminfair@...gle.com>,
"maddy@...ux.ibm.com"
<maddy@...ux.ibm.com>,
"mpe@...erman.id.au" <mpe@...erman.id.au>,
"npiggin@...il.com" <npiggin@...il.com>,
"christophe.leroy@...roup.eu"
<christophe.leroy@...roup.eu>,
"naveen@...nel.org" <naveen@...nel.org>,
"mwalle@...nel.org" <mwalle@...nel.org>,
"xingyu.wu@...rfivetech.com"
<xingyu.wu@...rfivetech.com>,
"ziv.xu@...rfivetech.com"
<ziv.xu@...rfivetech.com>,
"hayashi.kunihiko@...ionext.com"
<hayashi.kunihiko@...ionext.com>,
"mhiramat@...nel.org" <mhiramat@...nel.org>
CC: "chrome-platform@...ts.linux.dev" <chrome-platform@...ts.linux.dev>,
"linux-watchdog@...r.kernel.org" <linux-watchdog@...r.kernel.org>,
"imx@...ts.linux.dev" <imx@...ts.linux.dev>,
"patches@...nsource.cirrus.com"
<patches@...nsource.cirrus.com>,
"openbmc@...ts.ozlabs.org"
<openbmc@...ts.ozlabs.org>,
"linux-mips@...r.kernel.org"
<linux-mips@...r.kernel.org>,
"linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>,
"linuxppc-dev@...ts.ozlabs.org"
<linuxppc-dev@...ts.ozlabs.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>
Subject: RE: [EXTERNAL] Re: [PATCH v4 0/2] Add stop_on_panic support for
watchdog
Hi Ahmad,
>Hi George,
> On 05.03.25 11:10, George Cherian wrote:
>> This series adds a new kernel command line option to watchdog core to
>> stop the watchdog on panic. This is useful on certain systems where a
>> watchdog reset prevents the kdump kernel from loading successfully.
>>
>> The stop function of some watchdog drivers may sleep. For such
>> drivers stop_on_panic is not valid, as the notifier callback runs
>> in atomic context. Introduce a WDIOF_STOP_MAYSLEEP flag in the watchdog_info
>> options to indicate whether the stop function may sleep.
>
>Did you consider having a reset_on_panic instead, which sets a user-specified
>timeout on panic? This would make the mechanism useful also for watchdogs
/proc/sys/kernel/panic already provides that support. You can echo a non-zero value
and the system attempts a soft reboot after that many seconds. But this doesn't happen
when a kdump kernel is loaded after the panic.
>that can't be disabled and would protect against system lock up:
>Consider a memory-corruption bug (perhaps externally via DMA), which partially
>overwrites both main and kdump kernel. With a disabled watchdog, the system
>may not be able to recover on its own.
Yes, that is why the kernel command-line option is optional and defaults to zero,
so that if the kdump kernel is corrupted, the watchdog still kicks in.
>
>If you did consider it, what made you decide against it?
watchdog.stop_on_panic=1 is specifically for systems which can't boot a kdump kernel because
the kdump kernel gets a watchdog reset while booting, maybe due to a short watchdog timeout.
For example, a 32-bit watchdog down counter running at 1 GHz.
reset_on_panic can guarantee only the largest watchdog timeout supported by the HW,
since there is no one to ping the watchdog.
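
To put a rough number on that example, here is a minimal sketch. It assumes the counter is loaded with its maximum value and decrements once per clock cycle; the function name is made up for illustration and is not part of the series.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Longest interval (in seconds) that a down counter of `bits` width can
 * cover when it ticks at `freq_hz`: 2^bits ticks divided by the tick rate.
 * Hypothetical helper, for illustration only.
 */
static double wdt_max_timeout_seconds(unsigned int bits, uint64_t freq_hz)
{
	assert(bits > 0 && bits < 64);
	assert(freq_hz > 0);

	return (double)(UINT64_C(1) << bits) / (double)freq_hz;
}
```

With bits = 32 and freq_hz = 1000000000 this comes to about 4.29 seconds, which leaves very little time for a kdump kernel to boot and start pinging the watchdog.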
>
>Thanks,
>Ahmad
>
>>
>>
> Changelog:
> v1 -> v2
> - Remove the per driver flag setting option
> - Take the parameter via kernel command-line parameter to watchdog_core.
>
> v2 -> v3
> - Remove the helper function watchdog_stop_on_panic() from watchdog.h.
> - There are no users for this.
>
> v3 -> v4
> - Since the panic notifier is in atomic context, watchdog functions
> which sleep can't be called.
> - Add an options flag WDIOF_STOP_MAYSLEEP to indicate whether stop
> function sleeps.
> - Simplify the stop_on_panic kernel command line parsing.
> - Enable the panic notifier only if the watchdog stop function doesn't
> sleep
>
> George Cherian (2):
> watchdog: Add a new flag WDIOF_STOP_MAYSLEEP
> drivers: watchdog: Add support for panic notifier callback
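
The v4 rule above (register the panic notifier only when stop cannot sleep) can be sketched in plain userspace C as below. This is not the code from the patches: the struct, the function, and the flag's bit value are placeholders chosen for illustration, and only the name WDIOF_STOP_MAYSLEEP comes from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit value, for illustration only; the real one is set by the patch. */
#define DEMO_WDIOF_STOP_MAYSLEEP 0x1u

/* Simplified stand-in for the kernel's struct watchdog_info. */
struct demo_watchdog_info {
	unsigned int options;
};

/*
 * Decide whether a panic notifier should be registered for a device.
 * The notifier runs in atomic context, so it is only usable when the
 * driver has not declared that its stop function may sleep.
 */
static bool demo_use_panic_notifier(bool stop_on_panic,
				    const struct demo_watchdog_info *info)
{
	assert(info != NULL);

	if (!stop_on_panic)
		return false;

	return (info->options & DEMO_WDIOF_STOP_MAYSLEEP) == 0;
}
```

A driver whose stop function may sleep would set the flag in its options and is then skipped, even with watchdog.stop_on_panic=1 on the command line.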
- George