[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ca040804-3b04-4852-af14-8be1d4becf3f@roeck-us.net>
Date: Wed, 26 Feb 2025 05:05:37 -0800
From: Guenter Roeck <linux@...ck-us.net>
To: George Cherian <gcherian@...vell.com>
Cc: "wim@...ux-watchdog.org" <wim@...ux-watchdog.org>,
"corbet@....net" <corbet@....net>,
"linux-watchdog@...r.kernel.org" <linux-watchdog@...r.kernel.org>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [EXTERNAL] Re: [PATCH v3] drivers: watchdog: Add support for
panic notifier callback
On 2/26/25 01:14, George Cherian wrote:
>
>
> ________________________________________
> From: Guenter Roeck <groeck7@...il.com> on behalf of Guenter Roeck <linux@...ck-us.net>
> Sent: Tuesday, February 25, 2025 21:34
> To: George Cherian
> Cc: wim@...ux-watchdog.org; corbet@....net; linux-watchdog@...r.kernel.org; linux-doc@...r.kernel.org; linux-kernel@...r.kernel.org
> Subject: [EXTERNAL] Re: [PATCH v3] drivers: watchdog: Add support for panic notifier callback
>
>
>>> On Tue, Feb 25, 2025 at 02:06:15PM +0000, George Cherian wrote:
>>> Watchdog is not turned off in kernel panic situation.
>>> In certain systems this might prevent the successful loading
>>> of kdump kernel. The kdump kernel might hit a watchdog reset
>>> while it is booting.
>>>
>>> To avoid such scenarios add a panic notifier call back function
>>> which can stop the watchdog. This provision can be enabled by
>>> passing watchdog.stop_on_panic=1 via kernel command-line parameter.
>>>
> v> Signed-off-by: George Cherian <george.cherian@...vell.com>
>>> ---
>>> Changelog:
>>> v1 -> v2
>>> - Remove the per driver flag setting option
>>> - Take the parameter via kernel command-line parameter to watchdog_core.
>>>
>>> v2 -> v3
>>> - Remove the helper function watchdog_stop_on_panic() from watchdog.h.
>>> - There are no users for this.
>>>
>>> drivers/watchdog/watchdog_core.c | 42 ++++++++++++++++++++++++++++++++
>>> include/linux/watchdog.h | 2 ++
>>> 2 files changed, 44 insertions(+)
>>>
>>> diff --git a/drivers/watchdog/watchdog_core.c b/drivers/watchdog/watchdog_core.c
>>> index d46d8c8c01f2..8cbebe38b7dd 100644
>>> --- a/drivers/watchdog/watchdog_core.c
>>> +++ b/drivers/watchdog/watchdog_core.c
>>> @@ -34,6 +34,7 @@
>>> #include <linux/idr.h> /* For ida_* macros */
>>> #include <linux/err.h> /* For IS_ERR macros */
>>> #include <linux/of.h> /* For of_get_timeout_sec */
>>> +#include <linux/panic_notifier.h> /* For panic handler */
>>> #include <linux/suspend.h>
>>>
>>> #include "watchdog_core.h" /* For watchdog_dev_register/... */
>>> @@ -47,6 +48,9 @@ static int stop_on_reboot = -1;
>>> module_param(stop_on_reboot, int, 0444);
>>> MODULE_PARM_DESC(stop_on_reboot, "Stop watchdogs on reboot (0=keep watching, 1=stop)");
>>>
>>> +static int stop_on_panic = -1;
>>> +module_param(stop_on_panic, int, 0444);
>
>> This can now be bool.
> Ack.
>>> +MODULE_PARM_DESC(stop_on_panic, "Stop watchdogs on panic (0=keep watching, 1=stop)");
>>> /*
>>> * Deferred Registration infrastructure.
>>> *
>>> @@ -155,6 +159,23 @@ int watchdog_init_timeout(struct watchdog_device *wdd,
>>> }
>>> EXPORT_SYMBOL_GPL(watchdog_init_timeout);
>>>
>>> +static int watchdog_panic_notify(struct notifier_block *nb,
>>> + unsigned long action, void *data)
>>> +{
>>> + struct watchdog_device *wdd;
>>> +
>>> + wdd = container_of(nb, struct watchdog_device, panic_nb);
>>> + if (watchdog_active(wdd)) {
>>> + int ret;
>>> +
>>> + ret = wdd->ops->stop(wdd);
>>> + if (ret)
>>> + return NOTIFY_BAD;
>>> + }
>>> +
>>> + return NOTIFY_DONE;
>>> +}
>>> +
>>> static int watchdog_reboot_notifier(struct notifier_block *nb,
>>> unsigned long code, void *data)
>>> {
>>> @@ -299,6 +320,14 @@ static int ___watchdog_register_device(struct watchdog_device *wdd)
>>> clear_bit(WDOG_STOP_ON_REBOOT, &wdd->status);
>>> }
>>>
>>> + /* Module parameter to force watchdog policy on panic. */
>>> + if (stop_on_panic != -1) {
>>> + if (stop_on_panic && !test_bit(WDOG_NO_WAY_OUT, &wdd->status))
>>> + set_bit(WDOG_STOP_ON_PANIC, &wdd->status);
>>> + else
>>> + clear_bit(WDOG_STOP_ON_PANIC, &wdd->status);
>>> + }
>>> +
>
>> No longer needed here. See below.
>>
> Ack Got it.
>>> if (test_bit(WDOG_STOP_ON_REBOOT, &wdd->status)) {
>>> if (!wdd->ops->stop)
>>> pr_warn("watchdog%d: stop_on_reboot not supported\n", wdd->id);
>>> @@ -334,6 +363,16 @@ static int ___watchdog_register_device(struct watchdog_device *wdd)
>>> wdd->id, ret);
>>> }
>>>
>>> + if (test_bit(WDOG_STOP_ON_PANIC, &wdd->status)) {
>>> + if (!wdd->ops->stop) {
>>> + pr_warn("watchdog%d: stop_on_panic not supported\n", wdd->id);
>>> + } else {
>>> + wdd->panic_nb.notifier_call = watchdog_panic_notify;
>>> + atomic_notifier_chain_register(&panic_notifier_list,
>>> + &wdd->panic_nb);
>>> + }
>>> + }
>>
>> Simplify to
>> if (stop_on_panic) {
>> if (!wdd->ops->stop) {
>> pr_warn("watchdog%d: stop_on_panic not supported\n", wdd->id);
>> } else {
>> wdd->panic_nb.notifier_call = watchdog_panic_notify;
>> atomic_notifier_chain_register(&panic_notifier_list,
>> &wdd->panic_nb);
>> set_bit(WDOG_STOP_ON_PANIC, &wdd->status);
>> }
>> }
> Okay will update to this.
>
>> This also fixes the bug where the unregistration function is called
>> even if the notifier was not actually registered.
>
>> One thing I just realized is that we'll have to figure out if atomic
>> notifiers can be used here unconditionally. Unless I am missing
>> something, watchdog stop functions can sleep. Of course, sleeping
>> while panic isn't a good idea. That means we _may_ need a driver
>> flag indicating either that the stop function can sleep or that it
>> won't. If we need that, I suggest we add WDIOF_STOP_MAYSLEEP or
>> similar to the watchdog_info options field.
>
> Yes, that is correct there are certain .stop implementations which can sleep.
> I will add a new WDIOF_STOP_MAYSLEEP flag and enable the drivers with
> this new flag. Only those drivers which have non-sleeping stop function will
> be able to have this feature.
>
> I hope this is what you are expecting.
Yes, that would be great. Please add the flag in a separate patch.
Thanks,
Guenter
Powered by blists - more mailing lists