[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <85d99af3-a3ee-41dc-96df-0b9903a6f516@roeck-us.net>
Date: Tue, 25 Feb 2025 08:04:44 -0800
From: Guenter Roeck <linux@...ck-us.net>
To: George Cherian <george.cherian@...vell.com>
Cc: wim@...ux-watchdog.org, corbet@....net, linux-watchdog@...r.kernel.org,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] drivers: watchdog: Add support for panic notifier
callback
On Tue, Feb 25, 2025 at 02:06:15PM +0000, George Cherian wrote:
> Watchdog is not turned off in kernel panic situation.
> In certain systems this might prevent the successful loading
> of kdump kernel. The kdump kernel might hit a watchdog reset
> while it is booting.
>
> To avoid such scenarios add a panic notifier call back function
> which can stop the watchdog. This provision can be enabled by
> passing watchdog.stop_on_panic=1 via kernel command-line parameter.
>
> Signed-off-by: George Cherian <george.cherian@...vell.com>
> ---
> Changelog:
> v1 -> v2
> - Remove the per driver flag setting option
> - Take the parameter via kernel command-line parameter to watchdog_core.
>
> v2 -> v3
> - Remove the helper function watchdog_stop_on_panic() from watchdog.h.
> - There are no users for this.
>
> drivers/watchdog/watchdog_core.c | 42 ++++++++++++++++++++++++++++++++
> include/linux/watchdog.h | 2 ++
> 2 files changed, 44 insertions(+)
>
> diff --git a/drivers/watchdog/watchdog_core.c b/drivers/watchdog/watchdog_core.c
> index d46d8c8c01f2..8cbebe38b7dd 100644
> --- a/drivers/watchdog/watchdog_core.c
> +++ b/drivers/watchdog/watchdog_core.c
> @@ -34,6 +34,7 @@
> #include <linux/idr.h> /* For ida_* macros */
> #include <linux/err.h> /* For IS_ERR macros */
> #include <linux/of.h> /* For of_get_timeout_sec */
> +#include <linux/panic_notifier.h> /* For panic handler */
> #include <linux/suspend.h>
>
> #include "watchdog_core.h" /* For watchdog_dev_register/... */
> @@ -47,6 +48,9 @@ static int stop_on_reboot = -1;
> module_param(stop_on_reboot, int, 0444);
> MODULE_PARM_DESC(stop_on_reboot, "Stop watchdogs on reboot (0=keep watching, 1=stop)");
>
> +static int stop_on_panic = -1;
> +module_param(stop_on_panic, int, 0444);
This can now be bool.
> +MODULE_PARM_DESC(stop_on_panic, "Stop watchdogs on panic (0=keep watching, 1=stop)");
> /*
> * Deferred Registration infrastructure.
> *
> @@ -155,6 +159,23 @@ int watchdog_init_timeout(struct watchdog_device *wdd,
> }
> EXPORT_SYMBOL_GPL(watchdog_init_timeout);
>
> +static int watchdog_panic_notify(struct notifier_block *nb,
> + unsigned long action, void *data)
> +{
> + struct watchdog_device *wdd;
> +
> + wdd = container_of(nb, struct watchdog_device, panic_nb);
> + if (watchdog_active(wdd)) {
> + int ret;
> +
> + ret = wdd->ops->stop(wdd);
> + if (ret)
> + return NOTIFY_BAD;
> + }
> +
> + return NOTIFY_DONE;
> +}
> +
> static int watchdog_reboot_notifier(struct notifier_block *nb,
> unsigned long code, void *data)
> {
> @@ -299,6 +320,14 @@ static int ___watchdog_register_device(struct watchdog_device *wdd)
> clear_bit(WDOG_STOP_ON_REBOOT, &wdd->status);
> }
>
> + /* Module parameter to force watchdog policy on panic. */
> + if (stop_on_panic != -1) {
> + if (stop_on_panic && !test_bit(WDOG_NO_WAY_OUT, &wdd->status))
> + set_bit(WDOG_STOP_ON_PANIC, &wdd->status);
> + else
> + clear_bit(WDOG_STOP_ON_PANIC, &wdd->status);
> + }
> +
No longer needed here. See below.
> if (test_bit(WDOG_STOP_ON_REBOOT, &wdd->status)) {
> if (!wdd->ops->stop)
> pr_warn("watchdog%d: stop_on_reboot not supported\n", wdd->id);
> @@ -334,6 +363,16 @@ static int ___watchdog_register_device(struct watchdog_device *wdd)
> wdd->id, ret);
> }
>
> + if (test_bit(WDOG_STOP_ON_PANIC, &wdd->status)) {
> + if (!wdd->ops->stop) {
> + pr_warn("watchdog%d: stop_on_panic not supported\n", wdd->id);
> + } else {
> + wdd->panic_nb.notifier_call = watchdog_panic_notify;
> + atomic_notifier_chain_register(&panic_notifier_list,
> + &wdd->panic_nb);
> + }
> + }
Simplify to
if (stop_on_panic) {
if (!wdd->ops->stop) {
pr_warn("watchdog%d: stop_on_panic not supported\n", wdd->id);
} else {
wdd->panic_nb.notifier_call = watchdog_panic_notify;
atomic_notifier_chain_register(&panic_notifier_list,
&wdd->panic_nb);
set_bit(WDOG_STOP_ON_PANIC, &wdd->status);
}
}
This also fixes the bug where the unregistration function is called
even if the notifier was not actually registered.
One thing I just realized is that we'll have to figure out if atomic
notifiers can be used here unconditionally. Unless I am missing
something, watchdog stop functions can sleep. Of course, sleeping
while panic isn't a good idea. That means we _may_ need a driver
flag indicating either that the stop function can sleep or that it
won't. If we need that, I suggest we add WDIOF_STOP_MAYSLEEP or
similar to the watchdog_info options field.
Thanks,
Guenter
Powered by blists - more mailing lists