lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAD=FV=WttSbhpct2bhwF+EHgisSZ2G+6G6chNebosmAUtmv9Hw@mail.gmail.com>
Date: Mon, 6 Oct 2025 13:13:49 -0700
From: Doug Anderson <dianders@...omium.org>
To: Jinchao Wang <wangjinchao600@...il.com>
Cc: Christophe Leroy <christophe.leroy@...roup.eu>, Peter Zijlstra <peterz@...radead.org>, 
	Will Deacon <will@...nel.org>, Yunhui Cui <cuiyunhui@...edance.com>, akpm@...ux-foundation.org, 
	catalin.marinas@....com, maddy@...ux.ibm.com, mpe@...erman.id.au, 
	npiggin@...il.com, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, 
	dave.hansen@...ux.intel.com, hpa@...or.com, acme@...nel.org, 
	namhyung@...nel.org, mark.rutland@....com, alexander.shishkin@...ux.intel.com, 
	jolsa@...nel.org, adrian.hunter@...el.com, kan.liang@...ux.intel.com, 
	kees@...nel.org, masahiroy@...nel.org, aliceryhl@...gle.com, ojeda@...nel.org, 
	thomas.weissschuh@...utronix.de, xur@...gle.com, ruanjinjie@...wei.com, 
	gshan@...hat.com, maz@...nel.org, suzuki.poulose@....com, 
	zhanjie9@...ilicon.com, yangyicong@...ilicon.com, gautam@...ux.ibm.com, 
	arnd@...db.de, zhao.xichao@...o.com, rppt@...nel.org, lihuafei1@...wei.com, 
	coxu@...hat.com, jpoimboe@...nel.org, yaozhenguo1@...il.com, 
	luogengkun@...weicloud.com, max.kellermann@...os.com, tj@...nel.org, 
	yury.norov@...il.com, thorsten.blum@...ux.dev, x86@...nel.org, 
	linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org, 
	linuxppc-dev@...ts.ozlabs.org, linux-perf-users@...r.kernel.org, 
	Ian Rogers <irogers@...gle.com>
Subject: Re: [RFC PATCH V1] watchdog: Add boot-time selection for hard lockup detector

Hi,

On Tue, Sep 16, 2025 at 11:55 PM Jinchao Wang <wangjinchao600@...il.com> wrote:
>
> On Wed, Sep 17, 2025 at 08:08:57AM +0200, Christophe Leroy wrote:
> >
> >
> > Le 16/09/2025 à 16:50, Jinchao Wang a écrit :
> > > Currently, the hard lockup detector is selected at compile time via
> > > Kconfig, which requires a kernel rebuild to switch implementations.
> > > This is inflexible, especially on systems where a perf event may not
> > > be available or may be needed for other tasks.
> > >
> > > This commit refactors the hard lockup detector to replace a rigid
> > > compile-time choice with a flexible build-time and boot-time solution.
> > > The patch supports building the kernel with either detector
> > > independently, or with both. When both are built, a new boot parameter
> > > `hardlockup_detector="perf|buddy"` allows the selection at boot time.
> > > This is a more robust and user-friendly design.
> > >
> > > This patch is a follow-up to the discussion on the kernel mailing list
> > > regarding the preference and future of the hard lockup detectors. It
> > > implements a flexible solution that addresses the community's need to
> > > select an appropriate detector at boot time.
> > >
> > > The core changes are:
> > > - The `perf` and `buddy` watchdog implementations are separated into
> > >    distinct functions (e.g., `watchdog_perf_hardlockup_enable`).
> > > - Global function pointers are introduced (`watchdog_hardlockup_enable_ptr`)
> > >    to serve as a single API for the entire feature.
> > > - A new `hardlockup_detector=` boot parameter is added to allow the
> > >    user to select the desired detector at boot time.
> > > - The Kconfig options are simplified by removing the complex
> > >    `HARDLOCKUP_DETECTOR_PREFER_BUDDY` and allowing both detectors to be
> > >    built without mutual exclusion.
> > > - The weak stubs are updated to call the new function pointers,
> > >    centralizing the watchdog logic.
> > >
> > > Link: https://lore.kernel.org/all/20250915035355.10846-1-cuiyunhui@bytedance.com/
> > > Link: https://lore.kernel.org/all/CAD=FV=WWUiCi6bZCs_gseFpDDWNkuJMoL6XCftEo6W7q6jRCkg@mail.gmail.com/
> > >
> > > Signed-off-by: Jinchao Wang <wangjinchao600@...il.com>
> > > ---
> > >   .../admin-guide/kernel-parameters.txt         |  7 +++
> > >   include/linux/nmi.h                           |  6 +++
> > >   kernel/watchdog.c                             | 46 ++++++++++++++++++-
> > >   kernel/watchdog_buddy.c                       |  7 +--
> > >   kernel/watchdog_perf.c                        | 10 ++--
> > >   lib/Kconfig.debug                             | 37 +++++++--------
> > >   6 files changed, 85 insertions(+), 28 deletions(-)
> > >
> > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > > index 5a7a83c411e9..0af214ee566c 100644
> > > --- a/Documentation/admin-guide/kernel-parameters.txt
> > > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > > @@ -1828,6 +1828,13 @@
> > >                     backtraces on all cpus.
> > >                     Format: 0 | 1
> > > +   hardlockup_detector=
> > > +                   [perf, buddy] Selects the hard lockup detector to use at
> > > +                   boot time.
> > > +                   Format: <string>
> > > +                   - "perf": Use the perf-based detector.
> > > +                   - "buddy": Use the buddy-based detector.
> > > +
> > >     hash_pointers=
> > >                     [KNL,EARLY]
> > >                     By default, when pointers are printed to the console
> > > diff --git a/include/linux/nmi.h b/include/linux/nmi.h
> > > index cf3c6ab408aa..9298980ce572 100644
> > > --- a/include/linux/nmi.h
> > > +++ b/include/linux/nmi.h
> > > @@ -100,6 +100,9 @@ void watchdog_hardlockup_check(unsigned int cpu, struct pt_regs *regs);
> > >   #endif
> > >   #if defined(CONFIG_HARDLOCKUP_DETECTOR_PERF)
> > > +void watchdog_perf_hardlockup_enable(unsigned int cpu);
> > > +void watchdog_perf_hardlockup_disable(unsigned int cpu);
> > > +extern int watchdog_perf_hardlockup_probe(void);
> >
> > No 'extern' on function prototypes, this is pointless.
> Got it.
> >
> > >   extern void hardlockup_detector_perf_stop(void);
> > >   extern void hardlockup_detector_perf_restart(void);
> > >   extern void hardlockup_config_perf_event(const char *str);
> > > @@ -120,6 +123,9 @@ void watchdog_hardlockup_disable(unsigned int cpu);
> > >   void lockup_detector_reconfigure(void);
> > >   #ifdef CONFIG_HARDLOCKUP_DETECTOR_BUDDY
> > > +void watchdog_buddy_hardlockup_enable(unsigned int cpu);
> > > +void watchdog_buddy_hardlockup_disable(unsigned int cpu);
> > > +int watchdog_buddy_hardlockup_probe(void);
> > >   void watchdog_buddy_check_hardlockup(int hrtimer_interrupts);
> > >   #else
> > >   static inline void watchdog_buddy_check_hardlockup(int hrtimer_interrupts) {}
> > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> > > index 80b56c002c7f..85451d24a77d 100644
> > > --- a/kernel/watchdog.c
> > > +++ b/kernel/watchdog.c
> > > @@ -55,6 +55,37 @@ unsigned long *watchdog_cpumask_bits = cpumask_bits(&watchdog_cpumask);
> > >   #ifdef CONFIG_HARDLOCKUP_DETECTOR
> > > +#ifdef CONFIG_HARDLOCKUP_DETECTOR_PERF
> > > +/* The global function pointers */
> > > +void (*watchdog_hardlockup_enable_ptr)(unsigned int cpu) = watchdog_perf_hardlockup_enable;
> > > +void (*watchdog_hardlockup_disable_ptr)(unsigned int cpu) = watchdog_perf_hardlockup_disable;
> > > +int (*watchdog_hardlockup_probe_ptr)(void) = watchdog_perf_hardlockup_probe;
> >
> > As this is set only once at startup, can we use static_call instead of
> > function pointers ?
> >
> > Also, can it me made __ro_after_init ?
> Not really, this is just an RFC patch, and there is no consensus yet.
> If it is included in the final consensus, I will handle it in the next version.
> >
> > > +#elif defined(CONFIG_HARDLOCKUP_DETECTOR_BUDDY)
> > > +void (*watchdog_hardlockup_enable_ptr)(unsigned int cpu) = watchdog_buddy_hardlockup_enable;
> > > +void (*watchdog_hardlockup_disable_ptr)(unsigned int cpu) = watchdog_buddy_hardlockup_disable;
> > > +int (*watchdog_hardlockup_probe_ptr)(void) = watchdog_buddy_hardlockup_probe;
> > > +#endif
> > > +
> > > +#ifdef CONFIG_HARDLOCKUP_DETECTOR_MULTIPLE
> > > +static char *hardlockup_detector_type = "perf"; /* Default to perf */

I'd still at least hope that "buddy" could be the default. While
"perf" can still be useful to catch some things that "buddy" can't,
"buddy" can also catch some things that perf can't so picking the
lower-overhead one as the default is perhaps better?


> I think we should first resolve the consensus issue:
> - Should we keep both perf and buddy watchdogs? (probably yes already)

Sounds like people want to keep perf, so "yes".


> - Should the watchdog type be changeable at boot time?

Seems like a good start.


> - Should the watchdog type be changeable at runtime?

If it was easy/possible then sure, but I don't think it's a giant deal
to have something like this only changeable at boot time, like your
patch does. The fact that your patch is pretty simple is definitely a
big win.


IMO it would be worth sending it out as a real patch.

-Doug

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ