[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAD=FV=WRzaLbLQ65usGeFq3ya=DV8cYyHQina_721EFoSTdBGA@mail.gmail.com>
Date: Wed, 7 Jun 2023 16:35:09 -0700
From: Doug Anderson <dianders@...omium.org>
To: Petr Mladek <pmladek@...e.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
kgdb-bugreport@...ts.sourceforge.net, linux-kernel@...r.kernel.org,
Nicholas Piggin <npiggin@...il.com>,
Michael Ellerman <mpe@...erman.id.au>,
linuxppc-dev@...ts.ozlabs.org,
Christophe Leroy <christophe.leroy@...roup.eu>,
sparclinux@...r.kernel.org,
"David S . Miller" <davem@...emloft.net>,
linux-perf-users@...r.kernel.org
Subject: Re: [PATCH 2/7] watchdog/hardlockup: Make the config checks more straightforward
Hi,
On Wed, Jun 7, 2023 at 8:25 AM Petr Mladek <pmladek@...e.com> wrote:
>
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 422f0ffa269e..13c6e596cf9e 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -404,17 +404,27 @@ config HAVE_NMI_WATCHDOG
> depends on HAVE_NMI
> bool
> help
> - The arch provides a low level NMI watchdog. It provides
> - asm/nmi.h, and defines its own watchdog_hardlockup_probe() and
> - arch_touch_nmi_watchdog().
> + The arch provides its own hardlockup detector implementation instead
> + of the generic perf one.
nit: did you mean to have different wording here compared to
HAVE_HARDLOCKUP_DETECTOR_ARCH? Here you say "the generic perf one" and
there you say "the generic ones", though it seems like you mean them
to be the same.
> +
> + Sparc64 defines this variable without HAVE_HARDLOCKUP_DETECTOR_ARCH.
> + It does _not_ use the command line parameters and sysctl interface
> + used by generic hardlockup detectors. Instead it is enabled/disabled
> + by the top-level watchdog interface that is common for both softlockup
> + and hardlockup detectors.
>
> config HAVE_HARDLOCKUP_DETECTOR_ARCH
> bool
> select HAVE_NMI_WATCHDOG
> help
> - The arch chooses to provide its own hardlockup detector, which is
> - a superset of the HAVE_NMI_WATCHDOG. It also conforms to config
> - interfaces and parameters provided by hardlockup detector subsystem.
> + The arch provides its own hardlockup detector implementation instead
> + of the generic ones.
> +
> + It uses the same command line parameters, and sysctl interface,
> + as the generic hardlockup detectors.
> +
> + HAVE_NMI_WATCHDOG is selected to build the code shared with
> + the sparc64 specific implementation.
>
> config HAVE_PERF_REGS
> bool
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 3e91fa33c7a0..d201f5d3876b 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1035,16 +1035,33 @@ config BOOTPARAM_SOFTLOCKUP_PANIC
>
> Say N if unsure.
>
> +config HAVE_HARDLOCKUP_DETECTOR_BUDDY
> + bool
> + depends on SMP
> + default y
I think you simplify your life if you also add:
depends on !HAVE_NMI_WATCHDOG
The existing config system has always assumed that no architecture
defines both HAVE_HARDLOCKUP_DETECTOR_PERF and HAVE_NMI_WATCHDOG
(symbols would have clashed and you would get a link error as two
watchdogs try to implement the same weak symbol). If you add the extra
dependency to "buddy" as per above, then a few things below fall out.
I'll try to point them out below.
> +
> #
> -# arch/ can define HAVE_HARDLOCKUP_DETECTOR_ARCH to provide their own hard
> -# lockup detector rather than the perf based detector.
> +# Global switch whether to build a hardlockup detector at all. It is available
> +# only when the architecture supports at least one implementation. There are
> +# two exceptions. The hardlockup detector is newer enabled on:
s/newer/never/
> +#
> +# s390: it reported many false positives there
> +#
> +# sparc64: has a custom implementation which is not using the common
> +# hardlockup command line options and sysctl interface.
> +#
> +# Note that HAVE_NMI_WATCHDOG is used to distinguish the sparc64 specific
> +# implementaion. It is automatically enabled also for other arch-specific
> +# variants which set HAVE_HARDLOCKUP_DETECTOR_ARCH. It makes the check
> +# of avaialable and supported variants quite tricky.
> #
> config HARDLOCKUP_DETECTOR
> bool "Detect Hard Lockups"
> depends on DEBUG_KERNEL && !S390
> - depends on HAVE_HARDLOCKUP_DETECTOR_NON_ARCH || HAVE_HARDLOCKUP_DETECTOR_ARCH
> + depends on ((HAVE_HARDLOCKUP_DETECTOR_PERF || HAVE_HARDLOCKUP_DETECTOR_BUDDY) && !HAVE_NMI_WATCHDOG) || HAVE_HARDLOCKUP_DETECTOR_ARCH
Adding the dependency to buddy (see ablove) would simplify the above
to just this:
depends on HAVE_HARDLOCKUP_DETECTOR_PERF ||
HAVE_HARDLOCKUP_DETECTOR_BUDDY || HAVE_HARDLOCKUP_DETECTOR_ARCH
As per above, it's simply a responsibility of architectures not to
define that they have both "perf" if they have the NMI watchdog, so
it's just buddy to worry about.
> + imply HARDLOCKUP_DETECTOR_PERF
> + imply HARDLOCKUP_DETECTOR_BUDDY
> select LOCKUP_DETECTOR
> - select HARDLOCKUP_DETECTOR_NON_ARCH if HAVE_HARDLOCKUP_DETECTOR_NON_ARCH
>
> help
> Say Y here to enable the kernel to act as a watchdog to detect
> @@ -1055,9 +1072,15 @@ config HARDLOCKUP_DETECTOR
> chance to run. The current stack trace is displayed upon detection
> and the system will stay locked up.
>
> +#
> +# Note that arch-specific variants are always preferred.
> +#
> config HARDLOCKUP_DETECTOR_PREFER_BUDDY
> bool "Prefer the buddy CPU hardlockup detector"
> - depends on HAVE_HARDLOCKUP_DETECTOR_PERF && SMP
> + depends on HARDLOCKUP_DETECTOR
> + depends on HAVE_HARDLOCKUP_DETECTOR_PERF && HAVE_HARDLOCKUP_DETECTOR_BUDDY
> + depends on !HAVE_NMI_WATCHDOG
Can get rid of above "!HAVE_NMI_WATCHDOG" if it's added to
HAVE_HARDLOCKUP_DETECTOR_BUDDY.
> + default n
I'm pretty sure "default n" isn't needed.
> help
> Say Y here to prefer the buddy hardlockup detector over the perf one.
>
> @@ -1071,39 +1094,27 @@ config HARDLOCKUP_DETECTOR_PREFER_BUDDY
>
> config HARDLOCKUP_DETECTOR_PERF
> bool
> - depends on HAVE_HARDLOCKUP_DETECTOR_PERF
> + depends on HARDLOCKUP_DETECTOR
> + depends on HAVE_HARDLOCKUP_DETECTOR_PERF && !HARDLOCKUP_DETECTOR_PREFER_BUDDY
> + depends on !HAVE_NMI_WATCHDOG
We don't really need the "!HAVE_NMI_WATCHDOG". A user wouldn't be able
to mess this up and it's up to an architecture not to define both
HAVE_HARDLOCKUP_DETECTOR_PERF and HAVE_NMI_WATCHDOG.
> select HARDLOCKUP_DETECTOR_COUNTS_HRTIMER
>
> config HARDLOCKUP_DETECTOR_BUDDY
> bool
> - depends on SMP
> + depends on HARDLOCKUP_DETECTOR
> + depends on HAVE_HARDLOCKUP_DETECTOR_BUDDY
> + depends on !HAVE_HARDLOCKUP_DETECTOR_PERF || HARDLOCKUP_DETECTOR_PREFER_BUDDY
> + depends on !HAVE_NMI_WATCHDOG
Get rid of "!HAVE_NMI_WATCHDOG" and it should be in
HAVE_HARDLOCKUP_DETECTOR_BUDDY
Overall, though, I agree that this improves things! Thanks! :-)
Powered by blists - more mailing lists