Message-Id: <20230512164056.8f1e4e23032f7f7f5cb69df0@linux-foundation.org>
Date: Fri, 12 May 2023 16:40:56 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Song Liu <song@...nel.org>
Cc: <linux-kernel@...r.kernel.org>, <kernel-team@...a.com>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH] watchdog: Prefer use "ref-cycles" for NMI watchdog
On Tue, 9 May 2023 15:17:00 -0700 Song Liu <song@...nel.org> wrote:
> NMI watchdog permanently consumes one hardware counter per CPU on the
> system. For systems that use many hardware counters, this causes more
> aggressive time multiplexing of perf events.
>
> OTOH, some CPUs (mostly Intel) support the "ref-cycles" event, which is
> rarely used. Try to use "ref-cycles" for the watchdog if the CPU supports
> it, so that one more hardware counter is available to the user. If the CPU
> doesn't support "ref-cycles", fall back to "cycles".
>
> The downside of this change is that users of "ref-cycles" need to disable
> nmi_watchdog.
>
> ...
>
> @@ -286,6 +286,12 @@ int __init hardlockup_detector_perf_init(void)
>  {
>  	int ret = hardlockup_detector_event_create();
>
> +	if (ret) {
If we get here, hardlockup_detector_event_create() has sent a scary
pr_debug message.
> +		/* Failed to create "ref-cycles", try "cycles" instead */
> +		wd_hw_attr.config = PERF_COUNT_HW_CPU_CYCLES;
> +		ret = hardlockup_detector_event_create();
So it would be good to emit a followup message here telling users that
things are OK. Or tell the user we're retrying with a different
counter, etc.
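A followup message along those lines might look roughly like the sketch
below. This is a userspace mock-up, not kernel code: the enum, the
cpu_has_ref_cycles knob, event_create() and the message wording are all
hypothetical stand-ins, and printf()/fprintf() take the place of
pr_info()/pr_debug().

```c
#include <stdio.h>

/* Hypothetical stand-ins for the two perf hardware event IDs. */
enum hw_event { HW_REF_CPU_CYCLES, HW_CPU_CYCLES };

/* Mock of wd_hw_attr.config: which event the watchdog will request. */
enum hw_event wd_config = HW_REF_CPU_CYCLES;

/* Mock knob: set to 1 to pretend this CPU supports "ref-cycles". */
int cpu_has_ref_cycles;

/* Mock of hardlockup_detector_event_create(): 0 on success, -1 on failure. */
int event_create(void)
{
	if (wd_config == HW_REF_CPU_CYCLES && !cpu_has_ref_cycles) {
		/* Stands in for the alarming debug message on failure. */
		fprintf(stderr, "debug: perf event create failed\n");
		return -1;
	}
	return 0;
}

/* Mock of the patched init path, with the suggested followup message. */
int watchdog_init(void)
{
	int ret = event_create();

	if (ret) {
		/* Tell the user the failure above is being handled. */
		printf("NMI watchdog: \"ref-cycles\" unavailable, retrying with \"cycles\"\n");
		wd_config = HW_CPU_CYCLES;
		ret = event_create();
	}

	if (ret)
		printf("Perf NMI watchdog permanently disabled\n");

	return ret;
}
```

Calling watchdog_init() with cpu_has_ref_cycles left at 0 exercises the
retry path; the point is only that the retry notice lands right after the
failure message, so the log does not end on an unexplained error.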
> +	}
> +
>  	if (ret) {
>  		pr_info("Perf NMI watchdog permanently disabled\n");
>  	} else {
> --
> 2.34.1