[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230515103316.GG83892@hirez.programming.kicks-ass.net>
Date: Mon, 15 May 2023 12:33:16 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Song Liu <song@...nel.org>
Cc: linux-kernel@...r.kernel.org, kernel-team@...a.com,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH] watchdog: Prefer use "ref-cycles" for NMI watchdog
On Fri, May 12, 2023 at 09:43:48AM -0700, Song Liu wrote:
> On Fri, May 12, 2023 at 5:47 AM Peter Zijlstra <peterz@...radead.org> wrote:
> >
> > On Tue, May 09, 2023 at 03:17:00PM -0700, Song Liu wrote:
> > > NMI watchdog permanently consumes one hardware counters per CPU on the
> > > system. For systems that use many hardware counters, this causes more
> > > aggressive time multiplexing of perf events.
> > >
> > > OTOH, some CPUs (mostly Intel) support "ref-cycles" event, which is rarely
> > > used. Try use "ref-cycles" for the watchdog. If the CPU supports it, so
> > > that one more hardware counter is available to the user. If the CPU doesn't
> > > support "ref-cycles", fall back to "cycles".
> > >
> > > The downside of this change is that users of "ref-cycles" need to disable
> > > nmi_watchdog.
> >
> > Urgh..
> >
> > how about something like so instead; then you can use whatever event you
> > like...
>
> Configuring this at boot time is not ideal for our use case. Currently, we have
> some systems support ref-cycles and some don't. So this is one more kernel
> argument we need to make sure to get correctly. This also means we cannot
> change this setting without reboot.
You can still add the fallback (with a suitable pr_warn() that the
requested config is not valid or so).
> Another idea I have is to use sysctl kernel.nmi_watchdog, so we can change
> the event after boot. Would this work?
Yeah, I suppose you can also extend the thing to allow runtime changes
to the values, provided the NMI watchdog is disabled at the time or
somesuch.
> Btw, the limitation here (ref-cycles users need to disable NMI watchdog) comes
> from the limitation that the programmable counters cannot do ref-cycles. Is this
> something we may change (or already changed)?
I really don't know .. and if it's not in the SDM I probably couldn't
tell you anyway :/
Powered by blists - more mailing lists