Message-ID: <alpine.DEB.2.20.1706212122030.2152@nanos>
Date: Wed, 21 Jun 2017 21:59:34 +0200 (CEST)
From: Thomas Gleixner <tglx@...utronix.de>
To: Andi Kleen <ak@...ux.intel.com>
cc: Kan Liang <kan.liang@...el.com>, linux-kernel@...r.kernel.org,
dzickus@...hat.com, mingo@...nel.org, akpm@...ux-foundation.org,
babu.moger@...cle.com, atomlin@...hat.com, prarit@...hat.com,
torvalds@...ux-foundation.org, peterz@...radead.org,
eranian@...gle.com, acme@...hat.com, stable@...r.kernel.org
Subject: Re: [PATCH V2] kernel/watchdog: fix spurious hard lockups
On Wed, 21 Jun 2017, Andi Kleen wrote:
> On Wed, Jun 21, 2017 at 05:12:06PM +0200, Thomas Gleixner wrote:
> > On Wed, 21 Jun 2017, kan.liang@...el.com wrote:
> > >
> > > #ifdef CONFIG_HARDLOCKUP_DETECTOR
> > > +/*
> > > + * The NMI watchdog relies on PERF_COUNT_HW_CPU_CYCLES event, which
> > > + * can tick faster than the measured CPU Frequency due to Turbo mode.
> > > + * That can lead to spurious timeouts.
> > > + * To workaround the issue, extending the period by 3 times.
> > > + */
> > > u64 hw_nmi_get_sample_period(int watchdog_thresh)
> > > {
> > > - return (u64)(cpu_khz) * 1000 * watchdog_thresh;
> > > + return (u64)(cpu_khz) * 1000 * watchdog_thresh * 3;
> >
> > The maximum turbo frequency of any given machine can be retrieved.
>
> Not reliably, e.g. not in virtualization. Also it would require
> model specific checks, so as soon as you have a new model and an
> old kernel it could still randomly fail.
And that's in no way an argument for breaking every existing setup which
relies on the way stuff works now. Lots of crap on new models does not work
with older kernels.
Fact is, this is a user visible change and people have pointed out that it
will break their setups and expectations. So no, this is not going to
happen by just slapping a randomly chosen factor on it.
Fact is that the watchdog has worked this way since it got implemented, and
it can sensibly be argued that the way it works is correct.
The threshold is based on the non-turbo max frequency of the CPU. That's how
the period is calculated. And that makes sense in terms of frequency
scaling in either direction.
If your CPU is stuck for 1 second @2GHz, then it wastes exactly the same
amount of cycles when it is stuck for 0.5 seconds @4GHz. So the watchdog
can rightfully kick in after that and tell the world that crap is stuck.
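
To make that arithmetic concrete, here is a minimal sketch. Only the period formula is taken from the unpatched code quoted above; the helper names sample_period_cycles() and ms_until_nmi() are hypothetical and not kernel functions, and the model assumes the cycle counter ticks at a constant effective frequency for the whole period.

```c
#include <stdint.h>

/*
 * Cycles per watchdog period, same formula as the unpatched
 * hw_nmi_get_sample_period(): base frequency in kHz * 1000 * threshold
 * in seconds.
 */
static uint64_t sample_period_cycles(uint64_t cpu_khz, int watchdog_thresh)
{
	return cpu_khz * 1000 * (uint64_t)watchdog_thresh;
}

/*
 * Wall-clock milliseconds until a cycle counter running at actual_khz
 * has counted period_cycles. 1 kHz is one cycle per millisecond, so the
 * division yields milliseconds directly. actual_khz must be non-zero.
 */
static uint64_t ms_until_nmi(uint64_t period_cycles, uint64_t actual_khz)
{
	return period_cycles / actual_khz;
}
```

With a base frequency of 2000000 kHz and a threshold of 1 second the period is 2000000000 cycles. ms_until_nmi() then returns 1000 at 2000000 kHz and 500 at 4000000 kHz: the cycle budget is identical, only the wall-clock time differs.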
If your newfangled machine triggers the watchdog after 0.2 seconds, then
you already have a mechanism to fix that. It's a user space interface after
all. There are a lot of things which need to be adjusted with new machines
and if the boot default of 10 seconds is not sufficient, then something is
really wrong with these systems.
TBH, I rather wish the hard lockup watchdog threshold were
configurable in milliseconds rather than seconds to catch crap faster.
Thanks,
tglx