Date:	Thu, 23 Apr 2015 23:37:13 +0200 (CEST)
From:	Thomas Gleixner <tglx@...utronix.de>
To:	Andi Kleen <ak@...ux.intel.com>
cc:	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
	Don Zickus <dzickus@...hat.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Anton Blanchard <anton@...ba.org>,
	Michael Ellerman <mpe@...erman.id.au>,
	linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [PATCH v1] watchdog: Use a reference cycle counter to avoid
 scaling issues

On Thu, 23 Apr 2015, Andi Kleen wrote:
> On Thu, Apr 23, 2015 at 10:01:04PM +0200, Thomas Gleixner wrote:
> > On Thu, 23 Apr 2015, Alexander Shishkin wrote:
> > 
> > > The problem with using cycle counter for NMI watchdog is that its
> > > frequency changes with the corresponding core's frequency. This means
> > > that, in particular, if the core frequency scales up, watchdog NMI will
> > > arrive more frequently than what user requested through watchdog_thresh
> > > and also increasing the probability of setting off the hardlockup detector,
> > > because the corresponding hrtimer will keep firing at the same intervals
> > > regardless of the core frequency. And, if the core can turbo to up to 2.5x
> > > its base frequency (and therefore TSC) [1], we'll have the hrtimer and NMI
> > 
> > So you are saying that this M-5Y10 has a non-constant TSC again? You
> > really can't be serious about that.
> 
> The TSC is constant, but the maximum frequency can be >=2.5TSC, so the
> watchdog which uses cycles can have that much error.

That makes sense. I misinterpreted the (therefore TSC) above.

The nmi_watchdog then fires not once per watchdog_thresh seconds, but
once per watchdog_thresh * 400ms. So the hrtimer might not get a
chance to increment hrtimer_interrupts before the hardlockup detector
triggers.
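
To put numbers on that, here is a small userspace sketch of the arithmetic. It assumes the counter is programmed with base frequency * watchdog_thresh cycles and then counts at the actual core frequency. The function names and the 800 MHz / 2 GHz figures are illustrative assumptions, not taken from the patch.

```c
#include <stdint.h>

/*
 * Cycles programmed into the counter. Assumption: the sample period is
 * derived from the base (TSC) frequency times the threshold in seconds.
 */
static uint64_t nmi_sample_cycles(uint64_t base_hz, unsigned int thresh_s)
{
	return base_hz * thresh_s;
}

/*
 * Wall-clock time in milliseconds until that many cycles have elapsed
 * when the core actually runs at core_hz.
 */
static uint64_t nmi_period_ms(uint64_t cycles, uint64_t core_hz)
{
	return cycles * 1000 / core_hz;
}
```

With base_hz = 800000000, core_hz = 2000000000 and watchdog_thresh = 10, nmi_period_ms() comes out at 4000, which is 10 * 400ms. At the base frequency it is the intended 10000.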

But do we really need that whole calibration maze to deal with this?

Definitely not.

We can just detect the deviation in the callback itself:

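       /* NMI-safe monotonic clock, independent of core frequency */
       u64 now = ktime_get_mono_fast_ns();

       /* NMI arrived early (core runs above base frequency): ignore it */
       if (now - __this_cpu_read(nmi_timestamp) < period)
               return;

       /* A full period has elapsed: record it and do the lockup check */
       __this_cpu_write(nmi_timestamp, now);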

It's that simple.
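
For anyone who wants to play with the idea outside the kernel, a rough userspace model of that check follows. struct nmi_ratelimit, nmi_period_elapsed() and count_checks() are made-up names for this sketch. The struct stands in for the per-CPU nmi_timestamp and for period, and the caller supplies the time instead of reading ktime_get_mono_fast_ns().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU state used by the callback. */
struct nmi_ratelimit {
	uint64_t last_ns;	/* models nmi_timestamp */
	uint64_t period_ns;	/* models the intended watchdog period */
};

/* Returns true only when a full period has passed since the last hit. */
static bool nmi_period_elapsed(struct nmi_ratelimit *rl, uint64_t now_ns)
{
	if (now_ns - rl->last_ns < rl->period_ns)
		return false;

	rl->last_ns = now_ns;
	return true;
}

/* Feed n NMIs spaced interval_ns apart and count how many get through. */
static unsigned int count_checks(struct nmi_ratelimit *rl,
				 uint64_t interval_ns, unsigned int n)
{
	unsigned int i, checks = 0;

	for (i = 1; i <= n; i++)
		if (nmi_period_elapsed(rl, (uint64_t)i * interval_ns))
			checks++;

	return checks;
}
```

With period_ns set to 10 seconds and NMIs arriving every 4 seconds, only every third NMI passes, so the check effectively runs every 12 seconds. The filter rounds up to the next NMI after the period, which errs on the side of checking too late rather than too early.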

Thanks,

	tglx