Message-ID: <CA+55aFys9zddHAoKURGAeu0QYD_MC11uEyOQGU_4Ybsf200NkQ@mail.gmail.com>
Date:	Mon, 16 Jan 2012 16:18:46 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Suresh Siddha <suresh.b.siddha@...el.com>
Cc:	Ingo Molnar <mingo@...e.hu>, Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	asit.k.mallick@...el.com
Subject: Re: [patch] x86, tsc: fix SMI induced variation in quick_pit_calibrate()

On Mon, Jan 16, 2012 at 12:15 PM, Suresh Siddha
<suresh.b.siddha@...el.com> wrote:
> Linus, We are seeing NTP failures on a big cluster as a result of big
> variation in calibrated TSC values. Our debug showed that it is indeed
> because of the SMI and its effect on quick pit calibration. Appended
> patch helps fix it. It ran over the weekend boot tests without any
> failures.

Ok, I think your patch is wrong.

HOWEVER.

I think it may be *close* to right.

So what happens is that right now we calculate the "deltatsc" over the
last PIT read - the one that returned the new MSB.

But what we *should* do is to calculate deltatsc over the *two* last
PIT reads - so that we have the time delay over seeing both the old
MSB _and_ seeing the new one. That's the true measure of how precisely
we caught the "MSB changed" thing, after all!

So I think your patch is a total hack that just compares the two last
TSC deltas, but it's actually close to the "correct" thing in that it
does start taking the time to see the last of the previous MSB into
account.

So this patch changes pit_expect_msb() so that

 - the 'tsc' is the TSC in between the two reads that read the MSB
change from the PIT (same as before)

 - the 'delta' is the difference in TSC from *before* the MSB changed
to *after* the MSB changed.

Now the delta is twice as big as before (it covers four PIT accesses,
roughly 4us), so the comments might have to be updated to match, but
the rest of the code should "just work" (except it might loop a bit
longer, and maybe it gives closer to 250 ppm precision).
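
To make the two definitions of 'delta' concrete, here is a rough
user-space sketch of the idea. It is NOT the attached patch: every
sim_* name, the extra 'old_deltap' parameter, and both cycle constants
are made up for illustration, and the PIT and TSC are replaced by a
fake counter so it can run anywhere. The fake port read latches the
MSB first and then "pays" for the access, and a one-shot fake SMI can
be scheduled to stall the CPU right after a read.

```c
#include <assert.h>
#include <stdint.h>

#define CYCLES_PER_PIT_READ 2000u    /* assumed: ~1 us per port read at 2 GHz */
#define CYCLES_PER_MSB      429000u  /* assumed: ~256 PIT ticks at 2 GHz */

static uint64_t sim_tsc;      /* simulated cycle counter */
static uint64_t sim_smi_at;   /* simulated SMI fires once sim_tsc reaches this */
static uint64_t sim_smi_len;  /* cycles the SMI steals; 0 means no SMI */

static void sim_reset(uint64_t smi_at, uint64_t smi_len)
{
	sim_tsc = 0;
	sim_smi_at = smi_at;
	sim_smi_len = smi_len;
}

static uint64_t sim_get_cycles(void)
{
	return sim_tsc;
}

/* One simulated PIT port read: latch the MSB, then pay for the access. */
static uint8_t sim_inb(void)
{
	uint8_t msb = (uint8_t)(0xff - sim_tsc / CYCLES_PER_MSB);

	sim_tsc += CYCLES_PER_PIT_READ;
	if (sim_smi_len && sim_tsc >= sim_smi_at) {
		sim_tsc += sim_smi_len;	/* one-shot SMI stalls the CPU */
		sim_smi_len = 0;
	}
	return msb;
}

static int sim_pit_verify_msb(uint8_t val)
{
	sim_inb();			/* ignore LSB */
	return sim_inb() == val;
}

/*
 * Spin while the MSB still equals 'val'.  On return:
 *   *tscp       - TSC taken after the last read that saw the old MSB
 *   *old_deltap - TSC spent on the final verify only (two port reads)
 *   *deltap     - TSC spent on the last two verifies (four port reads)
 */
static int sim_pit_expect_msb(uint8_t val, uint64_t *tscp,
			      unsigned long *deltap, unsigned long *old_deltap)
{
	int count;
	uint64_t tsc = 0, prev_tsc = 0, now;

	assert(tscp && deltap && old_deltap);
	for (count = 0; count < 50000; count++) {
		if (!sim_pit_verify_msb(val))
			break;
		prev_tsc = tsc;
		tsc = sim_get_cycles();
	}
	now = sim_get_cycles();
	*old_deltap = (unsigned long)(now - tsc);
	*deltap = (unsigned long)(now - prev_tsc);
	*tscp = tsc;
	return count > 5;
}
```

With these made-up constants and no SMI, the wide delta comes out as
8000 cycles (four port reads) and the narrow one as 4000. If a
100000-cycle SMI is scheduled with sim_reset(428000, 100000), it lands
right after the last read that still saw the old MSB: the narrow delta
stays at 4000 and looks fine even though 'tsc' is way off, while the
wide delta jumps to 108000 and the caller can reject the sample.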

Does this fix it for you? I have NOT tested it in any way.

                     Linus

View attachment "patch.diff" of type "text/x-patch" (686 bytes)
