Date:	Fri, 28 Feb 2014 18:00:34 -0800
From:	Andy Lutomirski <luto@...capital.net>
To:	Stefani Seibold <stefani@...bold.net>
Cc:	Greg KH <gregkh@...uxfoundation.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	X86 ML <x86@...nel.org>, Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>, Andi Kleen <ak@...ux.intel.com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	John Stultz <john.stultz@...aro.org>,
	Pavel Emelyanov <xemul@...allels.com>,
	Cyrill Gorcunov <gorcunov@...nvz.org>,
	andriy.shevchenko@...ux.intel.com, Martin.Runge@...de-schwarz.com,
	Andreas.Brief@...de-schwarz.com
Subject: Re: Final: Add 32 bit VDSO time function support

On Thu, Feb 27, 2014 at 11:22 PM, Stefani Seibold <stefani@...bold.net> wrote:
> On Wednesday, 26.02.2014, at 16:55 -0800, Andy Lutomirski wrote:
>>
>> Once I patch it to work, your 32-bit code is considerably faster than
>> the 64-bit case.  It's enough faster that I suspect a bug.  Dumping
>> the in-memory image shows some rather suspicious nops before the rdtsc
>> instruction.  I suspect that you've forgotten to run the 32-bit vdso
>> through the alternatives code.  This is a nasty bug: it will appear to
>> work, but you'll see non-monotonic times on some SMP systems.
>>
>
> I didn't know this. My basic test case is a KVM guest, which defaults
> to 1 CPU. Thanks for discovering the issue.

This leads to a potentially interesting question: is rdtsc_barrier()
actually necessary on UP?  IIRC the point is that, if an
rdtsc_barrier(); rdtsc in one thread is "before" (in the sense of
being synchronized by some memory operation) an rdtsc_barrier(); rdtsc
in another thread, then the first rdtsc needs to return an earlier or
equal time to the second one.

I assume that no UP CPU is silly enough to execute two rdtsc
instructions out of order relative to each other in the absence of
barriers.  So this is a nonissue on UP.
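To make the ordering requirement concrete, here is a minimal user-space
sketch of a "barrier, then rdtsc" read.  It is not the kernel's code:
ordered_tsc() and reads_are_monotonic() are hypothetical names, and the
lfence is hard-coded as an assumption, whereas (as far as I know)
rdtsc_barrier() is patched by the alternatives code to whichever fence
the CPU needs and is left as nops when unpatched.  The non-x86 branch
exists only so the sketch builds elsewhere.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper, not the kernel's rdtsc_barrier(): fence, then read. */
static inline uint64_t ordered_tsc(void)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t lo, hi;

    /* lfence: earlier instructions complete before rdtsc executes. */
    __asm__ __volatile__("lfence\n\trdtsc"
                         : "=a"(lo), "=d"(hi)
                         :
                         : "memory");
    return ((uint64_t)hi << 32) | lo;
#else
    /* Non-x86 fallback so the sketch still builds: monotonic clock in ns. */
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
#endif
}

/* Returns 1 if n back-to-back reads on this thread never went backwards. */
static int reads_are_monotonic(int n)
{
    uint64_t prev = ordered_tsc();

    for (int i = 0; i < n; i++) {
        uint64_t now = ordered_tsc();

        if (now < prev)
            return 0;
        prev = now;
    }
    return 1;
}
```

This only exercises the single-thread case, which is the one assumed
above to be safe even without the barrier.  The cross-thread guarantee
additionally needs the two reads to be ordered by some memory operation
between the threads.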

On the other hand, suppose that some code does:

volatile long x = *(something that's not in cache)
clock_gettime

I can imagine a modern CPU speculating far enough ahead that the rdtsc
happens *before* the cache miss.  This won't cause visible
non-monotonicity as far as I can see, but it might annoy people who
try to benchmark their code.
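For the benchmarking case, a rough sketch of how user code could defend
itself is to fence between the load and the timestamp.  The helper name
time_one_load_ns() and the load_fence() macro are made up for this
example, and the choice of lfence is an assumption about what is
sufficient on x86; on other architectures the macro is only a compiler
barrier and gives no hardware ordering.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
/* On x86, later instructions do not start until earlier loads complete. */
#define load_fence() __asm__ __volatile__("lfence" ::: "memory")
#else
/* Compiler barrier only: no hardware ordering is implied here. */
#define load_fence() __asm__ __volatile__("" ::: "memory")
#endif

/* Hypothetical helper: nanoseconds elapsed around a single load from *p. */
static int64_t time_one_load_ns(const volatile long *p, long *value_out)
{
    struct timespec t0, t1;
    long x;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    x = *p;            /* may miss the cache */
    load_fence();      /* keep the second timestamp after the load */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    *value_out = x;
    return (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
           + (t1.tv_nsec - t0.tv_nsec);
}
```

With a barrier inside clock_gettime itself the explicit fence is
redundant; it only matters if the barrier were dropped on UP.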

Note: actually making this change might be a bit tricky.  I don't know
if the alternatives code is smart enough to drop the barrier only on UP.

--Andy