Message-ID: <CALCETrUH1tZcu0upkTV0gsXtpnf-BAY4vmWKnN19=ZXsbqWT8A@mail.gmail.com>
Date: Fri, 28 Feb 2014 18:00:34 -0800
From: Andy Lutomirski <luto@...capital.net>
To: Stefani Seibold <stefani@...bold.net>
Cc: Greg KH <gregkh@...uxfoundation.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
X86 ML <x86@...nel.org>, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, Andi Kleen <ak@...ux.intel.com>,
Andrea Arcangeli <aarcange@...hat.com>,
John Stultz <john.stultz@...aro.org>,
Pavel Emelyanov <xemul@...allels.com>,
Cyrill Gorcunov <gorcunov@...nvz.org>,
andriy.shevchenko@...ux.intel.com, Martin.Runge@...de-schwarz.com,
Andreas.Brief@...de-schwarz.com
Subject: Re: Final: Add 32 bit VDSO time function support

On Thu, Feb 27, 2014 at 11:22 PM, Stefani Seibold <stefani@...bold.net> wrote:
> On Wednesday, 26.02.2014, 16:55 -0800, Andy Lutomirski wrote:
>>
>> Once I patch it to work, your 32-bit code is considerably faster than
>> the 64-bit case. It's enough faster that I suspect a bug. Dumping
>> the in-memory image shows some rather suspicious nops before the rdtsc
>> instruction. I suspect that you've forgotten to run the 32-bit vdso
>> through the alternatives code. This is a nasty bug: it will appear to
>> work, but you'll see non-monotonic times on some SMP systems.
>>
>
> I didn't know this. My basic test case is a KVM guest, which defaults
> to one CPU. Thanks for discovering the issue.

This leads to a potentially interesting question: is rdtsc_barrier()
actually necessary on UP? IIRC the point is that, if an
rdtsc_barrier(); rdtsc in one thread is "before" (in the sense of
being synchronized by some memory operation) an rdtsc_barrier(); rdtsc
in another thread, then the first rdtsc needs to return an earlier or
equal time to the second one.

I assume that no UP CPU is silly enough to execute two rdtsc
instructions out of order relative to each other in the absence of
barriers. So this is a non-issue on UP.
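
As a rough user-space illustration of the "barrier, then rdtsc" pattern
discussed above, something like the following could be used. The names
fenced_rdtsc() and count_backward_steps() are made up for this sketch and
are not kernel functions. Using lfence as the fence is an assumption (the
kernel selects the fence per CPU through the alternatives mechanism), and
non-x86 builds fall back to clock_gettime(CLOCK_MONOTONIC). It only covers
the single-thread case, where back-to-back readings should never go
backwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
/* Hypothetical user-space stand-in for "rdtsc_barrier(); rdtsc". */
static inline uint64_t fenced_rdtsc(void)
{
    uint32_t lo, hi;

    /* lfence (assumed fence) keeps rdtsc from running ahead of
     * earlier instructions; rdtsc returns the counter in EDX:EAX. */
    __asm__ __volatile__("lfence\n\trdtsc"
                         : "=a"(lo), "=d"(hi)
                         :
                         : "memory");
    return ((uint64_t)hi << 32) | lo;
}
#else
#include <time.h>

/* Fallback so the sketch builds on non-x86 targets; no TSC involved. */
static inline uint64_t fenced_rdtsc(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}
#endif

/* Count how often a reading is smaller than the one taken just before. */
static int count_backward_steps(int n)
{
    int backward = 0;
    uint64_t prev;

    assert(n > 0);
    prev = fenced_rdtsc();
    for (int i = 0; i < n; i++) {
        uint64_t now = fenced_rdtsc();

        if (now < prev)
            backward++;
        prev = now;
    }
    return backward;
}
```

On a machine with a stable TSC, count_backward_steps(1000) called from one
thread would be expected to return 0. The case that matters for the bug
above is the cross-CPU one, which this sketch does not exercise.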

On the other hand, suppose that some code does:

    volatile long x = *(something that's not in cache);
    clock_gettime(...);

I can imagine a modern CPU speculating far enough ahead that the rdtsc
happens *before* the cache miss. This won't cause visible
non-monotonicity as far as I can see, but it might annoy people who
try to benchmark their code.
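
If the barrier were ever dropped from the vDSO path, benchmark code could
in principle add its own fence between the load and the clock read. A
minimal sketch of that idea follows. The helpers load_fence(),
elapsed_ns() and time_load_ns() are hypothetical names, and the choice of
lfence on x86 is an assumption; whether it actually waits for the load to
complete depends on the CPU.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: keep later instructions from starting early. */
static inline void load_fence(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__("lfence" : : : "memory");
#else
    /* Portable fallback; orders memory accesses only. */
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

/* Nanoseconds from *a to *b. */
static int64_t elapsed_ns(const struct timespec *a, const struct timespec *b)
{
    return (int64_t)(b->tv_sec - a->tv_sec) * 1000000000LL
           + (b->tv_nsec - a->tv_nsec);
}

/* Time a single load from *p; the loaded value is stored in *value. */
static int64_t time_load_ns(const volatile long *p, long *value)
{
    struct timespec t0, t1;

    assert(p != NULL && value != NULL);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    *value = *p;    /* the load being measured, possibly a cache miss */
    load_fence();   /* do not let the second clock read start early */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return elapsed_ns(&t0, &t1);
}
```

A caller would pass a pointer to memory it expects to be cold, for example
time_load_ns(&cell, &seen), and treat the result as an upper bound that
also includes the cost of the two clock_gettime() calls.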

Note: actually making this change (skipping the barrier on UP) might be
a bit tricky. I don't know if the alternatives code is smart enough.

--Andy