Date: Wed, 26 Feb 2014 16:55:27 -0800
From: Andy Lutomirski <luto@...capital.net>
To: Greg KH <gregkh@...uxfoundation.org>
Cc: Stefani Seibold <stefani@...bold.net>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	X86 ML <x86@...nel.org>, Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>, "H. Peter Anvin" <hpa@...or.com>,
	Andi Kleen <ak@...ux.intel.com>, Andrea Arcangeli <aarcange@...hat.com>,
	John Stultz <john.stultz@...aro.org>, Pavel Emelyanov <xemul@...allels.com>,
	Cyrill Gorcunov <gorcunov@...nvz.org>, andriy.shevchenko@...ux.intel.com,
	Martin.Runge@...de-schwarz.com, Andreas.Brief@...de-schwarz.com
Subject: Re: Final: Add 32 bit VDSO time function support

Um.  This code doesn't work.  I'll send a patch.  I can't speak towards
how well it compiles in different configurations.

Also, vdso_fallback_gettime needs .cfi annotations, I think.  I could
probably dredge the required incantations from somewhere, but someone
else may know how to do it.

Once I patch it to work, your 32-bit code is considerably faster than
the 64-bit case.  It's enough faster that I suspect a bug.  Dumping the
in-memory vdso shows some rather suspicious nops before the rdtsc
instruction.  I suspect that you've forgotten to run the 32-bit vdso
through the alternatives code.  This is a nasty bug: it will appear to
work, but you'll see non-monotonic times on some SMP systems.  (A sketch
of why the missing patching matters follows the numbers below.)

In my configuration, with your patches, I get (64-bit):

CLOCK_REALTIME:
100000000 loops in 2.07105s = 20.71 nsec / loop
100000000 loops in 2.06874s = 20.69 nsec / loop
100000000 loops in 2.29415s = 22.94 nsec / loop

CLOCK_MONOTONIC:
100000000 loops in 2.06526s = 20.65 nsec / loop
100000000 loops in 2.10134s = 21.01 nsec / loop
100000000 loops in 2.10615s = 21.06 nsec / loop

CLOCK_REALTIME_COARSE:
100000000 loops in 0.37440s = 3.74 nsec / loop
[  503.011756] perf samples too long (2550 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
100000000 loops in 0.37399s = 3.74 nsec / loop
100000000 loops in 0.38445s = 3.84 nsec / loop

CLOCK_MONOTONIC_COARSE:
100000000 loops in 0.40238s = 4.02 nsec / loop
100000000 loops in 0.40939s = 4.09 nsec / loop
100000000 loops in 0.41152s = 4.12 nsec / loop

Without the patches, I get:

CLOCK_REALTIME:
100000000 loops in 2.07348s = 20.73 nsec / loop
100000000 loops in 2.07346s = 20.73 nsec / loop
100000000 loops in 2.06922s = 20.69 nsec / loop

CLOCK_MONOTONIC:
100000000 loops in 1.98955s = 19.90 nsec / loop
100000000 loops in 1.98895s = 19.89 nsec / loop
100000000 loops in 1.98881s = 19.89 nsec / loop

CLOCK_REALTIME_COARSE:
100000000 loops in 0.37462s = 3.75 nsec / loop
100000000 loops in 0.37460s = 3.75 nsec / loop
100000000 loops in 0.37428s = 3.74 nsec / loop

CLOCK_MONOTONIC_COARSE:
100000000 loops in 0.40081s = 4.01 nsec / loop
100000000 loops in 0.39834s = 3.98 nsec / loop
[   36.706696] perf samples too long (2565 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
100000000 loops in 0.39949s = 3.99 nsec / loop

This looks like a wash, except for CLOCK_MONOTONIC, which got a bit slower.
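On the alternatives point above: the vdso's TSC read is preceded by a
barrier slot that is built as nops and only patched into a serializing
lfence/mfence when the image is run through apply_alternatives() at
boot.  A rough, simplified sketch of the pattern (not the kernel's
actual code; read_tsc_ordered() is a made-up name, the real vdso uses
rdtsc_barrier()):

/*
 * Simplified sketch, not the kernel's code.  In the real vdso the
 * barrier is rdtsc_barrier(): nops at build time, patched to
 * lfence/mfence by apply_alternatives() depending on the CPU.  If the
 * 32-bit vdso image never goes through that patching, the nops stay in
 * place, RDTSC can execute out of order with respect to earlier loads,
 * and the clock can appear to jump backwards across CPUs.
 */
static inline unsigned long long read_tsc_ordered(void)
{
	unsigned int lo, hi;

	asm volatile("lfence" ::: "memory");	/* what the patched slot becomes */
	asm volatile("rdtsc" : "=a" (lo), "=d" (hi));
	return ((unsigned long long)hi << 32) | lo;
}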
I'll send a followup patch, once the bugs are fixed, that improves the
timings to:

CLOCK_REALTIME:
100000000 loops in 2.08621s = 20.86 nsec / loop
100000000 loops in 2.07122s = 20.71 nsec / loop
100000000 loops in 2.07089s = 20.71 nsec / loop

CLOCK_MONOTONIC:
100000000 loops in 2.06831s = 20.68 nsec / loop
100000000 loops in 2.06862s = 20.69 nsec / loop
100000000 loops in 2.06195s = 20.62 nsec / loop

CLOCK_REALTIME_COARSE:
100000000 loops in 0.37274s = 3.73 nsec / loop
100000000 loops in 0.37247s = 3.72 nsec / loop
100000000 loops in 0.37234s = 3.72 nsec / loop

CLOCK_MONOTONIC_COARSE:
100000000 loops in 0.39944s = 3.99 nsec / loop
100000000 loops in 0.39940s = 3.99 nsec / loop
100000000 loops in 0.40054s = 4.01 nsec / loop

I'm not quite sure what causes the remaining loss.

Test code is here:

https://gitorious.org/linux-test-utils/linux-clock-tests
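For reference, the numbers above come from a loop along these lines (a
minimal re-sketch under assumptions, not the actual code in the linked
repository):

#include <stdio.h>
#include <time.h>

#define LOOPS 100000000

/* Wall-clock seconds, used only to time the loop itself. */
static double now_sec(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec + ts.tv_nsec * 1e-9;
}

int main(void)
{
	struct timespec ts;
	double start, elapsed;
	long i;

	start = now_sec();
	for (i = 0; i < LOOPS; i++)
		clock_gettime(CLOCK_REALTIME, &ts);	/* clock under test */
	elapsed = now_sec() - start;

	printf("%d loops in %.5fs = %.2f nsec / loop\n",
	       LOOPS, elapsed, elapsed * 1e9 / LOOPS);
	return 0;
}

(On older glibc this needs -lrt for clock_gettime.)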