Date:   Fri, 31 Aug 2018 20:39:30 -0700
From:   Andy Lutomirski <luto@...nel.org>
To:     Matt Rickard <matt@...trans.com.au>,
        Florian Weimer <fweimer@...hat.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Andy Lutomirski <luto@...nel.org>,
        Stephen Boyd <sboyd@...nel.org>,
        John Stultz <john.stultz@...aro.org>, X86 ML <x86@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3] x86/vdso: Handle clock_gettime(CLOCK_TAI) in vDSO

(Hi, Florian!)

On Fri, Aug 31, 2018 at 6:59 PM, Matt Rickard <matt@...trans.com.au> wrote:
> Process clock_gettime(CLOCK_TAI) in vDSO.
> This makes the call about as fast as CLOCK_REALTIME and CLOCK_MONOTONIC:
>
>   nanoseconds
>  before after clockname
>    ---- ----- ---------
>     233    87 CLOCK_TAI
>      96    93 CLOCK_REALTIME
>      88    87 CLOCK_MONOTONIC

Are you sure you did this right?  With the clocksource set to TSC
(which is the only reasonable choice unless KVM has seriously cleaned
up its act), with retpolines enabled, I get 24ns for CLOCK_MONOTONIC
without your patch and 32ns with your patch.  And there is indeed a
retpoline in the disassembled output:

  e5:   e8 07 00 00 00          callq  f1 <__vdso_clock_gettime+0x31>
  ea:   f3 90                   pause
  ec:   0f ae e8                lfence
  ef:   eb f9                   jmp    ea <__vdso_clock_gettime+0x2a>
  f1:   48 89 04 24             mov    %rax,(%rsp)
  f5:   c3                      retq
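
For anyone who wants to reproduce numbers like the ones above, here is a minimal timing-loop sketch. It is not the benchmark either of us actually ran; ns_per_call() is a made-up helper name, and the loop only measures the warm-cache, back-to-back cost of the call.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* exposes clock_gettime() and CLOCK_TAI under strict -std= modes */
#endif
#include <time.h>

/*
 * Average cost in nanoseconds of one clock_gettime(clk, ...) call,
 * measured over `iters` back-to-back calls.  The loop itself is timed
 * with CLOCK_MONOTONIC.  Hypothetical helper, not part of the patch.
 */
static double ns_per_call(clockid_t clk, long iters)
{
    struct timespec start, end, scratch;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iters; i++)
        clock_gettime(clk, &scratch);
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed_ns = (double)(end.tv_sec - start.tv_sec) * 1e9
                      + (double)(end.tv_nsec - start.tv_nsec);
    return elapsed_ns / (double)iters;
}
```

Calling ns_per_call(CLOCK_TAI, 10000000) and ns_per_call(CLOCK_MONOTONIC, 10000000) on the same machine gives comparable figures. The absolute values depend on the clocksource, CPU frequency scaling, and whether retpolines are enabled, so numbers from different machines should not be compared directly.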

You're probably going to have to build with -fno-jump-tables, so that
the switch on the clock id doesn't become an indirect jump, or do
something clever like adding a whole array of (seconds, nsec) in gtod
and indexing that array by the clock id.
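
To make the array idea concrete, here is a rough user-space sketch. Everything prefixed sk_ or SK_ is hypothetical and does not match the real gtod layout; delta_ns stands in for the time elapsed since the base was last updated, which the real vDSO derives from the clocksource under a seqlock.

```c
#include <stdint.h>

#define SK_NSEC_PER_SEC 1000000000ULL

/* Linux clock ids (uapi values), redefined so the sketch stands alone. */
enum {
    SK_CLOCK_REALTIME  = 0,
    SK_CLOCK_MONOTONIC = 1,
    SK_CLOCK_TAI       = 11,
    SK_NR_CLOCKS       = 12,
};

/* Hypothetical per-clock base time, standing in for fields in gtod. */
struct sk_base {
    int64_t  sec;
    uint64_t nsec;
};

struct sk_timespec {
    int64_t sec;
    int64_t nsec;
};

/* Bitmask of clock ids served from the table; all others fall back. */
#define SK_FAST_CLOCKS ((1u << SK_CLOCK_REALTIME) | \
                        (1u << SK_CLOCK_MONOTONIC) | \
                        (1u << SK_CLOCK_TAI))

static struct sk_base sk_bases[SK_NR_CLOCKS];

/*
 * Returns 0 and fills *ts if clk is handled, or -1 to mean "fall back
 * to the syscall".  The clock id selects a table entry with a bounds
 * check and a mask test, so there is no switch for the compiler to
 * turn into a jump table.
 */
static int sk_gettime(unsigned int clk, uint64_t delta_ns,
                      struct sk_timespec *ts)
{
    if (clk >= SK_NR_CLOCKS || !((SK_FAST_CLOCKS >> clk) & 1u))
        return -1;

    uint64_t ns = sk_bases[clk].nsec + delta_ns;
    ts->sec  = sk_bases[clk].sec + (int64_t)(ns / SK_NSEC_PER_SEC);
    ts->nsec = (int64_t)(ns % SK_NSEC_PER_SEC);
    return 0;
}
```

The trade-off is a slightly larger data page (one base per clock id, including unused slots) in exchange for a code path with only direct, well-predicted branches.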

Meanwhile, I wrote the following trivial patch to add a
__vdso_clock_gettime_monotonic export.  It runs in 21ns, and I suspect
that the speedup is even a bit bigger when cache-cold because it
avoids some branches.  What do you all think?  Florian, do you think
glibc would be willing to add some magic to turn
clock_gettime(CLOCK_MONOTONIC, t) into
__vdso_clock_gettime_monotonic(t) when CLOCK_MONOTONIC is a constant?
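
One way the "magic" could look, purely as a sketch: a macro that uses the GCC/Clang builtin __builtin_constant_p to pick the specialized entry point at compile time. This is not glibc code. The cd_ names are hypothetical, and both callees are stand-ins that only count calls so the dispatch can be observed.

```c
/* Linux value of CLOCK_MONOTONIC, redefined so the sketch stands alone. */
#define CD_CLOCK_MONOTONIC 1

struct cd_timespec {
    long sec;
    long nsec;
};

static int cd_fast_calls;     /* calls routed to the specialized path */
static int cd_generic_calls;  /* calls routed to the generic path */

/* Stand-in for the proposed __vdso_clock_gettime_monotonic(t). */
static int cd_gettime_monotonic(struct cd_timespec *ts)
{
    cd_fast_calls++;
    ts->sec = 0;
    ts->nsec = 0;
    return 0;
}

/* Stand-in for the generic clock_gettime(clk, t). */
static int cd_gettime_generic(int clk, struct cd_timespec *ts)
{
    (void)clk;
    cd_generic_calls++;
    ts->sec = 0;
    ts->nsec = 0;
    return 0;
}

/*
 * If clk is a compile-time constant equal to CD_CLOCK_MONOTONIC, call
 * the specialized function; otherwise use the generic one.  clk is
 * evaluated at most once at run time, because __builtin_constant_p
 * does not evaluate its argument.
 */
#define cd_clock_gettime(clk, ts)                                   \
    ((__builtin_constant_p(clk) && (clk) == CD_CLOCK_MONOTONIC)     \
        ? cd_gettime_monotonic(ts)                                  \
        : cd_gettime_generic((clk), (ts)))
```

With this, cd_clock_gettime(CD_CLOCK_MONOTONIC, &ts) resolves to the specialized call, while a clock id held in a run-time variable still goes through the generic path. A macro is used rather than an inline function because __builtin_constant_p on a function parameter only yields 1 after inlining with optimization enabled.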

View attachment "vclock_gettime_monotonic.patch" of type "text/x-patch" (1107 bytes)
