Message-ID: <CAK8P3a2YuoMJ654sJtzE4mJN7wdd4o5JtY8W7c9QocZX8JP6cw@mail.gmail.com>
Date:   Mon, 25 Jun 2018 15:42:54 +0200
From:   Arnd Bergmann <arnd@...db.de>
To:     Andi Kleen <ak@...ux.intel.com>
Cc:     Jens Axboe <axboe@...nel.dk>, Jan Kara <jack@...e.cz>,
        Jeff Layton <jlayton@...hat.com>,
        "Darrick J. Wong" <darrick.wong@...cle.com>,
        y2038 Mailman List <y2038@...ts.linaro.org>,
        Brian Foster <bfoster@...hat.com>,
        Miklos Szeredi <miklos@...redi.hu>,
        Pavel Tatashin <pasha.tatashin@...cle.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux FS-devel Mailing List <linux-fsdevel@...r.kernel.org>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Andi Kleen <andi.kleen@...el.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Deepa Dinamani <deepa.kernel@...il.com>,
        Daniel Lezcano <daniel.lezcano@...aro.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        John Stultz <john.stultz@...aro.org>,
        Stephen Boyd <sboyd@...nel.org>
Subject: Re: [PATCH] vfs: replace current_kernel_time64 with ktime equivalent

On Wed, Jun 20, 2018 at 9:35 PM, Arnd Bergmann <arnd@...db.de> wrote:
> On Wed, Jun 20, 2018 at 6:19 PM, Andi Kleen <ak@...ux.intel.com> wrote:
>> Arnd Bergmann <arnd@...db.de> writes:
>>>
>>> To clarify: current_kernel_time() uses at most millisecond resolution rather
>>> than microsecond, as tkr_mono.xtime_nsec only gets updated during the
>>> timer tick.
>>
>> Ah you're right. I remember now: the motivation was to make sure there
>> is basically no overhead. In some setups the full gtod can be rather
>> slow, particularly if it falls back to some crappy timer.
>
> This means we're probably fine with a compile-time option that
> distros can choose to enable depending on what classes of hardware
> they are targeting, like
>
> struct timespec64 current_time(struct inode *inode)
> {
>         struct timespec64 now;
>         u64 gran = inode->i_sb->s_time_gran;
>
>         if (IS_ENABLED(CONFIG_HIRES_INODE_TIMES) &&
>             gran <= NSEC_PER_JIFFY)
>                 ktime_get_real_ts64(&now);
>         else
>                 ktime_get_coarse_real_ts64(&now);
>
>         return timespec64_trunc(now, gran);
> }
>
> With that implementation, we could still let file systems choose
> to get coarse timestamps by tuning the granularity in the
> superblock s_time_gran, which would result in nice round
> tv_nsec values that represent the actual accuracy.
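
As a rough userspace illustration of the truncation step mentioned
above, the sketch below rounds tv_nsec down to a multiple of the
granularity. trunc_to_gran() is a hypothetical helper written for this
example and is not the kernel's timespec64_trunc(); struct timespec
stands in for timespec64, and the accepted range of 1ns to 1s is an
assumption.

```c
/* Userspace sketch only: trunc_to_gran() is a hypothetical helper,
 * not the kernel's timespec64_trunc(). */
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

static struct timespec trunc_to_gran(struct timespec t, long gran_ns)
{
        /* assumed valid range: 1ns up to one full second */
        assert(gran_ns >= 1 && gran_ns <= NSEC_PER_SEC);

        /* drop the part of tv_nsec below the granularity; gran_ns == 1
         * leaves the value unchanged, gran_ns == NSEC_PER_SEC zeroes it */
        t.tv_nsec -= t.tv_nsec % gran_ns;
        return t;
}
```

With a granularity of 1000000 (1ms), a tv_nsec of 123456789 becomes
123000000, so the stored value no longer suggests more precision than
the file system actually keeps.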

I've done some simple tests and found that on a variety of
x86, arm32 and arm64 CPUs, it takes between 70 and 100
CPU cycles to read the TSC and add it to the coarse
clock, e.g. on a 3.1GHz Ryzen, using the little test program
below:

vdso hires:       37.18ns
vdso coarse:       6.44ns
sysc hires:      161.62ns
sysc coarse:     133.87ns

On the same machine, it takes around 400ns (1240 cycles)
to write one byte into a tmpfs file with pwrite(). Adding 5% to
10% overhead for accurate timestamps would definitely be
noticed, so I guess we wouldn't enable that unconditionally,
but could do it as an opt-in mount option if someone had a
use case.
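
The cycle and percentage figures above follow from the measured
numbers; a minimal sketch of that arithmetic is shown here. The helper
names ns_to_cycles() and overhead_percent() are made up for this
example, and the only inputs are the values quoted in this mail
(37.18ns, 6.44ns, 400ns, 3.1GHz).

```c
#include <assert.h>

/* convert a duration in nanoseconds to CPU cycles at a clock rate in GHz */
static double ns_to_cycles(double ns, double ghz)
{
        return ns * ghz;
}

/* extra cost expressed as a percentage of a baseline operation */
static double overhead_percent(double extra_ns, double base_ns)
{
        assert(base_ns > 0.0);
        return 100.0 * extra_ns / base_ns;
}
```

ns_to_cycles(37.18 - 6.44, 3.1) comes out at roughly 95 cycles for the
hires-vs-coarse difference in the vdso case, ns_to_cycles(400, 3.1)
gives the 1240 cycles for the pwrite(), and
overhead_percent(37.18 - 6.44, 400) is about 7.7%, which sits inside
the 5% to 10% range.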

       Arnd

---
/* measure times for high-resolution clocksource access from userspace */
#define _GNU_SOURCE     /* make sure glibc declares syscall() even with -std=c99 */
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <stdbool.h>
#include <sys/syscall.h>

static int do_clock_gettime(clockid_t clkid, struct timespec *tp, bool vdso)
{
        if (vdso)
                return clock_gettime(clkid, tp);

        return syscall(__NR_clock_gettime, clkid, tp);
}

static int loop1sec(clockid_t clkid, bool vdso)
{
        int i;
        struct timespec t, start;

        do_clock_gettime(clkid, &start, vdso);
        i = 0;
        do {
                do_clock_gettime(clkid, &t, vdso);
                i++;
        } while (t.tv_sec == start.tv_sec || t.tv_nsec < start.tv_nsec);

        return i;
}

int main(void)
{
        printf("vdso hires:     %7.2fns\n",
               1000000000.0 / loop1sec(CLOCK_REALTIME, true));
        printf("vdso coarse:    %7.2fns\n",
               1000000000.0 / loop1sec(CLOCK_REALTIME_COARSE, true));
        printf("sysc hires:     %7.2fns\n",
               1000000000.0 / loop1sec(CLOCK_REALTIME, false));
        printf("sysc coarse:    %7.2fns\n",
               1000000000.0 / loop1sec(CLOCK_REALTIME_COARSE, false));

        return 0;
}
