[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <929C8485-D7F2-470A-B704-8068540DC358@linaro.org>
Date: Wed, 15 Jun 2022 09:55:04 -0700
From: Adhemerval Zanella <adhemerval.zanella@...aro.org>
To: Arnd Bergmann <arnd@...nel.org>
Cc: musl@...ts.openwall.com, John Stultz <jstultz@...gle.com>,
Stephen Boyd <sboyd@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [musl] Question about musl's time() implementation in time.c
> On 15 Jun 2022, at 05:09, Arnd Bergmann <arnd@...nel.org> wrote:
>
> On Wed, Jun 15, 2022 at 1:28 AM Rich Felker <dalias@...c.org> wrote:
>> On Tue, Jun 14, 2022 at 11:11:32PM +0200, Arnd Bergmann wrote:
>>>
>>> The thing is that a lot of file systems would still behave the same way
>>> because they round times down to a filesystem specific resolution,
>>> often one microsecond or one second, while the kernel time accounting
>>> is in nanoseconds. There have been discussions about an interface
>>> to find out what the actual resolution on a given mount point is (similar
>>> to clock_getres), but that never made it in. The guarantees that you
>>> get from file systems at the moment are:
>>
>> It's normal that they may be rounded down the the filesystem timestamp
>> granularity. I thought what was going on here was worse.
>
> It gets rounded down twice: first down to the start of the current
> timer tick, which is at an arbitrary nanosecond value in the past 10ms,
> and then to the resolution of the file system. The result is that the
> file timestamp can point to a slightly earlier value, up to max(timer tick
> cycle, fs resolution) before the actual nanosecond value. We don't
> advertise the granule of the file system though, so I would expect
> this to be within the expected behavior.
>
>> OK, the time syscall doing the wrong thing here (using a different
>> clock that's not correctly ordered with respect to CLOCK_REALTIME)
>> seems to be the worst problem here -- if I'm understanding it right.
>> The filesystem issue might be a non-issue if it's truly equivalent to
>> just having coarser fs timestamp granularity, which is allowed.
>
> Adding the kernel timekeeping maintainers to Cc. I think this is a
> reasonable argument, but it goes against the current behavior.
>
> We have four implementations of the time() syscall that one would
> commonly encounter:
>
> - The kernel syscall, using (effectively) CLOCK_REALTIME_COARSE
> - The kernel vdso, using (effectively) CLOCK_REALTIME_COARSE
> - The glibc interface, calling __clock_gettime64(CLOCK_REALTIME_COARSE, ...)
> - The musl interface, calling __clock_gettime64(CLOCK_REALTIME, ...)
>
> So even if everyone agrees that the musl implementation is the
> correct one, I think both linux and glibc are more likely to stick with
> the traditional behavior to avoid breaking user space code such as the
> libc-test case that Zev brought up initially. At least Adhemerval's
> time() implementation in glibc[1] appears to have done this intentionally,
> while the Linux implementation has simply never changed this in an
> incompatible way since Linux-0.01 added time() and 0.99.13k added
> the high-resolution gettimeofday().
>
> Arnd
>
> [1] https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=0d56378349
Indeed I have changed glibc to be consistent on all architectures to mimic kernel
behavior time syscall and avoid this very issue. We did not have a consistent
implementation before, so glibc varied depending of architecture and kernel
version whether it uses CLOCK_REALTIME or CLOCK_REALTIME_COARSE.
If kernel does change to make time() use CLOCK_REALTIME, it would make
sense to make glibc __clock_gettime64 to use it as well. We will also need to
either disable time vDSO usage on x86 and powerpc or make kernel implementation
to use CLOCK_REALTIME as well.
Powered by blists - more mailing lists