linux-kernel - Re: TSC to Mono-raw Drift

Message-ID: <CALAqxLUKOYYm=bzn4owZC_UgWTqrVPiz2TSnLFm1RNak3tm8-g@mail.gmail.com>
Date:   Tue, 23 Oct 2018 11:31:00 -0700
From:   John Stultz <john.stultz@...aro.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Christopher Hall <christopher.s.hall@...el.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        linux-rt-users <linux-rt-users@...r.kernel.org>,
        jesus.sanchez-palencia@...el.com,
        Gavin Hindman <gavin.hindman@...el.com>,
        liam.r.girdwood@...el.com, Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Miroslav Lichvar <mlichvar@...hat.com>
Subject: Re: TSC to Mono-raw Drift

On Fri, Oct 19, 2018 at 3:36 PM, John Stultz <john.stultz@...aro.org> wrote:
> On Fri, Oct 19, 2018 at 1:50 PM, Thomas Gleixner <tglx@...utronix.de> wrote:
>> John,
>>
>> On Fri, 19 Oct 2018, John Stultz wrote:
>>> On Fri, Oct 19, 2018 at 11:57 AM, Thomas Gleixner <tglx@...utronix.de> wrote:
>>> > I don't think you need complex oscillation for that. The error is constant
>>> > and small enough that it is a fractional nanoseconds thing with an interval
>>> > <= 1s. So you can just add that in a regular interval. Due to it being
>>> > small you can't observe time jumping I think.
>>>
>>> Well, from the examples the trouble is we seem to be a bit fast,
>>> rather than slow.
>>> So we'll have to reduce mult by one, and rework the calculations, but
>>> maybe something like this (correcting the raw_interval value) would
>>> work.
>>
>> Shouldn't be rocket science. It's a one off calculation of adjustment value
>> and maybe the period at which the correction happens.
>>
>>> But this also sort of breaks the fundamental argument that the raw clock
>>> is a simple mult/shift transformation of the underlying clocksource
>>> counter. It's not the accuracy of the clock but the consistency that
>>> was key.
>>>
>>> The counter argument is that the raw clock is abstracting the
>>> underlying hardware so folks who would have used the TSC directly can
>>> now use the raw clock and have a generic abstracted hardware-counter
>>> interface. So userland shouldn't really be worried about the
>>> occasional injections made since they shouldn't be trying to
>>> re-generate the abstraction from the hardware themselves.  <--
>>> Remember this point as we move to the next comment:)
>>>
>>> > The end result is 'correct' to the extent that it is correct in
>>> > relation to real nanoseconds. :)
>>> >
>>> >> I guess I'd want to understand more of the use here and the need to
>>> >> tie the raw clock back to the hardware counter it abstracts.
>>> >
>>> > The problem there is ART which is distributed to PCIe devices and ART time
>>> > stamps are exposed in various ways. ART has a fixed ratio vs. TSC so there
>>> > is a reasonable expectation that MONOTONIC_RAW is accurate.
>>>
>>> Which is maybe sort of my issue here. The raw clock provided an
>>> abstraction away from the hardware for generic usage, but then it's
>>> being re-used with other un-abstracted hardware references. So unless
>>> they use the same method of transformation, there will be problems (of
>>> varying degree).
>>
>> OTOH. If people use the CPUID provided frequency information and the TSC
>> from userspace then they get different results which is contrary to the
>> goal of providing them an abstracted way of doing it.
>
> But that's my point. If they are pulling time values from the hardware
> directly that's unabstracted. I'm not sure it's smart to be comparing
> the abstracted and unabstracted time stamps if you're worried about
> precision. They are sort of two separate (though similar) time
> domains.
>
>>> We might be able to reduce the degree in this case, but I worry the
>>> extra complexity may only cause problems for others.
>>
>> Is it really that complex to add a fixed correction value periodically?
>>
>> I don't think so and it should just work for any clocksource which is
>> exposed this way. Famous last words .....
>
> I'm not saying that the code is overly complex (at least compared to
> the rest of the timekeeping code :), but just how the accumulation is
> done is less trivial. So if someone else is trying to mimic the
> abstracted time with unabstracted hardware values (again, not
> something I recommend, but that's sort of the usage case pushing
> this), they need to use a similar method that is slightly more
> complicated (or use slower math). It's all subtle stuff, but this makes
> something that was relatively very simple (by design) a bit harder to
> explain.

Adding Miroslav as he's always thoughtful on these sorts of issues.

I spent a little bit of time thinking this out. Unfortunately I don't
think it's a simple matter of calculating the granularity error on the
raw clock and adding it in each interval. The other trouble spot is
that the adjusted clocks (monotonic/realtime) are adjusted off of that
raw clock. So they would need to have that error added as well,
otherwise the raw and an otherwise non-adjusted monotonic clock would
drift.

However, to be correct, the ntp adjustments would have to be applied
to the base interval plus the error, which mucks the math up a fair bit.
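
To make the granularity error concrete, here is a minimal user-space
sketch of the raw-clock part only (it does not attempt the ntp side).
It is not the kernel's timekeeping code: the helper names, the one-second
accumulation interval, and the frequency/shift values suggested in the
usage note are all assumptions chosen for illustration. It shows how a
rounded mult leaves a fixed per-interval error in shifted nanoseconds,
and how injecting that fixed amount each interval cancels it.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Rounded mult so that ns = (cycles * mult) >> shift. Hypothetical helper. */
static uint64_t hz_to_mult(uint64_t hz, unsigned int shift)
{
    assert(hz != 0 && shift < 32);
    return ((NSEC_PER_SEC << shift) + hz / 2) / hz;
}

/*
 * Error accumulated over one second of counter cycles, in units of
 * 2^-shift ns. Positive means the clock runs fast, negative means slow.
 * Assumes mult * hz fits in 64 bits.
 */
static int64_t error_per_sec_shifted(uint64_t hz, uint64_t mult,
                                     unsigned int shift)
{
    return (int64_t)(mult * hz) - (int64_t)(NSEC_PER_SEC << shift);
}

/*
 * Accumulate `seconds` one-second intervals. Whole nanoseconds are moved
 * out of the shifted accumulator after every interval and the fractional
 * remainder is carried over. If `correct` is non-zero, the fixed
 * per-interval error is subtracted before that step.
 */
static uint64_t accumulate_ns(uint64_t hz, uint64_t mult, unsigned int shift,
                              unsigned int seconds, int correct)
{
    const int64_t err = error_per_sec_shifted(hz, mult, shift);
    uint64_t ns = 0;
    uint64_t frac = 0; /* shifted ns, below one whole ns after each step */
    unsigned int i;

    for (i = 0; i < seconds; i++) {
        frac += mult * hz;                  /* one second worth of cycles */
        if (correct)
            frac = (uint64_t)((int64_t)frac - err);
        ns += frac >> shift;
        frac &= (1ULL << shift) - 1;
    }
    return ns;
}
```

For example, with an assumed 3 GHz counter and shift of 24,
accumulate_ns(hz, mult, 24, 100, 0) comes out short of 100 seconds while
the corrected variant lands exactly on it. With other frequencies the
rounding can go the other way and the uncorrected clock runs fast.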

Maybe Miroslav will have a better idea here, but otherwise I'll stew
on this a bit more and see what I can come up with.

thanks
-john