Date:   Fri, 19 Oct 2018 11:39:23 -0700
From:   John Stultz <john.stultz@...aro.org>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Christopher Hall <christopher.s.hall@...el.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        linux-rt-users <linux-rt-users@...r.kernel.org>,
        jesus.sanchez-palencia@...el.com, gavin.hindman@...el.com,
        liam.r.girdwood@...el.com, Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: TSC to Mono-raw Drift

On Fri, Oct 19, 2018 at 11:34 AM, John Stultz <john.stultz@...aro.org> wrote:
> On Fri, Oct 19, 2018 at 8:25 AM, Thomas Gleixner <tglx@...utronix.de> wrote:
>> Christopher,
>>
>> Please Cc LKML on such issues in the future.
>>
>> On Mon, 15 Oct 2018, Christopher Hall wrote:
>>
>> Leaving context around for new readers:
>>
>>> Problem Statement:
>>>
>>> The TSC clocksource mult/shift values are derived from CPUID[15H], but the
>>> monotonic raw clock value is not equal to TSC in nominal nanoseconds, i.e.
>>> the timekeeping code is not accurately transforming TSC ticks to nominal
>>> nanoseconds based on CPUID[15H].
>>>
>>> The included code calculates the drift between nominal TSC nanoseconds and
>>> the monotonic raw clock.
>>>
>>> Background:
>>>
>>> Starting with 6th generation Intel CPUs, the TSC is "phase locked" to the
>>> Always Running Timer (ART). The relation between TSC and ART is read from
>>> CPUID[15H]. Details of the TSC-ART relation are in the "Invariant
>>> Timekeeping" section of the SDM.
>>>
>>> CPUID[15H].ECX returns the nominal frequency of ART (or crystal frequency).
>>> CPU feature TSC_KNOWN_FREQ indicates that tsc_khz (tsc.c) is derived from
>>> CPUID[15H]. The calculation is in tsc.c:native_calibrate_tsc().
>>>
>>> When the TSC clocksource is selected, the timekeeping code uses mult/shift
>>> values to transform TSC into nanoseconds. The mult/shift value is determined
>>> using tsc_khz.
>>>
>>> Example Output:
>>>
>>> Running for 3 seconds trial 1
>>> Scaled TSC delta: 3000328845
>>> Monotonic raw delta: 3000329117
>>> Ran for 3 seconds with 272 ns skew
>>>
>>> Running for 3 seconds trial 2
>>> Scaled TSC delta: 3000295209
>>> Monotonic raw delta: 3000295482
>>> Ran for 3 seconds with 273 ns skew
>>>
>>> Running for 3 seconds trial 3
>>> Scaled TSC delta: 3000262870
>>> Monotonic raw delta: 3000263142
>>> Ran for 3 seconds with 272 ns skew
>>>
>>> Running for 300 seconds trial 4
>>> Scaled TSC delta: 300000281725
>>> Monotonic raw delta: 300000308905
>>> Ran for 300 seconds with 27180 ns skew
>>>
>>> The skew between tsc and monotonic raw is about 91 PPB.
>>>
>>> System Information:
>>>
>>> CPU model string: Intel(R) Core(TM) i5-6600 CPU @ 3.30GHz
>>> Kernel version tested: 4.14.71-rt44
>>>       NOTE: The skew seems to be insensitive to kernel version after
>>>               introduction of TSC_KNOWN_FREQ capability
>>>
>>> From CPUID[15H]:
>>>       Time Stamp Counter/Core Crystal Clock Information (0x15):
>>>               TSC/clock ratio = 276/2
>>>               nominal core crystal clock = 24000000 Hz (table lookup)
>>>
>>> TSC kHz used to calculate mult/shift value: 3312000
>
> So, just to understand, you're saying the problem is that we calculate a
> tsc_khz value before calculating the mult/shift, and the intermediate
> step is losing some precision?
>
> Or is the cause from something else?
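
A quick check of that first question against the CPUID[15H] numbers quoted
above. This is only a sketch: it assumes tsc_khz is simply the crystal
frequency times the TSC/clock ratio, truncated to whole kHz, and the helper
name tsc_khz_from_cpuid is made up for illustration.

```python
from fractions import Fraction

CRYSTAL_HZ = 24_000_000        # nominal core crystal clock, from the report
RATIO_NUM, RATIO_DEN = 276, 2  # TSC/clock ratio, from the report


def tsc_khz_from_cpuid(crystal_hz, num, den):
    """Hypothetical helper: return (tsc_khz, lost_hz).

    tsc_khz is the TSC frequency truncated to whole kHz; lost_hz is the
    part of the exact frequency (in Hz) that the truncation discards.
    """
    tsc_hz = Fraction(crystal_hz * num, den)  # exact TSC frequency in Hz
    tsc_khz = int(tsc_hz // 1000)             # integer kHz, as an int
    lost_hz = tsc_hz - tsc_khz * 1000         # remainder dropped by truncation
    return tsc_khz, lost_hz


tsc_khz, lost_hz = tsc_khz_from_cpuid(CRYSTAL_HZ, RATIO_NUM, RATIO_DEN)
print("tsc_khz:", tsc_khz, "Hz lost to kHz truncation:", lost_hz)
```

For these values the product is a whole number of kHz (3312000, matching the
report), so under that assumption the kHz truncation by itself discards
nothing on this machine.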

The other potential cause here might just be that when we calculate
the mult/shift pair, we select a shift small enough to keep the
multiplication from overflowing over a long time interval. So there
is likely always some granularity error in converting to a mult/shift
pair.
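
To put rough numbers on that granularity error, here is a small sketch. It
assumes the conversion is ns = (cycles * mult) >> shift with mult rounded to
the nearest integer; the function names mult_for_shift and error_ppb are
illustrative, and which shift the kernel actually selects on this machine is
not stated in the thread, so the loop just tries several.

```python
from fractions import Fraction

TSC_KHZ = 3_312_000        # tsc_khz from the report (cycles per millisecond)
NSEC_PER_MSEC = 1_000_000  # nanoseconds per millisecond


def mult_for_shift(freq_khz, shift):
    """Integer mult, rounded to nearest, so that ns ~= (cycles * mult) >> shift."""
    return ((NSEC_PER_MSEC << shift) + freq_khz // 2) // freq_khz


def error_ppb(freq_khz, shift):
    """Relative error of the rounded mult versus the exact ratio, in PPB.

    Positive means the converted clock runs fast relative to nominal.
    """
    exact = Fraction(NSEC_PER_MSEC << shift, freq_khz)
    mult = mult_for_shift(freq_khz, shift)
    return float((mult - exact) / exact * 10**9)


# Skew measured in the report: 27180 ns over 300 s, i.e. ns per second = PPB.
observed_ppb = 27180 / 300

for shift in (20, 22, 24, 26, 28, 30, 32):
    print(shift, mult_for_shift(TSC_KHZ, shift),
          round(error_ppb(TSC_KHZ, shift), 1))
print("observed:", observed_ppb)
```

If the selected shift were 24 (an assumption, not something confirmed in this
message), the rounded mult would be 5065585 and the error about +90.6 PPB,
which is in line with the 27180 ns over 300 s in the report and has the same
sign (monotonic raw running ahead of the scaled TSC).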

thanks
-john
