lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 13 Mar 2023 17:19:16 -0400
From:   "Liang, Kan" <kan.liang@...ux.intel.com>
To:     John Stultz <jstultz@...gle.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>, peterz@...radead.org,
        mingo@...hat.com, linux-kernel@...r.kernel.org, sboyd@...nel.org,
        eranian@...gle.com, namhyung@...nel.org, ak@...ux.intel.com,
        adrian.hunter@...el.com
Subject: Re: [RFC PATCH V2 2/9] perf: Extend ABI to support post-processing
 monotonic raw conversion



On 2023-03-11 12:55 a.m., John Stultz wrote:
> On Thu, Mar 9, 2023 at 8:56 AM Liang, Kan <kan.liang@...ux.intel.com> wrote:
>> On 2023-03-08 8:17 p.m., John Stultz wrote:
>>> So I spent a little bit of time today adding some trace_printks to the
>>> timekeeping code so I could record the actual TSC and timestamps being
>>> calculated from CLOCK_MONOTONIC_RAW.
>>>
>>> I did catch one error in the test code, which unfortunately I'm to blame for:
>>>   mid = start + (delta +(delta/2))/2; //round-closest
>>>
>>> That should be
>>>   mid = start + (delta +(2/2))/2  //round-closest
>>> or more simply
>>>   mid = start + (delta +1)/2; //round-closest
>>>
>>> Generalized rounding should be: (value + (DIV/2))/DIV), but I'm
>>> guessing with two as the divisor, my brain mixed it up and typed
>>> "delta". My apologies!
>>>
>>> With that fix, I'm seeing closer to ~500ns of error in the
>>> interpolation, just using the userland sampling.   Now, I've also
>>> disabled vsyscalls for this (otherwise I wouldn't be able to
>>> trace_printk), so the error likely would be higher than with
>>> vsyscalls.
>>>
>>> Now, part of the error is that:
>>>   start= rdtsc();
>>>   clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
>>>   end = rdtsc();
>>>
>>> Ends up looking like
>>>   start= rdtsc();
>>>   clock_gettime() {
>>>      now = rdtsc();
>>>      delta = now - last;
>>>      ns = (delta * mult) >> shift
>>> [~midpoint~]
>>>      ts->nsec = base_ns + ns;
>>>      ts->sec = base_sec;
>>>      normalize_ts(ts)
>>>   }
>>>   end = rdtsc();
>>>
>>> And so by taking the mid-point we're always a little skewed from where
>>> the tsc was actually read.  Looking at the data for my case the tsc
>>> read seems to be ~12% in, so you could instead try:
>>>
>>> delta = end - start;
>>> p12 = start + ((delta * 12) + (100/2))/100;
>>>
>>> With that adjustment, I'm seeing error around ~40ns.
>>>
>>> Mind giving that a try?
>>
>> I tried both the new mid and p12. The error becomes even larger.
>>
>> With new mid (start + (delta +1)/2), the error is now ~3800ns
>> With p12 adjustment, the error is ~6700ns.
>>
>>
>> Here is how I run the test.
>> $./time
>> $perf record -e cycles:upp --clockid monotonic_raw $some_workaround
>> $./time
>>
>> Here are some raw data.
>>
>> For the first ./time,
>> start: 961886196018
>> end: 961886215603
>> MONO_RAW: 341485848531
>>
>> For the second ./time,
>> start: 986870117783
>> end: 986870136152
>> MONO_RAW: 351495432044
>>
>> Here is the time generated from one PEBS record.
>> TSC: 968210217271
>> PEBS_MONO_RAW (calculated via kernel conversion information): 344019503072
>>
>> Using new mid (start + (delta +1)/2), the guessed PEBS_MONO_RAW is
>> 344019506897. The error is 3825ns.
>> Using p12 adjustment, the guessed PEBS_MONO_RAW is 344019509831.
>> The error is 6759ns
> 
> Huh. I dunno. That seems wild that the error increased.
> 
> Just in case something is going astray with the PEBS_MONO_RAW logic,
> can you apply the hack patch I was using to display the MONOTONIC_RAW
> values the kernel calculates?
>   https://github.com/johnstultz-work/linux-dev/commit/8d7896b078965b059ea5e8cc21841580557f6df6
> 
> It uses trace_printk, so you'll have to cat /sys/kernel/tracing/trace
> to get the output.
> 


$ ./time_3
start: 7358368893806 end: 7358368902944 delta: 9138
MONO_RAW: 2899739790738
MID: 7358368898375
P12: 7358368894903
$ sudo cat /sys/kernel/tracing/trace | grep time_3
          time_3-1443    [002] .....  2899.858936: ktime_get_raw_ts64:
JDB: timekeeping_get_delta cycle_now: 7358368897679
          time_3-1443    [002] .....  2899.858937: ktime_get_raw_ts64:
JDB: ktime_get_raw_ts64: 2899739790738

The error between MID and cycle_now is -696ns
The error between P12 and cycle_now is 2776ns

The time_3.c is attached.

Thanks,
Kan
View attachment "time_3.c" of type "text/plain" (847 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ