Date:   Tue, 6 Jun 2023 08:23:52 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     "'H. Peter Anvin'" <hpa@...or.com>,
        'Thomas Gleixner' <tglx@...utronix.de>,
        Muhammad Usama Anjum <usama.anjum@...labora.com>,
        Jonathan Corbet <corbet@....net>,
        Ingo Molnar <mingo@...hat.com>,
        "Borislav Petkov" <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
        open list <linux-kernel@...r.kernel.org>,
        "Guilherme G. Piccoli" <gpiccoli@...lia.com>
CC:     Steven Noonan <steven@...inklabs.net>,
        "kernel@...labora.com" <kernel@...labora.com>
Subject: RE: Direct rdtsc call side-effect

From: H. Peter Anvin
> Sent: 05 June 2023 17:32
...
> The TSC is certainly not perfect; partly because, ironically enough, it
> was introduced just *before* out of order and power management entered
> the x86 world.

Another issue is that the crystal used for the cpu clock won't be
that accurate (in terms of ppm error rate), and will have significant
temperature drift.
OTOH the crystal in the traditional x86 motherboard 'clock' chip
is (meant to be) designed to have long term accuracy.
While reading the TSC is a lot faster, there ought to have been
some kind of PLL that continuously adjusts the measured TSC frequency
to keep it synchronised with the timer chip.
(Instead kernels end up writing the drifted TSC based time back to
the timer chip during shutdown.) 
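
As a rough illustration of the idea (a sketch only, not what any kernel
actually does): a software loop can nudge the TSC frequency estimate
towards what the reference clock reports at each comparison.
The names tsc_discipline, tsc_discipline_update and tsc_to_ns, the 1/8
gain and the use of floating point are all assumptions made for this
example; strictly it is a first-order frequency-tracking filter rather
than a full PLL.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state: current estimate of the TSC frequency in Hz. */
struct tsc_discipline {
    double est_hz;
};

/*
 * Compare the TSC ticks counted over an interval with the length of the
 * same interval as measured by the reference (timer chip) clock, and move
 * the estimate 1/8 of the way towards the observed frequency.
 */
void tsc_discipline_update(struct tsc_discipline *d,
                           uint64_t tsc_delta, uint64_t ref_delta_ns)
{
    assert(ref_delta_ns != 0);

    double observed_hz = (double)tsc_delta * 1e9 / (double)ref_delta_ns;
    double error_hz = observed_hz - d->est_hz;

    d->est_hz += error_hz * 0.125;   /* assumed loop gain */
}

/* Convert a TSC tick count to nanoseconds using the current estimate. */
double tsc_to_ns(const struct tsc_discipline *d, uint64_t ticks)
{
    return (double)ticks * 1e9 / d->est_hz;
}
```

A small gain averages out noise in individual comparisons at the cost
of following temperature drift more slowly.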

> It is no secret that it has been slow to catch up. It was easy to put a
> counter in; it is a *lot* harder to make it work in all the possible
> scenarios in the power-managed, out-of-order world.

That rather depends on what you mean by 'work' :-)

> It is one of my personal pet projects in the architecture work to push
> to get that last distance; we are not yet there.

For performance measurements, possibly what you want is a simple
clock counter read that is dependent on a register.
That has pretty much zero overhead, but is guaranteed to happen after
some other instruction without really affecting the pipeline.

IIRC the x86 performance counters aren't dependent on anything
so they tend to execute much earlier than you want.
OTOH rdtsc is likely to be synchronising and affect what follows.
ISTR using rdtsc to wait for instructions to complete and then
the performance clock counter to see how long it took.
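
A minimal userspace sketch of that kind of measurement, assuming GCC or
clang on x86-64. Intel documents RDTSC as not being a serialising
instruction, so the sketch brackets it with LFENCE; that is a common
pattern, not necessarily what was used in the case recalled above.
ordered_timestamp and timed_sum are hypothetical names, and on other
architectures the sketch falls back to clock_gettime().

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime() in the fallback */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>
#if defined(__x86_64__)
#include <x86intrin.h>            /* __rdtsc(), _mm_lfence() */
#endif

/* Read a timestamp that is ordered against the surrounding instructions. */
static inline uint64_t ordered_timestamp(void)
{
#if defined(__x86_64__)
    _mm_lfence();                 /* earlier instructions complete first */
    uint64_t t = __rdtsc();
    _mm_lfence();                 /* later instructions start after the read */
    return t;
#else
    /* Fallback (assumption): nanoseconds from the monotonic clock. */
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
#endif
}

/* Sum 0..n-1 and report how many timestamp units the loop took. */
uint64_t timed_sum(uint32_t n, uint64_t *elapsed)
{
    assert(elapsed != NULL);

    volatile uint64_t sum = 0;    /* volatile so the loop is not removed */
    uint64_t start = ordered_timestamp();
    for (uint32_t i = 0; i < n; i++)
        sum += i;
    uint64_t end = ordered_timestamp();

    *elapsed = end - start;
    return sum;
}
```

The fences add their own cost, so for very short sequences the overhead
of an empty measurement needs to be subtracted.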

	David

