Date:   Wed, 22 Mar 2023 13:14:48 +0800
From:   Feng Tang <feng.tang@...el.com>
To:     "Paul E. McKenney" <paulmck@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Waiman Long <longman@...hat.com>
CC:     <linux-kernel@...r.kernel.org>
Subject: Re: A couple of TSC questions

Hi, Paul

On Tue, Mar 21, 2023 at 04:23:28PM -0700, Paul E. McKenney wrote:
> Hello, Feng!
> 
> I hope that things are going well for you and yours!

Thanks!

> First, given that the kernel can now kick out HPET instead of TSC in
> response to clock skew, does it make sense to permit recalibration of
> the still used TSC against the marked-unstable HPET?

Yes, it makes sense to me. I don't know the details of the case, but if
the TSC frequency comes from CPUID info, a recalibration against a
third-party HW timer like ACPI_PM should help here.

A further thought: if there really are quite a few cases where the
CPUID-provided TSC frequency info is not accurate, then we may need
to enable the recalibration by default, and print a warning message
when any mismatch is detected.
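
As a rough illustration of the recalibrate-and-warn idea, here is a
minimal sketch in Python (not kernel code). The function names and the
500 ppm tolerance are hypothetical choices made for the example; only
the ACPI PM timer frequency of 3.579545 MHz is a fixed value.

```python
# Sketch only: names and the tolerance below are assumptions for illustration.
ACPI_PM_HZ = 3_579_545  # nominal ACPI PM timer frequency in Hz


def tsc_khz_from_reference(tsc_delta, ref_delta, ref_hz=ACPI_PM_HZ):
    """Estimate the TSC frequency in kHz.

    tsc_delta and ref_delta are the tick counts of the TSC and of the
    reference timer taken over the same time interval.
    """
    if ref_delta <= 0:
        raise ValueError("reference timer delta must be positive")
    # elapsed seconds = ref_delta / ref_hz, so tsc_hz = tsc_delta * ref_hz / ref_delta
    return tsc_delta * ref_hz // (ref_delta * 1000)


def frequency_mismatch(cpuid_khz, measured_khz, tolerance_ppm=500):
    """Return True if the two values differ by more than tolerance_ppm (hypothetical threshold)."""
    diff = abs(cpuid_khz - measured_khz)
    return diff * 1_000_000 > cpuid_khz * tolerance_ppm


if __name__ == "__main__":
    # One second's worth of PM timer ticks against a nominal 2.1 GHz TSC.
    measured = tsc_khz_from_reference(2_100_000_000, ACPI_PM_HZ)
    if frequency_mismatch(2_100_000, measured):
        print("warning: CPUID TSC frequency differs from HW timer calibration")
```

A real check would pick its threshold from the expected accuracy of the
reference timer rather than from a fixed value like the one assumed here.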

> Second, we are very occasionally running into console messages like this:
> 
> Measured 2 cycles TSC warp between CPUs, turning off TSC clock.
> 
> This comes from check_tsc_sync_source() and indicates that one CPU's
> TSC read produced a later time than a later read from some other CPU.
> I am beginning to suspect that these can be caused by unscheduled delays
> in the TSC synchronization code, but figured I should ask you if you have
> ever seen these.  And of course, if so, what the usual causes might be.

I haven't seen this error myself or received similar reports. Usually it
should be easy to detect once it happens, as falling back to HPET
will trigger obvious performance degradation.
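
For reference, the warp condition described in the quoted message (an
earlier read on one CPU returning a larger value than a later read on
another CPU) can be sketched as follows. This is a simplified Python
model, not the kernel's implementation; the function name and the input
format are hypothetical, and the readings are assumed to be listed in
the order in which the reads were serialized.

```python
def max_tsc_warp(readings):
    """Return the largest backwards step seen across serialized TSC reads.

    readings: iterable of (cpu, tsc) pairs in the order the reads occurred.
    A result of 0 means no warp was observed.
    """
    last = None
    max_warp = 0
    for _cpu, now in readings:
        # A warp occurs when the previous read is ahead of the current one.
        if last is not None and last > now:
            max_warp = max(max_warp, last - now)
        last = now
    return max_warp


if __name__ == "__main__":
    # CPU 1 reads 98 after CPU 0 already read 100: a warp of 2 cycles.
    print(max_tsc_warp([(0, 100), (1, 98), (0, 105), (1, 106)]))
```

In this model, an unscheduled delay between taking a reading and
recording it would be enough to break the assumed ordering, which is one
way a small warp could be reported without the TSCs actually being out
of sync.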

Could you give more detail about when and how it happens, and the
HW info, such as how many sockets the platform has?

CC Thomas and Waiman, as they discussed a similar case here:
https://lore.kernel.org/lkml/87h76ew3sb.ffs@tglx/T/#md4d0a88fb708391654e78312ffa75b481690699f

Thanks,
Feng

> Thoughts?
> 
> 							Thanx, Paul
