lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bedd5a84-34d9-a13b-4b0d-7188d6f8629f@hpe.com>
Date:   Fri, 29 Sep 2017 10:39:28 -0700
From:   Mike Travis <mike.travis@....com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Ingo Molnar <mingo@...hat.com>, "H. Peter Anvin" <hpa@...or.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Bin Gao <bin.gao@...ux.intel.com>,
        Prarit Bhargava <prarit@...hat.com>,
        Dimitri Sivanich <dimitri.sivanich@....com>,
        Andrew Banman <andrew.banman@....com>,
        Russ Anderson <russ.anderson@....com>,
        linux-kernel@...r.kernel.org, x86@...nel.org,
        kevin.b.stanton@...el.com
Subject: Re: [PATCH 0/4] x86/platform/UV: Update TSC support



On 9/29/2017 9:23 AM, Peter Zijlstra wrote:
> On Fri, Sep 29, 2017 at 08:19:22AM -0700, Mike Travis wrote:
>>> So I would still like to get clarification on how ART works (or likely
>>> doesn't) on your systems. I think for now its fairly prudent to kill
>>> detect_art() on UV.
>>
>> I tested with both detect_art enabled and disabled and didn't notice a
>> difference though I wasn't sure what test to run to verify whether it was
>> being used or not.  (I'd be glad to run some specific test if one can be
>> suggested?)  The num/denom setting for a 2100Mhz CPU was 168/2 if that
>> information helps?
> 
> While ART has a ratio to TSC, it too has an absolute relation to it.
> Given an ART time stamp we can compute a TSC value and vice versa, this
> allows correlating device timestamps (Network, Audio/Video etc..) with
> CPU time stamps.
> 
> Per detect_art() we have a single system wide offset, namely:
> 
>    rdmsrl(MSR_IA32_TSC_ADJUST, art_to_tsc_offset);
> 
> But you use TSC_ADJUST to sync between your cabinets, this cannot ever
> be right. The ART clock of the other cabinets (those that did not run
> detect_art) will have a different offset.
> 
> Currently there are only two device drivers that use ART:
> 
>    drivers/net/ethernet/intel/e1000e/ptp.c:        *system = convert_art_to_tsc(sys_cycles);
>    sound/pci/hda/hda_controller.c: *system = convert_art_to_tsc(tsc_counter);
> 
> Outside of that nobody cares, _for_now_.

I'm checking with the hardware/firmware designers but your mention of 
e1000e reminded me that I did see this but didn't quite connect the 
meaning.  If it's really a system wide constant, then yes we cannot 
provide a single value that would apply to all CPU's.

> 
> I'm not sure if there's a means for the CPU to read ART in order to test
> this correlation.
> 
> Intel SDM Vol 3B 17.17.4 speaks of 'K' with a footnote about TSC_ADJUST
> and the VMCS TSC fields. But basically both TSC and ART start at 0 on
> power on and given the frequency ratio 'K' is a known for native system
> agents.
> 
> Again, I would suggest killing detect_art() (and the setting of
> X86_FEATURE_ART) on UV systems until things are worked out. Also, given
> you have your own distributed clock, I'm thinking you use that on your
> own devices, obviating the immediate need for ART.
> 
>>> Also, while indeed not strictly required, that TSC_ADJUST==0 test on
>>> bootcpu is nice for consumer systems, BIOS did something 'weird' if that
>>> is not true. Is something like is_uv_system() available early enough?
>>
>> My previous version of the patches had me setting a flag that could be
>> checked by the tsc_sanitize_first_cpu() function and disable the requirement
>> of "TSC == 0 on socket 0" for any arch that specified it.
>> (And UV did set that flag.)
>>
>> But Thomas said it was "hackery" and that TSC being 0 on socket 0 was no
>> longer a requirement.  So I took it out for this version and made the "TSC
>> == 0 on socket 0" no longer the default for any arch.
> 
> That's where it comes from. But normal systems really _should_ have it
> at 0 and its a useful sanity check IMO. We really want to know when the
> BIOS does a funny behind our backs.
> 

How about a more generic flag, such as "multi_tsc_sync_sources"?  That 
could trigger both disabling the "TSC == 0 on socket 0" check as well as 
disabling X86_FEATURE_ART where appropriate?  Or I could clear the 
feature ART cap separately in the UV system init code if they are not 
really related?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ