Message-ID: <036fab04-0c1b-134d-a170-399b2dc6ab5f@redhat.com>
Date:   Fri, 22 Feb 2019 13:31:15 +0100
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     Thomas Gleixner <tglx@...utronix.de>, Olaf Hering <olaf@...fle.de>
Cc:     John Stultz <john.stultz@...aro.org>,
        Stephen Boyd <sboyd@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>, x86@...nel.org
Subject: Re: recalibrating x86 TSC during suspend/resume

On 22/02/19 12:44, Thomas Gleixner wrote:
>> The specific use case I have is a workload within VMs that makes heavy
>> use of TSC. The kernel is booted with 'clocksource=tsc highres=off nohz=off'
>> because only this clocksource gives enough granularity. The default
>> paravirtualized clock will return the same values via
>> clock_gettime(CLOCK_MONOTONIC) if the timespan between two calls is too
>> short. This does not happen with 'clocksource=tsc'.

This shouldn't happen.  clock_gettime(CLOCK_MONOTONIC) should be
monotonically increasing.  Do you have a test case?
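
A test case for this could look roughly like the sketch below. It is
only an illustration, not something from the original report: the
helper names (ts_to_ns, check_monotonic, struct mono_stats) are made
up here, and it simply reads CLOCK_MONOTONIC back to back and counts
how often two consecutive readings are equal or go backwards.

```c
#define _POSIX_C_SOURCE 200809L /* for clock_gettime() under strict -std= modes */
#include <stdint.h>
#include <time.h>

/* Convert a timespec to a single nanosecond count. */
static int64_t ts_to_ns(const struct timespec *ts)
{
    return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/* Hypothetical result record for the check below. */
struct mono_stats {
    uint64_t equal;     /* consecutive reads that returned the same value */
    uint64_t backwards; /* consecutive reads that went backwards */
};

/* Read CLOCK_MONOTONIC back to back and classify each pair of readings. */
static struct mono_stats check_monotonic(uint64_t iterations)
{
    struct mono_stats st = { 0, 0 };
    struct timespec ts;
    int64_t prev, now;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    prev = ts_to_ns(&ts);

    for (uint64_t i = 0; i < iterations; i++) {
        clock_gettime(CLOCK_MONOTONIC, &ts);
        now = ts_to_ns(&ts);
        if (now == prev)
            st.equal++;
        else if (now < prev)
            st.backwards++;
        prev = now;
    }
    return st;
}
```

Running check_monotonic() once with the default clocksource and once
with clocksource=tsc, and comparing the "equal" counts, would show
whether the reported behavior is reproducible.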

The KVM clocksource is high-resolution and also TSC-based; the
difference is that it performs two multiplications instead of one.  The
first uses TSC parameters from the host.  The second, which is the one
in arch/x86/entry/vdso/vclock_gettime.c's do_hres function, will have a
1:1 multiplier (excluding adjtime shearing) because kvmclock already
returns nanoseconds.
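
As a rough sketch of the two steps (not the actual kernel code): the
first function scales a TSC delta with the host-provided parameters,
the second applies the timekeeping mult/shift on top of a value that
is already in nanoseconds.  The struct and function names are
simplified for illustration, the field names follow the pvclock
layout as I remember it, and unsigned __int128 is a GCC/Clang
extension used here to keep the 64x32 multiply simple.

```c
#include <stdint.h>

/* Simplified, illustrative subset of the per-vCPU pvclock parameters. */
struct pv_params {
    uint64_t tsc_timestamp;     /* TSC value when system_time was sampled */
    uint64_t system_time;       /* nanoseconds at tsc_timestamp */
    uint32_t tsc_to_system_mul; /* 32.32 fixed-point ns-per-cycle factor */
    int8_t   tsc_shift;         /* pre-shift applied to the TSC delta */
};

/* First multiplication: TSC cycles -> nanoseconds, using host parameters. */
static uint64_t pv_scale_delta(uint64_t delta, uint32_t mul_frac, int8_t shift)
{
    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;
    return (uint64_t)(((unsigned __int128)delta * mul_frac) >> 32);
}

/* kvmclock reading in nanoseconds for a given raw TSC value. */
static uint64_t kvmclock_read_ns(const struct pv_params *p, uint64_t tsc)
{
    return p->system_time +
           pv_scale_delta(tsc - p->tsc_timestamp,
                          p->tsc_to_system_mul, p->tsc_shift);
}

/*
 * Second multiplication: the generic timekeeping conversion.  For a
 * clocksource that already counts nanoseconds, mult is approximately
 * 1 << shift, so this is (nearly) an identity.
 */
static uint64_t timekeeping_delta_ns(uint64_t cycles, uint64_t cycle_last,
                                     uint32_t mult, uint32_t shift)
{
    return ((cycles - cycle_last) * mult) >> shift;
}
```

For example, with a 2 GHz host TSC (tsc_to_system_mul = 0x80000000,
tsc_shift = 0), 2000 cycles become 1000 ns in the first step, and the
second step with mult = 1 << 8, shift = 8 leaves those 1000 ns as is.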

> Newer Intels support TSC scaling for VMX, which could solve the problem. It
> affects TSC readout by:
> 
> 	TSC = (read(HWTSC) * multiplier) >> 48
> 
> So you can standardize on a TSC frequency across a fleet. Not sure when
> that was introduced and no idea whether it's available on AMD.

It's Skylake (server parts only) or newer.  AMD instead has had it
(almost) forever.  QEMU 2.6 or newer will use it automatically across
live migration, if available.
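
To illustrate the readout formula quoted above, here is a minimal
sketch of how a multiplier for a given guest frequency could be
derived and applied.  The 48 fractional bits come from the formula
above; the function names and the kHz units are assumptions made for
the example, and unsigned __int128 is a GCC/Clang extension.

```c
#include <stdint.h>

#define TSC_SCALE_FRAC_BITS 48 /* fractional bits, per the formula above */

/* Multiplier that makes a host TSC at host_khz appear to run at guest_khz. */
static uint64_t tsc_multiplier(uint64_t guest_khz, uint64_t host_khz)
{
    return (uint64_t)(((unsigned __int128)guest_khz << TSC_SCALE_FRAC_BITS)
                      / host_khz);
}

/* TSC = (read(HWTSC) * multiplier) >> 48 */
static uint64_t scale_tsc(uint64_t hw_tsc, uint64_t multiplier)
{
    return (uint64_t)(((unsigned __int128)hw_tsc * multiplier)
                      >> TSC_SCALE_FRAC_BITS);
}
```

With a 4 GHz host and a 2 GHz guest, the multiplier is 1 << 47 and
every 1000 host cycles show up as 500 guest cycles.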

> For a software solution we could try the following:
> 
>  1) Provide the raw TSC frequency of the host to the guest in some magic
>     software defined MSR or CPUID. If there is an existing mechanism, use
>     that.

This shouldn't be needed for two reasons:

1) you could also use kvmclock's provided mult/shift

2) I am not convinced that kvmclock has the behavior that Olaf mentions,
and if it does it would be a bug.
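
For point 1, a sketch of how the TSC frequency can be recovered from
the mult/shift pair, assuming the usual pvclock convention that
nanoseconds = ((delta << shift) * mul) >> 32.  The function name is
made up for the example; the arithmetic mirrors what I recall the
kernel's pvclock code doing, so treat it as an approximation rather
than a reference.

```c
#include <stdint.h>

/*
 * Derive the TSC frequency in kHz from kvmclock's scaling parameters.
 * mul is a 32.32 fixed-point nanoseconds-per-cycle factor, so the
 * frequency is its reciprocal, corrected for the pre-shift.
 */
static uint64_t tsc_khz_from_pvclock(uint32_t tsc_to_system_mul,
                                     int8_t tsc_shift)
{
    uint64_t khz = 1000000ULL << 32;

    khz /= tsc_to_system_mul;
    if (tsc_shift < 0)
        khz <<= -tsc_shift;
    else
        khz >>= tsc_shift;
    return khz;
}
```

For instance, mul = 0x80000000 with shift = 0 means 0.5 ns per cycle,
which gives 2000000 kHz.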

Paolo
