Date:   Fri, 16 Sep 2016 17:24:44 +0200
From:   Radim Krčmář <rkrcmar@...hat.com>
To:     Paolo Bonzini <pbonzini@...hat.com>
Cc:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        dmatlack@...gle.com, luto@...nel.org, peterhornyack@...gle.com,
        x86@...nel.org
Subject: Re: [PATCH 2/2] x86, kvm: use kvmclock to compute TSC deadline value

2016-09-16 17:06+0200, Paolo Bonzini:
> On 16/09/2016 16:59, Radim Krčmář wrote:
>> KVM_MSR_DEADLINE would be an interface in kvmclock nanosecond values and
>> MSR_IA32_TSCDEADLINE in TSC values.  KVM_MSR_DEADLINE would follow
>> similar rules as MSR_IA32_TSCDEADLINE -- the interrupt fires when
>> kvmclock reaches the value, you read what you write, and 0 disarms it.
>> 
>> If the TSC deadline timer was enabled, then the guest could write to
>> both MSR_IA32_TSCDEADLINE and KVM_MSR_DEADLINE, but only one could be
>> armed at any time (non-zero write to one will set the other to 0).
>> 
>> The dual interface would allow unconditional addition of the PV feature
>> without regressing users that currently use MSR_IA32_TSCDEADLINE and
>> adapted their stack to handle KVM's TSC shortcomings ...
> 
> So far so good.  My question is: what happens if you write to
> KVM_MSR_DEADLINE and read from MSR_IA32_TSCDEADLINE, or vice versa?

(The second paragraph covered it ;])

> The possibilities are:
> 
> a) you read a 0

This one.

> b) you read the value converted to the other unit

Too much hassle. :)

> c) you read another value such as -1

Having a common "disarmed" value is nicer, and MSR_IA32_TSCDEADLINE
already uses 0.

> (a) and (c) are the simplest of course.  (c) may make sense when writing
> to MSR_IA32_TSCDEADLINE and reading from KVM_MSR_DEADLINE, since we can
> decide which values are valid or not; -1 is technically a valid TSC
> deadline.
> 
> I'm not sure about whether to allow (b).  In the end KVM is going to
> convert a nsec deadline to a TSC value internally, and vice versa.

It is not necessary to convert the nsec deadline to guest-TSC, only to
host-TSC in case the VMX_PREEMPTION_TIMER is used.
I would only have the host-TSC internal representation, which is not
exportable to the guest or migratable.
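
A rough sketch of that conversion, just to show how little is needed.
All names here are made up for illustration (this is not existing KVM
code), and it assumes the host TSC frequency is known in kHz and that
"now" is sampled as a matching (kvmclock ns, host TSC) pair:

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a nanosecond interval to TSC ticks.
 * ticks = delta_ns * tsc_khz / 1000000, split into whole milliseconds
 * and a remainder so the intermediate product stays small.
 */
static uint64_t ns_delta_to_tsc(uint64_t delta_ns, uint64_t tsc_khz)
{
	uint64_t ms  = delta_ns / 1000000u;
	uint64_t rem = delta_ns % 1000000u;

	return ms * tsc_khz + rem * tsc_khz / 1000000u;
}

/*
 * Hypothetical helper: turn a kvmclock nanosecond deadline into a
 * host-TSC deadline.  A deadline that is already in the past maps to
 * "now", i.e. the timer should fire immediately.
 */
static uint64_t ns_deadline_to_host_tsc(uint64_t deadline_ns,
					uint64_t now_ns,
					uint64_t now_host_tsc,
					uint64_t host_tsc_khz)
{
	if (deadline_ns <= now_ns)
		return now_host_tsc;

	return now_host_tsc +
	       ns_delta_to_tsc(deadline_ns - now_ns, host_tsc_khz);
}
```

The result depends on the host's TSC and its frequency, which is why it
would stay internal and never be exported or migrated.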

>                                                                     On
> the other hand, if we do, userspace needs to figure out (on migration)
> whether the guest set up a TSC or a nanosecond deadline.

Yeah, I think the solution described below (writing 0 doesn't disarm the
other one) is not bad.

>>>                   this lets userspace decide whether to set a nsec-based
>>> deadline or a TSC-based deadline after migration.
>> 
>> Hm, isn't switching to TSC-based deadline after migration pointless?
> 
> Yes, but I didn't mean that.  I meant preserving which MSR was written
> to arm the timer, and redoing the same on the destination.

Ah, I see.  Both MSRs read back the deadline that was written to them
(if they are armed), and at most one can be non-zero.
KVM will add MSR_IA32_TSCDEADLINE to the list of emulated MSRs, so
userspace will save/restore both deadline MSRs.  Zero writes will not
disarm the other timer, so the correct timer will be armed on the
destination.

No special logic to try to avoid TSC-related bugs.
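
To make the rules concrete, here is a minimal model of the semantics
described above.  The struct, enum and function names are hypothetical
and nothing like this exists in KVM as written; it also ignores timer
expiry and only models what reads and writes would return:

```c
#include <stdint.h>

/* Hypothetical per-vCPU state: at most one field is non-zero. */
struct deadline_state {
	uint64_t tsc_deadline;	/* MSR_IA32_TSCDEADLINE, in TSC ticks */
	uint64_t ns_deadline;	/* KVM_MSR_DEADLINE, in kvmclock ns */
};

enum deadline_msr { DEADLINE_MSR_TSC, DEADLINE_MSR_NS };

/*
 * A non-zero write arms this MSR and zeroes the other one.
 * A zero write disarms only the MSR being written.
 */
static void deadline_write(struct deadline_state *s,
			   enum deadline_msr msr, uint64_t val)
{
	uint64_t *self  = (msr == DEADLINE_MSR_TSC) ?
			  &s->tsc_deadline : &s->ns_deadline;
	uint64_t *other = (msr == DEADLINE_MSR_TSC) ?
			  &s->ns_deadline : &s->tsc_deadline;

	*self = val;
	if (val != 0)
		*other = 0;
}

/* Each MSR reads back what was written to it, or 0 if disarmed. */
static uint64_t deadline_read(const struct deadline_state *s,
			      enum deadline_msr msr)
{
	return (msr == DEADLINE_MSR_TSC) ?
	       s->tsc_deadline : s->ns_deadline;
}

/* Restore after migration: write back both saved values. */
static void deadline_restore(struct deadline_state *dst,
			     uint64_t saved_tsc, uint64_t saved_ns)
{
	deadline_write(dst, DEADLINE_MSR_TSC, saved_tsc);
	deadline_write(dst, DEADLINE_MSR_NS, saved_ns);
}
```

Because a zero write leaves the other MSR alone, the restore order does
not matter: whichever MSR was armed on the source ends up armed on the
destination.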

>>>>>             This still wouldn't handle old hosts of course.
>>>>
>>>> The question is whether we want to carry around 150 LOC because of old
>>>> hosts.  I'd just fix Linux to avoid deadline TSC without invariant TSC.
>>>> :)
>>>
>>> Yes, that would automatically blacklist it on KVM.  You'd also need to
>>> update the recent optimization to the TSC deadline timer, to also work
>>> on other APIC timer modes or at least in your new PV mode.
>> 
>> All modes shouldn't be much harder than just the PV mode.
> 
> The PV mode would still be a bit easier since it's still the TSC
> deadline timer just with a nicer interface that is not based on the TSC.
>  Depends on how you code it though, I guess.

Yeah, we'll see.  I am planning to carry around the deadline value in
nanoseconds (to avoid needless conversions), so it would have
requirements similar to those of the APIC timer.
