Message-ID: <20160916145957.GF17296@potion>
Date:   Fri, 16 Sep 2016 16:59:58 +0200
From:   Radim Krčmář <rkrcmar@...hat.com>
To:     Paolo Bonzini <pbonzini@...hat.com>
Cc:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        dmatlack@...gle.com, luto@...nel.org, peterhornyack@...gle.com,
        x86@...nel.org
Subject: Re: [PATCH 2/2] x86, kvm: use kvmclock to compute TSC deadline value

2016-09-15 23:02+0200, Paolo Bonzini:
> On 15/09/2016 21:59, Radim Krčmář wrote:
>> 2016-09-15 18:00+0200, Paolo Bonzini:
>>>> When we are already going the paravirtual route, we could add an
>>>> interface that accepts the deadline in kvmclock nanoseconds.
>>>> It would be much more maintainable than adding a fragile paravirtual
>>>> layer on top of random interfaces.
>>>
>>> Good idea.
>> 
>> I'll prepare a prototype.
> 
> So how would this work?  A single MSR, used after setting TSC deadline
> mode in LVTT?  Could you write it and read TSC deadline or vice versa?

So far, adding KVM_MSR_DEADLINE (probably with a more descriptive name
in the end) that works only in LVTT mode seems reasonable.

I am tempted to add a second LVTT-like MSR to completely isolate it from
LAPIC timers, but sharing the VMX_PREEMPTION_TIMER would be needlessly
complicated.

>                Could you write it and read TSC deadline or vice versa?

KVM_MSR_DEADLINE would take kvmclock nanosecond values and
MSR_IA32_TSCDEADLINE would take TSC values.  KVM_MSR_DEADLINE would
follow rules similar to MSR_IA32_TSCDEADLINE -- the interrupt fires when
kvmclock reaches the value, you read what you write, and 0 disarms it.

If the TSC deadline timer were enabled, then the guest could write to
both MSR_IA32_TSCDEADLINE and KVM_MSR_DEADLINE, but only one could be
armed at any time (a non-zero write to one sets the other to 0).
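
To make the rules above concrete, here is a minimal sketch of the
semantics as a plain C model.  It is not KVM code: the struct and the
function names are made up for illustration, and KVM_MSR_DEADLINE is
only the proposed MSR, so treat every detail as an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the proposed semantics; not actual KVM code. */
struct deadline_msrs {
	uint64_t tsc;	/* MSR_IA32_TSCDEADLINE, in TSC ticks */
	uint64_t ns;	/* proposed KVM_MSR_DEADLINE, in kvmclock ns */
};

/* A non-zero write arms this MSR and disarms the other; 0 only disarms. */
void write_tsc_deadline(struct deadline_msrs *m, uint64_t val)
{
	m->tsc = val;
	if (val)
		m->ns = 0;
	assert(!(m->tsc && m->ns));	/* at most one is armed */
}

void write_ns_deadline(struct deadline_msrs *m, uint64_t val)
{
	m->ns = val;
	if (val)
		m->tsc = 0;
	assert(!(m->tsc && m->ns));	/* at most one is armed */
}

/* Returns 1 if the armed deadline was reached; that MSR then reads 0. */
int deadline_fired(struct deadline_msrs *m, uint64_t tsc_now, uint64_t ns_now)
{
	if (m->tsc && tsc_now >= m->tsc) {
		m->tsc = 0;
		return 1;
	}
	if (m->ns && ns_now >= m->ns) {
		m->ns = 0;
		return 1;
	}
	return 0;
}
```

Reading either MSR in this model is just reading the field: it holds
what was last written, or 0 once the timer fired or was disarmed.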

The dual interface would allow unconditional addition of the PV feature
without regressing users that currently use MSR_IA32_TSCDEADLINE and
have adapted their stack to handle KVM's TSC shortcomings ...

> My idea would be "yes" for writing nsec deadline and reading TSC
> deadline, but "no" for writing TSC deadline and reading nsec deadline.
> In the latter case, reading nsec deadline might return an impossible
> value such as -1;

Both MSRs would read what was written or 0 if fired/disarmed in between.
I'm not sure if I understood what you meant, though.

>                   this lets userspace decide whether to set a nsec-based
> deadline or a TSC-based deadline after migration.

Hm, isn't switching to a TSC-based deadline after migration pointless?
We don't have any migration notifiers, so the guest interface would have
to always check which interface to use.
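
A rough sketch of what "always check" would mean on the guest side,
assuming the choice cannot be cached across arming calls.  All names
here (pick_deadline, pv_available, the enum) are hypothetical, and the
ns-to-TSC conversion is the plain ticks = ns * kHz / 10^6 arithmetic,
not what Linux actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum deadline_iface { IFACE_TSC, IFACE_KVMCLOCK_NS };

/* Which MSR to write and the absolute deadline to write into it. */
struct deadline_write {
	enum deadline_iface iface;
	uint64_t value;
};

/*
 * Hypothetical helper: called on every arming, because without a
 * migration notifier the guest cannot know whether the answer changed.
 */
struct deadline_write pick_deadline(bool pv_available,
				    uint64_t tsc_now, uint64_t ns_now,
				    uint64_t delta_ns, uint64_t tsc_khz)
{
	struct deadline_write w;

	if (pv_available) {
		/* Deadline expressed directly in kvmclock nanoseconds. */
		w.iface = IFACE_KVMCLOCK_NS;
		w.value = ns_now + delta_ns;
	} else {
		/* Guard the multiplication below against overflow. */
		assert(tsc_khz == 0 || delta_ns <= UINT64_MAX / tsc_khz);
		/* ns -> TSC ticks: ticks = ns * kHz / 10^6 */
		w.iface = IFACE_TSC;
		w.value = tsc_now + delta_ns * tsc_khz / 1000000;
	}
	return w;
}
```

The extra branch on every arming is the cost of having no notifier; a
guest that only ever uses the nanosecond interface would not need it.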

>>>             This still wouldn't handle old hosts of course.
>> 
>> The question is whether we want to carry around 150 LOC because of old
>> hosts.  I'd just fix Linux to avoid deadline TSC without invariant TSC.
>> :)
> 
> Yes, that would automatically blacklist it on KVM.  You'd also need to
> update the recent optimization to the TSC deadline timer, to also work
> on other APIC timer modes or at least in your new PV mode.

Supporting all modes shouldn't be much harder than just the PV mode.

Thanks.
