Message-ID: <2e16b707-f020-22a3-a618-4960db917dfa@oracle.com>
Date: Fri, 6 Dec 2019 12:31:56 -0800
From: Ankur Arora <ankur.a.arora@...cle.com>
To: Vitaly Kuznetsov <vkuznets@...hat.com>
Cc: x86@...nel.org, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Sean Christopherson <sean.j.christopherson@...el.com>,
Jim Mattson <jmattson@...gle.com>,
Liran Alon <liran.alon@...cle.com>,
linux-kernel@...r.kernel.org, "H. Peter Anvin" <hpa@...or.com>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>,
Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org
Subject: Re: [PATCH RFC] KVM: x86: tell guests if the exposed SMT topology is
trustworthy

On 12/6/19 5:46 AM, Vitaly Kuznetsov wrote:
> Ankur Arora <ankur.a.arora@...cle.com> writes:
>
>> On 2019-11-05 3:56 p.m., Paolo Bonzini wrote:
>>> On 05/11/19 17:17, Vitaly Kuznetsov wrote:
>>>> There is also one additional piece of the information missing. A VM can be
>>>> sharing physical cores with other VMs (or other userspace tasks on the
>>>> host) so does KVM_FEATURE_TRUSTWORTHY_SMT imply that it's not the case or
>>>> not? It is unclear if this changes anything and can probably be left out
>>>> of scope (just don't do that).
>>>>
>>>> Similar to the already existent 'NoNonArchitecturalCoreSharing' Hyper-V
>>>> enlightenment, the default value of KVM_HINTS_TRUSTWORTHY_SMT is set to
>>>> !cpu_smt_possible(). KVM userspace is thus supposed to pass it to guest's
>>>> CPUIDs in case it is '1' (meaning no SMT on the host at all) or do some
>>>> extra work (like CPU pinning and exposing the correct topology) before
>>>> passing '1' to the guest.
>>>>
>>>> Signed-off-by: Vitaly Kuznetsov <vkuznets@...hat.com>
>>>> ---
>>>> Documentation/virt/kvm/cpuid.rst | 27 +++++++++++++++++++--------
>>>> arch/x86/include/uapi/asm/kvm_para.h | 2 ++
>>>> arch/x86/kvm/cpuid.c | 7 ++++++-
>>>> 3 files changed, 27 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/Documentation/virt/kvm/cpuid.rst b/Documentation/virt/kvm/cpuid.rst
>>>> index 01b081f6e7ea..64b94103fc90 100644
>>>> --- a/Documentation/virt/kvm/cpuid.rst
>>>> +++ b/Documentation/virt/kvm/cpuid.rst
>>>> @@ -86,6 +86,10 @@ KVM_FEATURE_PV_SCHED_YIELD 13 guest checks this feature bit
>>>> before using paravirtualized
>>>> sched yield.
>>>>
>>>> +KVM_FEATURE_TRUSTWORTHY_SMT 14 set when host supports 'SMT
>>>> + topology is trustworthy' hint
>>>> + (KVM_HINTS_TRUSTWORTHY_SMT).
>>>> +
>>>
>>> Instead of defining a one-off bit, can we make:
>>>
>>> ecx = the set of known "hints" (defaults to edx if zero)
>>>
>>> edx = the set of hints that apply to the virtual machine
>>>
>> Just to resurrect this thread, the guest could explicitly ACK
>> a KVM_FEATURE_DYNAMIC_HINT at init. This would allow the host
>> to change the hints whenever with the guest not needing to separately
>> ACK the changed hints.
>
> (I apologize for dropping the ball on this, I intend to do RFCv2 in
> the near future)
>
> Regarding this particular hint (let's call it 'no nonarchitectural
> coresharing' for now) I don't see much value in communicating a change to
> the guest when it happens. Imagine our host for some reason is not able to
> guarantee that anymore, e.g. we've migrated to a host with fewer pCPUs
> and/or special restrictions and have to start sharing. What are we, as a
> guest, supposed to do when we receive a notification? "You're now
> insecure, deal with it!" :-) Equally, I don't see much value in
> pre-acking such a change. "I'm fine with becoming insecure at some point".

True, for that use-case pre-ACK seems like exactly the thing you would
not want.

I do see some value in the guest receiving the notification though.
Maybe it could print a big fat printk or something :). Or, it could
change to a different security-policy-that-I-just-made-up.

> If we discuss other hints, however, such a 'pre-ACK' mechanism may make
> sense, but I'd make it an option to a 'challenge/response'
> protocol: if the host wants to change a hint it notifies the guest and waits
> for an ACK from it (e.g. a pair of MSRs + an interrupt). I, however,

My main reason for this 'pre-ACK' approach is some discomfort with
changing the CPUID edx from under the guest.

The MSR+interrupt approach would work as well but then we have the
same set of hints spread across CPUID and the MSR. What do you think
is the right handling for a guest that refuses to ACK the MSR?

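To make the question concrete, here is a minimal user-space sketch of
such a handshake. Everything in it is hypothetical: the struct, the
function names and the bit positions are made up for illustration and
do not exist in KVM. It also assumes one possible answer to the
question above, namely that the old hints simply stay in force until
the guest ACKs. That is an assumption, not something we have agreed on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define HINT_REALTIME        (1u << 0)
#define HINT_TRUSTWORTHY_SMT (1u << 1)

/* Hypothetical per-VM state; none of these names exist in KVM. */
struct hint_channel {
	uint32_t active;   /* hints the guest currently relies on (models CPUID edx) */
	uint32_t proposed; /* hints the host would like to switch to (models an MSR) */
	bool pending;      /* a change was announced and has not been ACKed yet */
};

/* Host side: announce a change. A real host would also raise an interrupt. */
static void host_propose(struct hint_channel *ch, uint32_t new_hints)
{
	ch->proposed = new_hints;
	ch->pending = true;
}

/*
 * Guest side: ACK by writing back the proposed value (models an ACK MSR).
 * Assumed policy: 'active' only changes on a matching ACK, so a guest that
 * never ACKs keeps seeing the old hints.
 */
static bool guest_ack(struct hint_channel *ch, uint32_t acked)
{
	if (!ch->pending || acked != ch->proposed)
		return false;
	ch->active = ch->proposed;
	ch->pending = false;
	return true;
}

/* Walk through one revoke cycle: the host drops the SMT hint. */
void hint_channel_selftest(void)
{
	struct hint_channel ch = {
		.active = HINT_REALTIME | HINT_TRUSTWORTHY_SMT,
	};

	host_propose(&ch, HINT_REALTIME);
	/* Until the guest ACKs, the old hints are still the ones in force. */
	assert(ch.active == (HINT_REALTIME | HINT_TRUSTWORTHY_SMT));
	assert(ch.pending);

	/* An ACK for a value the host did not propose is rejected. */
	assert(!guest_ack(&ch, HINT_TRUSTWORTHY_SMT));

	/* A matching ACK commits the change. */
	assert(guest_ack(&ch, HINT_REALTIME));
	assert(ch.active == HINT_REALTIME);
	assert(!ch.pending);
}
```

With this policy a guest that refuses to ACK is never surprised, but
the host is stuck advertising a hint it may no longer be able to
honour, which is exactly the case that needs a decision.
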
> have no good candidate from the existing hints which would require the guest
> to ACK (e.g. revoking PV EOI would probably do but why would we do that?)
> As I said before, a challenge/response protocol is needed if we'd like to
> make TSC frequency changes the way Hyper-V does it (required for updating
> guest TSC pages in the nested case) but this is less and less important with
> the appearance of TSC scaling. I'm still not sure if this is
> over-engineering or not. We can wait for the first good candidate to
> decide.

As we've discussed offlist, the particular hint I'm interested in is
KVM_HINT_REALTIME. That's not a particularly good candidate though
because there's no correctness problem if the host does switch it
off suddenly.

Ankur