[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fc82a8a7-af38-5037-1862-ba2315c4e5af@amd.com>
Date: Fri, 26 May 2023 13:16:32 +1000
From: Alexey Kardashevskiy <aik@....com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, x86@...nel.org, linux-kernel@...r.kernel.org,
Tom Lendacky <thomas.lendacky@....com>,
Pankaj Gupta <pankaj.gupta@....com>,
Nikunj A Dadhania <nikunj@....com>,
Santosh Shukla <santosh.shukla@....com>,
Carlos Bilbao <carlos.bilbao@....com>
Subject: Re: [PATCH kernel v5 5/6] KVM: SEV: Enable data breakpoints in SEV-ES
On 24/5/23 01:44, Sean Christopherson wrote:
> On Tue, May 23, 2023, Alexey Kardashevskiy wrote:
>>
>>
>> On 23/5/23 09:39, Sean Christopherson wrote:
>>> On Tue, Apr 11, 2023, Alexey Kardashevskiy wrote:
>>>> Prior to SEV-ES, KVM saved/restored host debug registers upon switching
>>>> to/from a VM. Changing those registers inside a running SEV VM
>>>> triggered #VMEXIT to KVM.
>>>
>>> Please, please don't make it sound like some behavior is *the* one and only behavior.
>>> *KVM* *chooses* to intercept debug register accesses. Though I would omit this
>>> paragraph (and largely the next) entirely, IMO it's safe to assume the reader has
>>> a basic understanding of how KVM deals with DRs and #DBs.
>>
>> Out of curiosity - is ARM similar in this regard, interceptions and stuff?
>
> Yes. The granularity of interceptions varies based on the architectural revision,
> and presumably there are things that always trap. But the core concept of letting
> software decide what to intercept is the same.
>
>>>> SEV-ES added encrypted state (ES) which uses an encrypted page
>>>> for the VM state (VMSA). The hardware saves/restores certain registers
>>>> on VMRUN/VMEXIT according to a swap type (A, B, C), see
>>>> "Table B-3. Swap Types" in the AMD Architecture Programmer’s Manual
>>>> volume 2.
>>>>
>>>> The DR6 and DR7 registers have always been swapped as Type A for SEV-ES
>>>
>>> Please rewrite this to just state what the behavior is instead of "Type A" vs.
>>> "Type B". Outside of AMD, the "type a/b/c" stuff isn't anywhere near ubiquitous
>>> enough to justify obfuscating super simple concepts.
>>>
>>> Actually, this feature really has nothing to do with Type A vs. Type B, since
>>> that's purely about host state.
>>
>> Mmm. Nothing? If the feature is enabled and DR[0-3] are not saved in HOSTSA,
>> then the host looses debug state because of the working feature.
>>
>>> I assume the switch from Type A to Type B is a
>>> side effect, or an opportunistic optimization?
>>
>> There is no switch. DR[67] were and are one type, and the other DRs were not
>> swapped and now are, but with a different Swap Type.
>
> Ah, this is what I missed.
>
>> And the patch saves DR[0-3] in HOSTSA but not DR[67] and this deserves some
>> explaining why is that.
>>
>>> Regardless, something like this is a lot easier for a human to read. It's easy
>>> enough to find DebugSwap in the APM (literally took me longer to find my copy of
>>> the PDF).
>>
>> It is also easy to find "Swap Types" in the APM...
>
> Redirecting readers to specs for gory details is fine. Redirecting for basic,
> simple functionality is not. Maybe the swap types will someday be common knowledge
> in KVM, and an explanation will no longer be necessary, but we are nowhere near
> that point.
>
>>> Add support for "DebugSwap for SEV-ES guests", which provides support
>>> for swapping DR[0-3] and DR[0-3]_ADDR_MASK on VMRUN and VMEXIT, i.e.
>>> allows KVM to expose debug capabilities to SEV-ES guests. Without
>>> DebugSwap support, the CPU doesn't save/load _guest_ debug registers,
>>
>> But it does save/load DR6 and DR7.
>>
>>> and KVM cannot manually context switch guest DRs due the VMSA being
>>> encrypted.
>>>
>>> Enable DebugSwap if and only if the CPU also supports NoNestedDataBp,
>>> which causes the CPU to ignore nested #DBs, i.e. #DBs that occur when
>>> vectoring a #DB.
>>
>> (english question) What does "vectoring" mean here precisely? Handling?
>> Processing?
>
> It's not really an English thing. "Vectoring" is, or at least was, Intel terminology
> for describing the flow from the initial detection of an exception to the exception's
> delivery to software, e.g. covers the IDT lookup, any GDT/LDT lookups, pushing
> information on the stack, fetching the software exception handler, etc. Importantly,
> it describes the period where there are no instructions retired and thus ucode doesn't
> open event windows, i.e. where triggering another, non-contributory exception will
> effectively "hang" the CPU (at least on CPUs without Intel's "notify" VM-Exit support).
>
>>> the host by putting the CPU into an infinite loop of vectoring #DBs
>>> (see https://bugzilla.redhat.com/show_bug.cgi?id=1278496)
>>
>> Good one, thanks for the link.
>>
>>>
>>> Set the features bit in sev_es_sync_vmsa() which is the last point
>>> when VMSA is not encrypted yet as sev_(es_)init_vmcb() (where the most
>>> init happens) is called not only when VCPU is initialized but also on
>>> intrahost migration when VMSA is encrypted.
>>>
>>>> guests, but a new feature is available, identified via
>>>> CPUID Fn8000001F_EAX[14] "DebugSwap for SEV-ES guests", that provides
>>>> support for swapping additional debug registers. DR[0-3] and
>>>> DR[0-3]_ADDR_MASK are swapped as Type B when SEV_FEATURES[5] (DebugSwap)
>>>> is set.
>>>>
>>>> Enable DebugSwap for a VMSA but only do so if CPUID Fn80000021_EAX[0]
>>>> ("NoNestedDataBp", "Processor ignores nested data breakpoints") is
>>>> supported by the SOC as otherwise a malicious SEV-ES guest can set up
>>>> data breakpoints on the #DB IDT entry/stack and cause an infinite loop.
>>>> Set the features bit in sev_es_sync_vmsa() which is the last point
>>>> when VMSA is not encrypted yet as sev_(es_)init_vmcb() (where the most
>>>> init happens) is called not only when VCPU is initialized but also on
>>>> intrahost migration when VMSA is encrypted.
>>>>
>>>> Eliminate DR7 and #DB intercepts as:
>>>> - they are not needed when DebugSwap is supported;
>>>
>>> "not needed" isn't sufficient justification. KVM doesn't strictly need to do a
>>> lot of things, but does them anyways for a variety of reasons. E.g. #DB intercept
>>> is also not needed when NoNestedDataBp is supported and vcpu->guest_debug==0, i.e.
>>> this should clarify why KVM doesn't simply disable #DB intercepts for _all_ VMs
>>> when NoNestedDataBp is support. Presumably the answer is "because it's simpler
>>> than toggling #DB interception for guest_debug.
>>
>> TBH I did not have a good answer but from the above I'd say we want to
>> disable #DB intercepts in a separate patch, just as you say below.
>>
>>> Actually, can't disabling #DB interception for DebugSwap SEV-ES guests be a
>>> separate patch? KVM can still inject #DBs for SEV-ES guests, no?
>>
>> Sorry for my ignorance but what is the point of injecting #DB if there is no
>> way of changing the guest's DR7?
>
> Well, _injecting_ the #DB is necessary for correctness from the guest's perspective.
> "What's the point of _intercepting_ #DB" is the real question. And for SEV-ES guests
> with DebugSwap, there is no point, which is why I agree that KVM should disable
> interception in that case. What I'm calling out is that disabling #Db interception
> isn't _necessary_ for correctness (unless I'm missing something), which means that
> it can and should go in a separate patch.
About this. Instead of sev_es_init_vmcb(), I can toggle the #DB
intercept when toggling guest_debug, see below. This
kvm_x86_ops::update_exception_bitmap hook is called on vcpu reset and
kvm_arch_vcpu_ioctl_set_guest_debug (which skips this call if
guest_state_protected = true). Is there any downside?
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index da28ed69bf4a..a7df5eb4ac00 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1951,9 +1951,15 @@ static void svm_update_exception_bitmap(struct
kvm_vcpu *vcpu)
clr_exception_intercept(svm, BP_VECTOR);
+ if (cpu_feature_enabled(X86_FEATURE_NO_NESTED_DATA_BP))
+ clr_exception_intercept(svm, DB_VECTOR);
+
if (vcpu->guest_debug & KVM_GUESTDBG_ENABLE) {
if (vcpu->guest_debug & KVM_GUESTDBG_USE_SW_BP)
set_exception_intercept(svm, BP_VECTOR);
+
+ if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP)
+ set_exception_intercept(svm, DB_VECTOR);
}
}
--
Alexey
Powered by blists - more mailing lists