Date:   Wed, 14 Jun 2023 13:58:06 +1000
From:   Alexey Kardashevskiy <aik@....com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     kvm@...r.kernel.org, x86@...nel.org, linux-kernel@...r.kernel.org,
        Tom Lendacky <thomas.lendacky@....com>,
        Pankaj Gupta <pankaj.gupta@....com>,
        Nikunj A Dadhania <nikunj@....com>,
        Santosh Shukla <santosh.shukla@....com>,
        Carlos Bilbao <carlos.bilbao@....com>
Subject: Re: [PATCH kernel v5 5/6] KVM: SEV: Enable data breakpoints in SEV-ES

On 14/6/23 09:19, Sean Christopherson wrote:
> On Fri, Jun 02, 2023, Alexey Kardashevskiy wrote:
>> Sean, ping?
>>
>> I wonder if this sev-es-not-singlestepping is a showstopper or it is alright
>> to repost this patchset without it? Thanks,
> 
> Ah, shoot, I completely lost this in my inbox.  Sorry :-/

I saw the "OOO" message the other day and relaxed :)


>>>> Side topic, isn't there an existing bug regarding SEV-ES NMI windows?
>>>> KVM can't actually single-step an SEV-ES guest, but tries to set
>>>> RFLAGS.TF anyways.
>>>
>>> Why is it a "bug" and what does the patch fix? Sound to me as it is
>>> pointless and the guest won't do single stepping and instead will run
>>> till it exits somehow, what do I miss?
> 
> The bug is benign in the end, but it's still a bug.  I'm not worried about


(unrelated) Your response's encoding broke somehow and I wonder if this 
is something I did or you did. Lore got it too:

https://lore.kernel.org/all/ZIj5ms+DohcLyXHE@google.com/


> fixing any behavior, but I dislike having dead, misleading code, especially
> for something like this where both NMI virtualization and SEV-ES are already
> crazy complex and subtle.  I think it's safe to say that I've spent more time
> digging through SEV-ES and NMI virtualization than most KVM developers, and as
> evidenced by the number of things I got wrong below, I'm still struggling to
> keep track of the bigger picture.  Developers that are new to all of this need
> as much help as they can get.
> 
>>>> Blech, and suppressing EFER.SVME in efer_trap() is a bit gross,
>>>
>>> Why suppressed? svm_set_efer() sets it eventually anyway.
> 
> svm_set_efer() sets SVME in hardware, but KVM's view of the guest's value
> that's stored in vcpu->arch.efer doesn't have SVME set.  E.g. from the guest's
> perspective, EFER.SVME will have "Reserved Read As Zero" semantics.

It is not zero, why? From inside the guest, rdmsrl(MSR_EFER, efer) reads 
0x1d01 from that msr where 0x1000==(1<<_EFER_SVME),  _EFER_SVME==12.


> 
>>>> but I suppose since the GHCB doesn't allow for CLGI or STGI it's "fine".
>>>
>>> GHCB does not mention this, instead these are always intercepted in
>>> init_vmcb().
> 
> Right, I'm calling out that the absence of protocol support for requesting
> CLGI or STGI emulation means dropping the guest's EFER.SVME is ok (though
> gross :-) ).
> 
>>>> E.g. shouldn't KVM do this?
>>>
>>> It sure can and I am happy to include this into the series, the commit
>>> log is what I am struggling with :)
>>>
>>>>
>>>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>>>> index ca32389f3c36..4e4a49031efe 100644
>>>> --- a/arch/x86/kvm/svm/svm.c
>>>> +++ b/arch/x86/kvm/svm/svm.c
>>>> @@ -3784,6 +3784,16 @@ static void svm_enable_nmi_window(struct kvm_vcpu *vcpu)
>>>>         if (svm_get_nmi_mask(vcpu) && !svm->awaiting_iret_completion)
>>>>                 return; /* IRET will cause a vm exit */
>>>> +       /*
>>>> +        * KVM can't single-step SEV-ES guests and instead assumes that IRET
>>>> +        * in the guest will always succeed,
>>>
>>> It relies on GHCB's NMI_COMPLETE (which SVM then handles as if it was IRET):
>>>
>>>         case SVM_VMGEXIT_NMI_COMPLETE:
>>>                 ret = svm_invoke_exit_handler(vcpu, SVM_EXIT_IRET);
>>>                 break;
> 
> Ah, right, better to say that the guest is responsible for signaling that
> it's ready to accept NMIs, which KVM handles by "emulating" IRET.
> 
>>>> i.e. clears NMI masking on the
>>>> +        * next VM-Exit.  Note, GIF is guaranteed to be '1' for SEV-ES guests
>>>> +        * as the GHCB doesn't allow for CLGI or STGI (and KVM suppresses
>>>> +        * EFER.SVME for good measure, see efer_trap()).
>>>
>>> SVM KVM seems to not enforce EFER.SVME, the guest does what it wants and
>>> KVM is only told the new value via EFER_WRITE_TRAP. And "writes by
>>> SEV-ES guests to EFER.SVME are always ignored by hardware" says the APM.
> 
> Ahhh, that blurb in the APM is what I'm missing.
> 
> Actually, there's a real bug here.  KVM doesn't immediately unmask NMIs in
> response to NMI_COMPLETE, and instead goes through the whole
> awaiting_iret_completion => svm_complete_interrupts(), which means that KVM
> doesn't unmask NMIs until the *next* VM-Exit.  Theoretically, that could be
> never, e.g. if the host is tickless and the guest is configured to busy wait
> idle CPUs.
> 
> Attached patches are compile tested only.

Well, NMIs still get injected from QEMU so I guess it is a pass? Thanks,

-- 
Alexey
