linux-kernel - Re: [PATCH 0/2] KVM: SVM: Fix unexpected #UD on INT3 in SEV guests

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bc6a9c1f-d41e-ef81-3029-04c2938b300c@amd.com>
Date:   Wed, 23 Aug 2023 15:36:25 -0500
From:   Tom Lendacky <thomas.lendacky@....com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, Wu Zongyo <wuzongyo@...l.ustc.edu.cn>
Subject: Re: [PATCH 0/2] KVM: SVM: Fix unexpected #UD on INT3 in SEV guests

On 8/23/23 15:03, Sean Christopherson wrote:
> On Wed, Aug 23, 2023, Tom Lendacky wrote:
>> On 8/22/23 10:14, Sean Christopherson wrote:
>>> On Tue, Aug 22, 2023, Tom Lendacky wrote:
>>>> On 8/10/23 18:49, Sean Christopherson wrote:
>>>>> Fix a bug where KVM injects a bogus #UD for SEV guests when trying to skip
>>>>> an INT3 as part of re-injecting the associated #BP that got kinda sorta
>>>>> intercepted due to a #NPF occuring while vectoring/delivering the #BP.
>>>>>
>>>>> I haven't actually confirmed that patch 1 fixes the bug, as it's a
>>>>> different change than what I originally proposed.  I'm 99% certain it will
>>>>> work, but I definitely need verification that it fixes the problem
>>>>>
>>>>> Patch 2 is a tangentially related cleanup to make NRIPS a requirement for
>>>>> enabling SEV, e.g. so that we don't ever get "bug" reports of SEV guests
>>>>> not working when NRIPS is disabled.
>>>>>
>>>>> Sean Christopherson (2):
>>>>>      KVM: SVM: Don't inject #UD if KVM attempts emulation of SEV guest w/o
>>>>>        insn
>>>>>      KVM: SVM: Require nrips support for SEV guests (and beyond)
>>>>>
>>>>>     arch/x86/kvm/svm/sev.c |  2 +-
>>>>>     arch/x86/kvm/svm/svm.c | 37 ++++++++++++++++++++-----------------
>>>>>     arch/x86/kvm/svm/svm.h |  1 +
>>>>>     3 files changed, 22 insertions(+), 18 deletions(-)
>>>>
>>>> We ran some stress tests against a version of the kernel without this fix
>>>> and we're able to reproduce the issue, but not reliably, after a few hours.
>>>> With this patch, it has not reproduced after running for a week.
>>>>
>>>> Not as reliable a scenario as the original reporter, but this looks like it
>>>> resolves the issue.
>>>
>>> Thanks Tom!  I'll apply this for v6.6, that'll give us plenty of time to change
>>> course if necessary.
>>
>> I may have spoke to soon...  When the #UD was triggered it was here:
>>
>> [    0.118524] Spectre V2 : Enabling Restricted Speculation for firmware calls
>> [    0.118524] Spectre V2 : mitigation: Enabling conditional Indirect Branch Prediction Barrier
>> [    0.118524] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl
>> [    0.118524] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
>> [    0.118524] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.2.2-amdsos-build50-ubuntu-20.04+ #1
>> [    0.118524] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
>> [    0.118524] RIP: 0010:int3_selftest_ip+0x0/0x60
>> [    0.118524] Code: b9 25 05 00 00 48 c7 c2 e8 7c 80 b0 48 c7 c6 fe 1c d3 b0 48 c7 c7 f0 7d da b0 e8 4c 2c 0b ff e8 75 da 15 ff 0f 0b 48 8d 7d f4 <cc> 90 90 90 90 83 7d f4 01 74 2f 80 3d 39 7f a8 00 00 74 24 b9 34
>>
>>
>> Now (after about a week) we've encountered a hang here:
>>
>> [    0.106216] Spectre V2 : Enabling Restricted Speculation for firmware calls
>> [    0.106216] Spectre V2 : mitigation: Enabling conditional Indirect Branch Prediction Barrier
>> [    0.106216] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl
>>
>> It is in the very same spot and so I wonder if the return false (without
>> queuing a #UD) is causing an infinite loop here that appears as a guest
>> hang. Whereas, we have some systems running the first patch that you
>> created that have not hit this hang.
>>
>> But I'm not sure why or how this patch could cause the guest hang. I
>> would think that the retry of the instruction would resolve everything
>> and the guest would continue. Unfortunately, the guest was killed, so I'll
>> try to reproduce and get a dump or trace points of the VM to see what is
>> going on.
> 
> Gah, it's because x86_emulate_instruction() returns '1' and not '0' when
> svm_can_emulate_instruction() returns false.  svm_update_soft_interrupt_rip()
> would then continue with the injection, i.e. inject #BP on the INT3 RIP, not on
> the RIP following the INT3, which would cause this check to fail
> 
> 	if (regs->ip - INT3_INSN_SIZE != selftest)
> 		return NOTIFY_DONE;
> 
> and eventually send do_trap_no_signal() to die().

Thanks for figuring that out so quickly!

> 
> I distinctly remember seeing the return value problem when writing the patch, but
> missed that it would result in KVM injecting the unexpected #BP.
> 
> I punted on trying to properly fix this by having can_emulate_instruction()
> differentiate between "retry insn" and "inject exception", because that change
> is painfully invasive and I though I could get away with the simple fix.  Drat.
> 
> I think the best option is to add a "temporary" patch so that the fix for @stable
> is short, sweet, and safe, and then do the can_emulate_instruction() cleanup that
> I was avoiding.
> 
> E.g. this as patch 2/4 (or maybe 2/5) of this series:

2/4 or 2/5? Do you mean 2/3 since there were only 2 patches in the series?

I'll apply the below patch in between patches 1 and 2 and re-test. Should 
have results in a week :)

Thanks,
Tom

> 
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 7cb5ef5835c2..8457a36b44c1 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -364,6 +364,8 @@ static void svm_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
>                  svm->vmcb->control.int_state |= SVM_INTERRUPT_SHADOW_MASK;
>   
>   }
> +static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
> +                                       void *insn, int insn_len);
>   
>   static int __svm_skip_emulated_instruction(struct kvm_vcpu *vcpu,
>                                             bool commit_side_effects)
> @@ -384,6 +386,14 @@ static int __svm_skip_emulated_instruction(struct kvm_vcpu *vcpu,
>          }
>   
>          if (!svm->next_rip) {
> +               /*
> +                * FIXME: Drop this when kvm_emulate_instruction() does the
> +                * right thing and treats "can't emulate" as outright failure
> +                * for EMULTYPE_SKIP.
> +                */
> +               if (!svm_can_emulate_instruction(vcpu, EMULTYPE_SKIP, NULL, 0))
> +                       return 0;
> +
>                  if (unlikely(!commit_side_effects))
>                          old_rflags = svm->vmcb->save.rflags;