linux-kernel - Re: [PATCH 5/8] KVM: SVM: Re-inject INT3/INTO instead of retrying the instruction

Message-ID: <eed1cea4-409a-f03e-5c31-e82d49bb2101@maciej.szmigiero.name>
Date:   Wed, 6 Apr 2022 15:13:35 +0200
From:   "Maciej S. Szmigiero" <mail@...iej.szmigiero.name>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Maxim Levitsky <mlevitsk@...hat.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 5/8] KVM: SVM: Re-inject INT3/INTO instead of retrying the
 instruction

On 6.04.2022 03:48, Sean Christopherson wrote:
> On Mon, Apr 04, 2022, Maciej S. Szmigiero wrote:
(..)
>> Also, I'm not sure that even the proposed updated code above will
>> actually restore the L1-requested next_rip correctly on L1 -> L2
>> re-injection (will review once the full version is available).
> 
> Spoiler alert, it doesn't.  Save yourself the review time.  :-)
> 
> The missing piece is stashing away the injected event on nested VMRUN.  Those
> events don't get routed through the normal interrupt/exception injection code and
> so the next_rip info is lost on the subsequent #NPF.
> 
> Treating soft interrupts/exceptions like they were injected by KVM (which they
> are, technically) works and doesn't seem too gross.  E.g. when prepping vmcb02
> 
> 	if (svm->nrips_enabled)
> 		vmcb02->control.next_rip    = svm->nested.ctl.next_rip;
> 	else if (boot_cpu_has(X86_FEATURE_NRIPS))
> 		vmcb02->control.next_rip    = vmcb12_rip;
> 
> 	if (is_evtinj_soft(vmcb02->control.event_inj)) {
> 		svm->soft_int_injected = true;
> 		svm->soft_int_csbase = svm->vmcb->save.cs.base;
> 		svm->soft_int_old_rip = vmcb12_rip;
> 		if (svm->nrips_enabled)
> 			svm->soft_int_next_rip = svm->nested.ctl.next_rip;
> 		else
> 			svm->soft_int_next_rip = vmcb12_rip;
> 	}
> 
> And then the VMRUN error path just needs to clear soft_int_injected.

I am also a fan of parsing EVENTINJ from VMCB12 into the relevant KVM
injection structures (much like EXITINTINFO is parsed), as I said to
Maxim two days ago [1].
This should apply not only to software {interrupts,exceptions} but to
all incoming events (again, just like EXITINTINFO).
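
As a rough illustration of what such parsing could look like, here is a
standalone userspace sketch, not KVM code. The bit layout follows the
EVENTINJ field as described in the AMD APM, and the SVM_EVTINJ_* names
are meant to mirror the ones in arch/x86/include/asm/svm.h. The
struct l1_event type and parse_eventinj() are hypothetical names made up
for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* EVENTINJ layout: vector in bits 7:0, type in bits 10:8,
 * error-code-valid in bit 11, valid in bit 31. */
#define SVM_EVTINJ_VEC_MASK    0xffu
#define SVM_EVTINJ_TYPE_SHIFT  8
#define SVM_EVTINJ_TYPE_MASK   (7u << SVM_EVTINJ_TYPE_SHIFT)
#define SVM_EVTINJ_TYPE_INTR   (0u << SVM_EVTINJ_TYPE_SHIFT)
#define SVM_EVTINJ_TYPE_NMI    (2u << SVM_EVTINJ_TYPE_SHIFT)
#define SVM_EVTINJ_TYPE_EXEPT  (3u << SVM_EVTINJ_TYPE_SHIFT)
#define SVM_EVTINJ_TYPE_SOFT   (4u << SVM_EVTINJ_TYPE_SHIFT)
#define SVM_EVTINJ_VALID_ERR   (1u << 11)
#define SVM_EVTINJ_VALID       (1u << 31)

/* Hypothetical decoded form of an event that L1 asked to inject into L2. */
struct l1_event {
	bool     valid;
	uint8_t  vector;
	uint32_t type;            /* one of SVM_EVTINJ_TYPE_* */
	bool     has_error_code;
	uint32_t error_code;
	bool     soft;            /* INTn, or a #BP/#OF exception (INT3/INTO) */
};

/* Decode the two 32-bit VMCB12 fields (event_inj, event_inj_err). */
static struct l1_event parse_eventinj(uint32_t event_inj, uint32_t event_inj_err)
{
	struct l1_event ev = {0};

	if (!(event_inj & SVM_EVTINJ_VALID))
		return ev;      /* nothing to inject */

	ev.valid = true;
	ev.vector = (uint8_t)(event_inj & SVM_EVTINJ_VEC_MASK);
	ev.type = event_inj & SVM_EVTINJ_TYPE_MASK;
	ev.has_error_code = (event_inj & SVM_EVTINJ_VALID_ERR) != 0;
	if (ev.has_error_code)
		ev.error_code = event_inj_err;

	/* Vectors 3 (#BP) and 4 (#OF) are the software exceptions. */
	ev.soft = ev.type == SVM_EVTINJ_TYPE_SOFT ||
		  (ev.type == SVM_EVTINJ_TYPE_EXEPT &&
		   (ev.vector == 3 || ev.vector == 4));
	return ev;
}
```

For example, parse_eventinj(SVM_EVTINJ_VALID | SVM_EVTINJ_TYPE_EXEPT | 3, 0)
describes an INT3 and is flagged as soft, while an NMI or a hardware IRQ
is decoded through the same path but is not.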

However, there is another issue related to L1 -> L2 event re-injection
using the standard KVM event injection mechanism: it mixes the L1
injection state with the L2 one.

Specifically for SVM:
* When re-injecting an NMI into L2, NMI blocking is enabled in
vcpu->arch.hflags (shared between L1 and L2) and the IRET intercept is
enabled.

This is incorrect, since it is L1 that is responsible for enforcing NMI
blocking for NMIs that it injects into its L2.
Also, *L2* being the target of such injection definitely should not block
further NMIs for *L1*.

* When re-injecting a *hardware* IRQ into L2, GIF is checked (previously
even at the BUG_ON() level), while L1 should be able to inject even when
L2 GIF is off.

With the code in my previous patch set I planned to use
exit_during_event_injection() to detect such a case, but if we implement
VMCB12 EVENTINJ parsing we can simply add a flag saying that the relevant
event comes from L1, so its normal injection side effects should be
skipped.
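
To make the flag idea concrete, here is a toy userspace model, not KVM
code. struct vcpu_model, inject_nmi() and can_inject_irq() are all
hypothetical names invented for this sketch. The model only captures the
two side effects listed above: NMI blocking plus the IRET intercept, and
the GIF check for hardware IRQs.

```c
#include <stdbool.h>

/* Toy stand-in for the injection state shared between L1 and L2. */
struct vcpu_model {
	bool     nmi_masked;      /* models NMI blocking in vcpu->arch.hflags */
	bool     iret_intercept;  /* models the IRET intercept being enabled */
	bool     gif;             /* models the L2 global interrupt flag */
	unsigned nmi_injected;    /* number of NMIs queued for injection */
};

/*
 * Queue an NMI. When the event originates from L1's EVENTINJ, L1 is
 * responsible for NMI blocking, so the side effects are skipped.
 */
static void inject_nmi(struct vcpu_model *v, bool from_l1)
{
	v->nmi_injected++;
	if (from_l1)
		return;

	v->nmi_masked = true;
	v->iret_intercept = true;
}

/* A hardware IRQ from L1 may be injected even when L2 GIF is off. */
static bool can_inject_irq(const struct vcpu_model *v, bool from_l1)
{
	return from_l1 || v->gif;
}
```

In this model an NMI that L1 re-injects into L2 leaves nmi_masked and
iret_intercept untouched, so further NMIs for L1 are not blocked.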

By the way, the relevant VMX code also looks rather suspicious,
especially for the !enable_vnmi case.

Thanks,
Maciej

[1]: https://lore.kernel.org/kvm/7d67bc6f-00ac-7c07-f6c2-c41b2f0d35a1@maciej.szmigiero.name/