Message-ID: <76901fa6-f26e-7920-4ab4-04129c6d7a2b@redhat.com>
Date: Thu, 10 Feb 2022 17:40:25 +0100
From: Paolo Bonzini <pbonzini@...hat.com>
To: Sasha Levin <sashal@...nel.org>, linux-kernel@...r.kernel.org,
stable@...r.kernel.org
Cc: Sean Christopherson <seanjc@...gle.com>,
David Woodhouse <dwmw2@...radead.org>,
Alexander Graf <graf@...zon.de>, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
x86@...nel.org, kvm@...r.kernel.org
Subject: Re: [PATCH MANUALSEL 5.15 7/8] KVM: VMX: Set vmcs.PENDING_DBG.BS on
#DB in STI/MOVSS blocking shadow
On 2/9/22 19:56, Sasha Levin wrote:
> From: Sean Christopherson <seanjc@...gle.com>
>
> [ Upstream commit b9bed78e2fa9571b7c983b20666efa0009030c71 ]
Acked-by: Paolo Bonzini <pbonzini@...hat.com>
Paolo
> Set vmcs.GUEST_PENDING_DBG_EXCEPTIONS.BS, a.k.a. the pending single-step
> breakpoint flag, when re-injecting a #DB with RFLAGS.TF=1, and STI or
> MOVSS blocking is active. Setting the flag is necessary to make VM-Entry
> consistency checks happy, as VMX has an invariant that if RFLAGS.TF is
> set and STI/MOVSS blocking is true, then the previous instruction must
> have been STI or MOV/POP, and therefore a single-step #DB must be pending
> since the RFLAGS.TF cannot have been set by the previous instruction,
> i.e. the one instruction delay after setting RFLAGS.TF must have already
> expired.
>
> Normally, the CPU sets vmcs.GUEST_PENDING_DBG_EXCEPTIONS.BS appropriately
> when recording guest state as part of a VM-Exit, but #DB VM-Exits
> intentionally do not treat the #DB as "guest state" as interception of
> the #DB effectively makes the #DB host-owned, thus KVM needs to manually
> set PENDING_DBG.BS when forwarding/re-injecting the #DB to the guest.
>
> Note, although this bug can be triggered by guest userspace, doing so
> requires IOPL=3, and guest userspace running with IOPL=3 has full access
> to all I/O ports (from the guest's perspective) and can crash/reboot the
> guest any number of ways. IOPL=3 is required because STI blocking kicks
> in if and only if RFLAGS.IF is toggled 0=>1, and if CPL>IOPL, STI either
> takes a #GP or modifies RFLAGS.VIF, not RFLAGS.IF.
>
> MOVSS blocking can be initiated by userspace, but can be coincident with
> a #DB if and only if DR7.GD=1 (General Detect enabled) and a MOV DR is
> executed in the MOVSS shadow. MOV DR #GPs at CPL>0, thus MOVSS blocking
> is problematic only for CPL0 (and only if the guest is crazy enough to
> access a DR in a MOVSS shadow). All other sources of #DBs are either
> suppressed by MOVSS blocking (single-step, code fetch, data, and I/O),
> are mutually exclusive with MOVSS blocking (T-bit task switch), or are
> already handled by KVM (ICEBP, a.k.a. INT1).
>
> This bug was originally found by running tests[1] created for XSA-308[2].
> Note that Xen's userspace test emits ICEBP in the MOVSS shadow, which is
> presumably why the Xen bug was deemed to be an exploitable DOS from guest
> userspace. KVM already handles ICEBP by skipping the ICEBP instruction
> and thus clears MOVSS blocking as a side effect of its "emulation".
>
> [1] http://xenbits.xenproject.org/docs/xtf/xsa-308_2main_8c_source.html
> [2] https://xenbits.xen.org/xsa/advisory-308.html
>
> Reported-by: David Woodhouse <dwmw2@...radead.org>
> Reported-by: Alexander Graf <graf@...zon.de>
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> Message-Id: <20220120000624.655815-1-seanjc@...gle.com>
> Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
> Signed-off-by: Sasha Levin <sashal@...nel.org>
> ---
> arch/x86/kvm/vmx/vmx.c | 25 +++++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 2ab0e997e39fa..44da933a756b3 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4791,8 +4791,33 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
> dr6 = vmx_get_exit_qual(vcpu);
> if (!(vcpu->guest_debug &
> (KVM_GUESTDBG_SINGLESTEP | KVM_GUESTDBG_USE_HW_BP))) {
> + /*
> + * If the #DB was due to ICEBP, a.k.a. INT1, skip the
> + * instruction. ICEBP generates a trap-like #DB, but
> + * despite its interception control being tied to #DB,
> + * is an instruction intercept, i.e. the VM-Exit occurs
> + * on the ICEBP itself. Note, skipping ICEBP also
> + * clears STI and MOVSS blocking.
> + *
> + * For all other #DBs, set vmcs.PENDING_DBG_EXCEPTIONS.BS
> + * if single-step is enabled in RFLAGS and STI or MOVSS
> + * blocking is active, as the CPU doesn't set the bit
> + * on VM-Exit due to #DB interception. VM-Entry has a
> + * consistency check that a single-step #DB is pending
> + * in this scenario as the previous instruction cannot
> + * have toggled RFLAGS.TF 0=>1 (because STI and POP/MOV
> + * don't modify RFLAGS), therefore the one instruction
> + * delay when activating single-step breakpoints must
> + * have already expired. Note, the CPU sets/clears BS
> + * as appropriate for all other VM-Exits types.
> + */
> if (is_icebp(intr_info))
> WARN_ON(!skip_emulated_instruction(vcpu));
> + else if ((vmx_get_rflags(vcpu) & X86_EFLAGS_TF) &&
> + (vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) &
> + (GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS)))
> + vmcs_writel(GUEST_PENDING_DBG_EXCEPTIONS,
> + vmcs_readl(GUEST_PENDING_DBG_EXCEPTIONS) | DR6_BS);
>
> kvm_queue_exception_p(vcpu, DB_VECTOR, dr6);
> return 1;
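
For readers following along, the condition added by the hunk above can be modelled outside the kernel as a small, standalone sketch. This is not the KVM code: the helper names needs_pending_bs() and fixup_pending_dbg() are hypothetical, the VMCS reads and writes are replaced by plain function arguments, and the bit values are restated here on the assumption that they match the kernel headers (RFLAGS.TF is bit 8, BS is bit 14, STI and MOV-SS blocking are bits 0 and 1 of the interruptibility state).

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit definitions restated for illustration; assumed to match the kernel's. */
#define X86_EFLAGS_TF           (UINT64_C(1) << 8)   /* RFLAGS.TF, single-step */
#define DR6_BS                  (UINT64_C(1) << 14)  /* single-step #DB pending */
#define GUEST_INTR_STATE_STI    0x1u                 /* STI blocking active */
#define GUEST_INTR_STATE_MOV_SS 0x2u                 /* MOV-SS blocking active */

/*
 * Hypothetical helper: true when VM-Entry would expect PENDING_DBG.BS to be
 * set, i.e. single-step is enabled and an STI or MOVSS shadow is active.
 */
static inline bool needs_pending_bs(uint64_t rflags, uint32_t intr_state)
{
	return (rflags & X86_EFLAGS_TF) &&
	       (intr_state & (GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS));
}

/*
 * Hypothetical helper: return the pending debug exceptions value to use when
 * re-injecting a non-ICEBP #DB. Existing bits are preserved; BS is OR-ed in
 * only when the condition above holds.
 */
static inline uint64_t fixup_pending_dbg(uint64_t rflags, uint32_t intr_state,
					 uint64_t pending_dbg)
{
	if (needs_pending_bs(rflags, intr_state))
		pending_dbg |= DR6_BS;
	return pending_dbg;
}
```

The sketch covers only the else-branch of the patch; the ICEBP case, where the instruction is skipped and the blocking state is cleared as a side effect, is not modelled.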