lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID:
 <ecb314c53c76bc6d2233a8b4d783a15297198ef8.camel@cyberus-technology.de>
Date: Tue, 16 Apr 2024 15:08:36 +0000
From: Thomas Prescher <thomas.prescher@...erus-technology.de>
To: "seanjc@...gle.com" <seanjc@...gle.com>, Julian Stecklina
	<julian.stecklina@...erus-technology.de>
CC: "x86@...nel.org" <x86@...nel.org>, "dave.hansen@...ux.intel.com"
	<dave.hansen@...ux.intel.com>, "hpa@...or.com" <hpa@...or.com>,
	"mingo@...hat.com" <mingo@...hat.com>, "tglx@...utronix.de"
	<tglx@...utronix.de>, "bp@...en8.de" <bp@...en8.de>, "kvm@...r.kernel.org"
	<kvm@...r.kernel.org>, "pbonzini@...hat.com" <pbonzini@...hat.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/2] KVM: nVMX: fix CR4_READ_SHADOW when L0 updates CR4
 during a signal

Hi Sean,

On Tue, 2024-04-16 at 07:35 -0700, Sean Christopherson wrote:
> On Tue, Apr 16, 2024, Julian Stecklina wrote:
> > From: Thomas Prescher <thomas.prescher@...erus-technology.de>
> > 
> > This issue occurs when the kernel is interrupted by a signal while
> > running a L2 guest. If the signal is meant to be delivered to the
> > L0
> > VMM, and L0 updates CR4 for L1, i.e. when the VMM sets
> > KVM_SYNC_X86_SREGS in kvm_run->kvm_dirty_regs, the kernel programs
> > an
> > incorrect read shadow value for L2's CR4.
> > 
> > The result is that the guest can read a value for CR4 where bits
> > from
> > L1 have leaked into L2.
> 
> No, this is a userspace bug.  If L2 is active when userspace stuffs
> register state,
> then from KVM's perspective the incoming value is L2's value.  E.g.
> if userspace
> *wants* to update L2 CR4 for whatever reason, this patch would result
> in L2 getting
> a stale value, i.e. the value of CR4 at the time of VM-Enter.
> 
> And even if userspace wants to change L1, this patch is wrong, as KVM
> is writing
> vmcs02.GUEST_CR4, i.e. is clobbering the L2 CR4 that was programmed
> by L1, *and*
> is dropping the CR4 value that userspace wanted to stuff for L1.
> 
> To fix this, your userspace needs to either wait until L2 isn't
> active, or force
> the vCPU out of L2 (which isn't easy, but it's doable if absolutely
> necessary).

What you say makes sense. Is there any way for
userspace to detect whether L2 is currently active after
returning from KVM_RUN? I couldn't find anything in the official
documentation https://docs.kernel.org/virt/kvm/api.html

Can you point me into the right direction?

> 
> Pulling in a snippet from the initial bug report[*],
> 
>  : The reason why this triggers in VirtualBox and not in Qemu is that
> there are
>  : cases where VirtualBox marks CR4 dirty even though it hasn't
> changed.
> 
> simply not trying to stuff register state dirty when L2 is active
> sounds like it
> would resolve the issue.
> 
> https://lore.kernel.org/all/af2ede328efee9dc3761333bd47648ee6f752686.camel@cyberus-technology.de
> 
> > We found this issue by running uXen [1] as L2 in VirtualBox/KVM
> > [2].
> > The issue can also easily be reproduced in Qemu/KVM if we force a
> > sreg
> > sync on each call to KVM_RUN [3]. The issue can also be reproduced
> > by
> > running a L2 Windows 10. In the Windows case, CR4.VMXE leaks from
> > L1
> > to L2 causing the OS to blue-screen with a kernel thread exception
> > during TLB invalidation where the following code sequence triggers
> > the
> > issue:
> > 
> > mov rax, cr4 <--- L2 reads CR4 with contents from L1
> > mov rcx, cr4
> > btc 0x7, rax <--- L2 toggles CR4.PGE
> > mov cr4, rax <--- #GP because L2 writes CR4 with reserved bits set
> > mov cr4, rcx
> > 
> > The existing code seems to fixup CR4_READ_SHADOW after calling
> > vmx_set_cr4 except in __set_sregs_common. While we could fix it
> > there
> > as well, it's easier to just handle it centrally.
> > 
> > There might be a similar issue with CR0.
> > 
> > [1] https://github.com/OpenXT/uxen
> > [2] https://github.com/cyberus-technology/virtualbox-kvm
> > [3]
> > https://github.com/tpressure/qemu/commit/d64c9d5e76f3f3b747bea7653d677bd61e13aafe
> > 
> > Signed-off-by: Julian Stecklina
> > <julian.stecklina@...erus-technology.de>
> > Signed-off-by: Thomas Prescher
> > <thomas.prescher@...erus-technology.de>
> 
> SoB is reversed, yours should come after Thomas'.
> 
> > ---
> >  arch/x86/kvm/vmx/vmx.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 6780313914f8..0d4af00245f3 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -3474,7 +3474,11 @@ void vmx_set_cr4(struct kvm_vcpu *vcpu,
> > unsigned long cr4)
> >  			hw_cr4 &= ~(X86_CR4_SMEP | X86_CR4_SMAP |
> > X86_CR4_PKE);
> >  	}
> >  
> > -	vmcs_writel(CR4_READ_SHADOW, cr4);
> > +	if (is_guest_mode(vcpu))
> > +		vmcs_writel(CR4_READ_SHADOW,
> > nested_read_cr4(get_vmcs12(vcpu)));
> > +	else
> > +		vmcs_writel(CR4_READ_SHADOW, cr4);
> > +
> >  	vmcs_writel(GUEST_CR4, hw_cr4);
> >  
> >  	if ((cr4 ^ old_cr4) & (X86_CR4_OSXSAVE | X86_CR4_PKE))
> > -- 
> > 2.43.2
> > 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ