Message-ID: <Z5GOFVFO6ocd1sli@google.com>
Date: Wed, 22 Jan 2025 16:32:21 -0800
From: Sean Christopherson <seanjc@...gle.com>
To: John Stultz <jstultz@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>, Frederic Weisbecker <fweisbec@...il.com>,
Andy Lutomirski <luto@...nel.org>, Borislav Petkov <bp@...e.de>, Jim Mattson <jmattson@...gle.com>,
"Alex Bennée" <alex.bennee@...aro.org>, Will Deacon <will@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
LKML <linux-kernel@...r.kernel.org>, kernel-team@...roid.com
Subject: Re: BUG: Occasional unexpected DR6 value seen with nested
virtualization on x86
On Wed, Jan 22, 2025, John Stultz wrote:
> On Wed, Jan 22, 2025 at 12:55 PM Sean Christopherson <seanjc@...gle.com> wrote:
> > On Tue, Jan 21, 2025, John Stultz wrote:
> > @@ -5043,6 +5041,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
> > .set_idt = svm_set_idt,
> > .get_gdt = svm_get_gdt,
> > .set_gdt = svm_set_gdt,
> > + .set_dr6 = svm_set_dr6,
>
>
> Just fyi, to get this to build (svm_set_dr6 takes a *svm not a *vcpu)
> I needed to create a little wrapper to get the types right:
>
> static void svm_set_dr6_vcpu(struct kvm_vcpu *vcpu, unsigned long value)
> {
> struct vcpu_svm *svm = to_svm(vcpu);
> svm_set_dr6(svm, value);
> }
Heh, yeah, I discovered as much when I tried to build with my more generic kconfig.
> But otherwise, this looks like it has fixed the issue! I've not been
> able to trip a failure with the bionic ptrace test, nor with the debug
> test in kvm-unit-tests, both running in loops for several minutes.
FWIW, I ran the testcase in L2 for ~45 minutes and saw one failure ~3 minutes in,
but unfortunately I didn't have any tracing running so I have zero insight into
what went wrong. I'm fairly certain the failure was due to running an unpatched
kernel in L1, i.e. that I hit the ultra-rare scenario where an L2=>L1 fastpath
exit between the #DB and read from DR6 clobbered hardware DR6.
For giggles and extra confidence, I hacked KVM to emulate HLT as a nop in the
fastpath, and verified failure (and the fix) in a non-nested setup with the below
selftest, on both AMD and Intel.
Sadly, KVM doesn't handle many exits in the fastpath on AMD, so having a regression
test that isn't Intel-specific isn't really possible at the moment. I'm mildly
tempted to use testing as an excuse to handle some CPUID emulation in the fastpath,
as Linux userspace does a _lot_ of CPUID, e.g. a kernel build generates tens of
thousands of CPUID exits.
Anyways, this all makes me confident in the fix. I'll post it properly tomorrow.
diff --git a/tools/testing/selftests/kvm/x86/debug_regs.c b/tools/testing/selftests/kvm/x86/debug_regs.c
index 2d814c1d1dc4..a34b65052f4e 100644
--- a/tools/testing/selftests/kvm/x86/debug_regs.c
+++ b/tools/testing/selftests/kvm/x86/debug_regs.c
@@ -22,11 +22,25 @@ extern unsigned char sw_bp, hw_bp, write_data, ss_start, bd_start;
static void guest_code(void)
{
+ unsigned long val = 0xffff0ffful;
+
/* Create a pending interrupt on current vCPU */
x2apic_enable();
x2apic_write_reg(APIC_ICR, APIC_DEST_SELF | APIC_INT_ASSERT |
APIC_DM_FIXED | IRQ_VECTOR);
+ /*
+ * Debug Register Interception tests.
+ */
+ asm volatile("mov %%rax, %%dr6\n\t"
+ "hlt\n\t"
+ "mov %%dr6, %%rax\n\t"
+ : "+a" (val));
+
+ __GUEST_ASSERT(val == 0xffff0ffful,
+ "Wanted DR6 = 0xffff0ffful, got %lx\n", val);
+ GUEST_SYNC(0);
+
/*
* Software BP tests.
*
@@ -103,6 +117,9 @@ int main(void)
vm = vm_create_with_one_vcpu(&vcpu, guest_code);
run = vcpu->run;
+ vcpu_run(vcpu);
+ TEST_ASSERT_EQ(get_ucall(vcpu, NULL), UCALL_SYNC);
+
/* Test software BPs - int3 */
memset(&debug, 0, sizeof(debug));
debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP;