linux-kernel - Re: CONFIG_X86_HYPERVISOR (was: Re: [PATCH AUTOSEL 5.10 2/6] x86/cpu: Don't clear X86_FEATURE_LAHF_LM flag in init_amd_k8() on AMD when running in a virtual machine)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <aAwj_Tkqj4GtywDe@google.com>
Date: Fri, 25 Apr 2025 17:08:29 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Borislav Petkov <bp@...en8.de>
Cc: Pavel Machek <pavel@...x.de>, Sasha Levin <sashal@...nel.org>, linux-kernel@...r.kernel.org, 
	stable@...r.kernel.org, Max Grobecker <max@...becker.info>, Ingo Molnar <mingo@...nel.org>, 
	tglx@...utronix.de, mingo@...hat.com, dave.hansen@...ux.intel.com, 
	x86@...nel.org, thomas.lendacky@....com, perry.yuan@....com, 
	mario.limonciello@....com, riel@...riel.com, mjguzik@...il.com, 
	darwi@...utronix.de, Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: CONFIG_X86_HYPERVISOR (was: Re: [PATCH AUTOSEL 5.10 2/6] x86/cpu:
 Don't clear X86_FEATURE_LAHF_LM flag in init_amd_k8() on AMD when running in
 a virtual machine)

On Thu, Apr 24, 2025, Borislav Petkov wrote:
> On Thu, Apr 24, 2025 at 12:18:50PM -0700, Sean Christopherson wrote:
> > Not quite.  KVM supports all of those seamlessly, with some caveats.  E.g. if
> > host userspace and guest kernel are trying to use the same DRx, the guest will
> > "lose" and not get its #DBs.
> 
> Pff, so cloud providers have big fat signs over their workstations
> saying: you're not allowed to use breakpoints on production systems?

Heh, it's a bit more than a sign.

> With my silly thinking, I'd prefer to reglement this more explicitly and
> actually have the kernel enforce policy:

The kernel already can enforce policy.  Setting host breakpoints on guest code
is done through a dedicated ioctl(), and access to said ioctl() can be restricted
through various sandboxing methods, e.g. seccomp.

> HV userspace has higher prio with #DB or guests do. But the "losing" bit
> sounds weird and not nice.

Yeah, it's weird and not nice.  But if a human is interactive debugging a guest,
odds are very, very good that a missing breakpoint in the guest is not at all a
concern.

> > Definitely not.  All I was thinking was something like:
> > 
> > diff --git a/arch/x86/include/asm/debugreg.h b/arch/x86/include/asm/debugreg.h
> > index fdbbbfec745a..a218c5170ecd 100644
> > --- a/arch/x86/include/asm/debugreg.h
> > +++ b/arch/x86/include/asm/debugreg.h
> > @@ -121,7 +121,7 @@ static __always_inline unsigned long local_db_save(void)
> >  {
> >         unsigned long dr7;
> >  
> > -       if (static_cpu_has(X86_FEATURE_HYPERVISOR) && !hw_breakpoint_active())
> > +       if (static_cpu_has(X86_FEATURE_DRS_MAY_VMEXIT) && !hw_breakpoint_active())
> >                 return 0;
> >  
> >         get_debugreg(dr7, 7);
> > 
> > Where X86_FEATURE_DRS_MAY_VMEXIT is set if HYPERVISOR is detected, but then
> > cleared by SEV-ES+ and TDX guests with guaranteed access to DRs.  That said,
> > even that much infrastructure probably isn't worth the marginal benefits.
> 
> Btw you can replace that X86_FEATURE_DRS_MAY_VMEXIT with a cc_platform
> flag which gets properly set on all those coco guest types as those
> flags are exactly for that stuff.

No, that would defeat the purpose of the check.  The X86_FEATURE_HYPERVISOR has
nothing to do with correctness, it's all about performance.  Critically, it's a
static check that gets patched at runtime.  It's a micro-optimization for bare
metal to avoid a single cache miss (the __this_cpu_read(cpu_dr7)).  Routing
through cc_platform_has() would be far, far heavier than calling hw_breakpoint_active().

I pointed out the SEV-ES+/TDX cases because they likely would benefit from that
same micro-optimization, i.e. by avoiding the call to hw_breakpoint_active().