lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID:
 <SN6PR02MB4157194E753702D204C20D09D4862@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Sun, 27 Apr 2025 03:58:54 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Michael Kelley <mhklinux@...look.com>, Peter Zijlstra
	<peterz@...radead.org>
CC: "x86@...nel.org" <x86@...nel.org>, "kys@...rosoft.com"
	<kys@...rosoft.com>, "haiyangz@...rosoft.com" <haiyangz@...rosoft.com>,
	"wei.liu@...nel.org" <wei.liu@...nel.org>, "decui@...rosoft.com"
	<decui@...rosoft.com>, "tglx@...utronix.de" <tglx@...utronix.de>,
	"mingo@...hat.com" <mingo@...hat.com>, "bp@...en8.de" <bp@...en8.de>,
	"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>, "hpa@...or.com"
	<hpa@...or.com>, "jpoimboe@...nel.org" <jpoimboe@...nel.org>,
	"pawan.kumar.gupta@...ux.intel.com" <pawan.kumar.gupta@...ux.intel.com>,
	"seanjc@...gle.com" <seanjc@...gle.com>, "pbonzini@...hat.com"
	<pbonzini@...hat.com>, "ardb@...nel.org" <ardb@...nel.org>, "kees@...nel.org"
	<kees@...nel.org>, Arnd Bergmann <arnd@...db.de>,
	"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
	"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"kvm@...r.kernel.org" <kvm@...r.kernel.org>, "linux-efi@...r.kernel.org"
	<linux-efi@...r.kernel.org>, "samitolvanen@...gle.com"
	<samitolvanen@...gle.com>, "ojeda@...nel.org" <ojeda@...nel.org>
Subject: RE: [PATCH 5/6] x86_64,hyperv: Use direct call to hypercall-page

From: Michael Kelley <mhklinux@...look.com> Sent: Friday, April 25, 2025 7:32 AM
> 
> From: Peter Zijlstra <peterz@...radead.org> Sent: Friday, April 25, 2025 7:04 AM
> >
> > On Mon, Apr 21, 2025 at 06:28:42PM +0000, Michael Kelley wrote:
> >
> > > >  #ifdef CONFIG_X86_64
> > > > +static u64 __hv_hyperfail(u64 control, u64 param1, u64 param2)
> > > > +{
> > > > +	return U64_MAX;
> > > > +}
> > > > +
> > > > +DEFINE_STATIC_CALL(__hv_hypercall, __hv_hyperfail);
> > > > +
> > > >  u64 hv_pg_hypercall(u64 control, u64 param1, u64 param2)
> > > >  {
> > > >  	u64 hv_status;
> > > >
> > > > +	asm volatile ("call " STATIC_CALL_TRAMP_STR(__hv_hypercall)
> > > >  		      : "=a" (hv_status), ASM_CALL_CONSTRAINT,
> > > >  		        "+c" (control), "+d" (param1)
> > > > +		      : "r" (__r8)
> > > >  		      : "cc", "memory", "r9", "r10", "r11");
> > > >
> > > >  	return hv_status;
> > > >  }
> > > > +
> > > > +typedef u64 (*hv_hypercall_f)(u64 control, u64 param1, u64 param2);
> > > > +
> > > > +static inline void hv_set_hypercall_pg(void *ptr)
> > > > +{
> > > > +	hv_hypercall_pg = ptr;
> > > > +
> > > > +	if (!ptr)
> > > > +		ptr = &__hv_hyperfail;
> > > > +	static_call_update(__hv_hypercall, (hv_hypercall_f)ptr);
> > > > +}
> >
> > ^ kept for reference, as I try and explain how static_call() works
> > below.
> >
> > > > -skip_hypercall_pg_init:
> > > > -	/*
> > > > -	 * Some versions of Hyper-V that provide IBT in guest VMs have a bug
> > > > -	 * in that there's no ENDBR64 instruction at the entry to the
> > > > -	 * hypercall page. Because hypercalls are invoked via an indirect call
> > > > -	 * to the hypercall page, all hypercall attempts fail when IBT is
> > > > -	 * enabled, and Linux panics. For such buggy versions, disable IBT.
> > > > -	 *
> > > > -	 * Fixed versions of Hyper-V always provide ENDBR64 on the hypercall
> > > > -	 * page, so if future Linux kernel versions enable IBT for 32-bit
> > > > -	 * builds, additional hypercall page hackery will be required here
> > > > -	 * to provide an ENDBR32.
> > > > -	 */
> > > > -#ifdef CONFIG_X86_KERNEL_IBT
> > > > -	if (cpu_feature_enabled(X86_FEATURE_IBT) &&
> > > > -	    *(u32 *)hv_hypercall_pg != gen_endbr()) {
> > > > -		setup_clear_cpu_cap(X86_FEATURE_IBT);
> > > > -		pr_warn("Disabling IBT because of Hyper-V bug\n");
> > > > -	}
> > > > -#endif
> > >
> > > With this patch set, it's nice to see IBT working in a Hyper-V guest!
> > > I had previously tested IBT with some hackery to the hypercall page
> > > to add the missing ENDBR64, and didn't see any problems. Same
> > > after these changes -- no complaints from IBT.
> >
> > No indirect calls left, no IBT complaints ;-)
> >
> > > > +	hv_set_hypercall_pg(hv_hypercall_pg);
> > > >
> > > > +skip_hypercall_pg_init:
> > > >  	/*
> > > >  	 * hyperv_init() is called before LAPIC is initialized: see
> > > >  	 * apic_intr_mode_init() -> x86_platform.apic_post_init() and
> > > > @@ -658,7 +658,7 @@ void hyperv_cleanup(void)
> > > >  	 * let hypercall operations fail safely rather than
> > > >  	 * panic the kernel for using invalid hypercall page
> > > >  	 */
> > > > -	hv_hypercall_pg = NULL;
> > > > +	hv_set_hypercall_pg(NULL);
> > >
> > > This causes a hang getting into the kdump kernel after a panic.
> > > hyperv_cleanup() is called after native_machine_crash_shutdown()
> > > has done crash_smp_send_stop() on all the other CPUs. I don't know
> > > the details of how static_call_update() works,
> >
> > Right, so let me try and explain this :-)
> >
> > So we get the compiler to emit direct calls (CALL/JMP) to symbols
> > prefixed with "__SCT__", in this case from asm, but more usually by
> > means of the static_call() macro mess.
> >
> > Meanwhile DEFINE_STATIC_CALL() ensures such a symbol actually exists.
> > This symbol is a little trampoline that redirects to the actual
> > target function given to DEFINE_STATIC_CALL() -- __hv_hyperfail() in the
> > above case.
> >
> > Then objtool runs through the resulting object file and stores the
> > location of every call to these __STC__ prefixed symbols in a custom
> > section.
> >
> > This enables static_call init (boot time) to go through the section and
> > rewrite all the trampoline calls to direct calls to the target.
> > Subsequent static_call_update() calls will again rewrite the direct call
> > to point elsewhere.
> >
> > So very much how static_branch() does a NOP/JMP rewrite to toggle
> > branches, static_call() rewrites (direct) call targets.
> >
> > > but it's easy to imagine that
> > > it wouldn't work when the kernel is in such a state.
> > >
> > > The original code setting hv_hypercall_pg to NULL is just tidiness.
> > > Other CPUs are stopped and can't be making hypercalls, and this CPU
> > > shouldn't be making hypercalls either, so setting it to NULL more
> > > cleanly catches some erroneous hypercall (vs. accessing the hypercall
> > > page after Hyper-V has been told to reset it).
> >
> > So if you look at (retained above) hv_set_hypercall_pg(), when given
> > NULL, the call target is set to __hv_hyperfail(), which does an
> > unconditional U64_MAX return.
> >
> > Combined with the fact that the thing *should* not be doing hypercalls
> > anymore at this point, something is iffy.
> >
> > I can easily remove it, but it *should* be equivalent to before, where
> > it dynamicall checked for hv_hypercall_pg being NULL.
> 
> I agree that setting the call target to __hv_hyperfail() should be good.
> But my theory is that static_call_update() is hanging when trying to
> do the rewrite, because of the state of the other CPUs. I don't think
> control is ever returning from static_call_update() when invoked
> through hyperv_cleanup(). Wouldn't static_call_update() need to park
> the other CPUs temporarily and/or flush instruction caches to make
> everything consistent?
> 
> But that's just my theory. I'll run a few more experiments to confirm
> if control ever returns from static_call_update() in this case.
> 

Indeed, control never returns from static_call_update(). Prior to
hyperv_cleanup() running, crash_smp_send_stop() has been called to
stop all the other CPUs, and it does not update cpu_online_mask to
reflect the other CPUs being stopped.

static_call_update() runs this call sequence:

arch_static_call_transform()
__static_call_transform()
smp_text_poke_single()
smp_text_poke_batch_finish()
smp_text_poke_sync_each_cpu()

smp_text_poke_sync_each_cpu() sends an IPI to each CPU in
cpu_online_mask, and of course the other CPUs never respond, so
it waits forever.

Michael

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ