Date: Tue, 30 Apr 2024 12:51:02 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Borislav Petkov <bp@...en8.de>
Cc: kernel test robot <oliver.sang@...el.com>, oe-lkp@...ts.linux.dev, lkp@...el.com, 
	linux-kernel@...r.kernel.org, x86@...nel.org, Ingo Molnar <mingo@...nel.org>, 
	Srikanth Aithal <sraithal@....com>
Subject: Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

On Tue, Apr 30, 2024, Borislav Petkov wrote:
> On Tue, Apr 30, 2024 at 11:40:14AM -0700, Sean Christopherson wrote:
> > Hmm, I don't think the problem is that init_ia32_feat_ctl() is called too late.
> > It too is called from the BSP prior to alternative_instructions():
> > 
> >   arch_cpu_finalize_init()
> >   |
> >   -> identify_boot_cpu()
> >      |
> >      -> identify_cpu()
> >         |
> >         -> .c_init() => init_intel()
> 
> Yeah, but look at his stacktrace:
> 
> [ 0.055225][ T0] init_intel (arch/x86/include/asm/msr.h:146 arch/x86/include/asm/msr.h:300 arch/x86/kernel/cpu/intel.c:583 arch/x86/kernel/cpu/intel.c:687)
> [ 0.055225][ T0] identify_cpu (arch/x86/kernel/cpu/common.c:1824)
> [ 0.055225][ T0] identify_secondary_cpu (arch/x86/kernel/cpu/common.c:1949)
> [ 0.055225][ T0] smp_store_cpu_info (arch/x86/kernel/smpboot.c:333)
> 
> That's after alternatives.
>
> > Ah, and the WARN even specifically checks for the case where there's divergence
> > from the boot CPU:
> > 
> > 	if (boot_cpu_has(feature))
> > 		WARN_ON(alternatives_patched);
> 
> Funny you should mention that - I have this check in
> setup_force_cpu_cap() too which works on boot_cpu_data *BUT*, actually,
> the test in do_clear_cpu_cap() should be:
> 
>         if (c && cpu_has(c, feature))
>                 WARN_ON(alternatives_patched);
> 
> because setting a feature flag in *any* CPU's cap field is wrong after
> alternatives, as explained earlier.
> 
> I know, our feature flags handling is a major mess.

..

> my guess would be no and that init_ia32_feat_ctl() really needs to go
> before alternatives have been patched because it clears flags.

But that would just mask the underlying problem; it wouldn't actually fix anything
other than making the WARN go away.  Unless I'm misreading the splat+code, the
issue isn't that init_ia32_feat_ctl() clears VMX late, it's that the BSP sees
VMX as fully enabled, but at least one AP sees VMX as disabled.

I don't see how the kernel can expect to function correctly with divergent feature
support across CPUs, i.e. the WARN is a _good_ thing in this case, because it
alerts the user that their system is messed up, e.g. has a bad BIOS or something.
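
To make the difference between the two checks concrete, here is a rough
userspace sketch, not kernel code.  Everything prefixed with toy_ (and the
bit number used for VMX) is made up for illustration; only the shape of the
two conditions is taken from the snippets quoted above.  It assumes an AP
that clears a feature after alternatives have been patched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define TOY_NCAPS       64  /* toy bitmap width, not the kernel's layout */
#define TOY_FEATURE_VMX 5   /* hypothetical bit number, illustration only */

struct toy_cpu {
	unsigned long long caps;    /* one bit per feature */
};

static struct toy_cpu toy_boot_cpu;
static bool toy_alternatives_patched;
static int toy_warn_count;

static bool toy_cpu_has(const struct toy_cpu *c, int feature)
{
	assert(feature >= 0 && feature < TOY_NCAPS);
	return (c->caps >> feature) & 1ULL;
}

/* Stand-in for WARN_ON(): count and report instead of dumping a splat. */
static void toy_warn_on(bool cond, const char *what)
{
	if (cond) {
		toy_warn_count++;
		fprintf(stderr, "WARN: %s\n", what);
	}
}

/* Variant A: the quoted existing check, keyed off the boot CPU. */
static void toy_clear_cap_boot_check(struct toy_cpu *c, int feature)
{
	if (toy_cpu_has(&toy_boot_cpu, feature))
		toy_warn_on(toy_alternatives_patched, "boot CPU has feature");
	if (c)
		c->caps &= ~(1ULL << feature);
}

/* Variant B: the suggested check, keyed off the CPU being modified. */
static void toy_clear_cap_percpu_check(struct toy_cpu *c, int feature)
{
	if (c && toy_cpu_has(c, feature))
		toy_warn_on(toy_alternatives_patched, "this CPU has feature");
	if (c)
		c->caps &= ~(1ULL << feature);
}

/*
 * Model an AP clearing VMX after alternatives were patched.
 * Returns how many warnings fired for the chosen variant.
 */
static int toy_scenario(bool boot_has, bool ap_has, bool percpu_variant)
{
	struct toy_cpu ap = { 0 };

	toy_warn_count = 0;
	toy_boot_cpu.caps = boot_has ? 1ULL << TOY_FEATURE_VMX : 0;
	ap.caps = ap_has ? 1ULL << TOY_FEATURE_VMX : 0;
	toy_alternatives_patched = true;    /* APs come up after patching */

	if (percpu_variant)
		toy_clear_cap_percpu_check(&ap, TOY_FEATURE_VMX);
	else
		toy_clear_cap_boot_check(&ap, TOY_FEATURE_VMX);

	return toy_warn_count;
}
```

In this toy model, toy_scenario(true, true, ...) warns with either variant,
which is the BSP-has-VMX, AP-clears-VMX case from the splat.  The variants
only differ when the boot CPU and the AP already disagree before the clear:
with boot_has = false and ap_has = true only variant B warns, and with
boot_has = true and ap_has = false only variant A does.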
