lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d7c3a03e-6f52-4bcf-aaae-668f6601ceb8@amd.com>
Date:   Tue, 31 Oct 2023 13:53:34 -0500
From:   Mario Limonciello <mario.limonciello@....com>
To:     Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>,
        Linux kernel regressions list <regressions@...ts.linux.dev>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Ian Rogers <irogers@...gle.com>,
        Adrian Hunter <adrian.hunter@...el.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "H . Peter Anvin" <hpa@...or.com>,
        "Rafael J . Wysocki" <rafael@...nel.org>,
        Len Brown <len.brown@...el.com>, Pavel Machek <pavel@....cz>,
        David Woodhouse <dwmw@...zon.co.uk>,
        Sandipan Das <sandipan.das@....com>,
        "open list:PERFORMANCE EVENTS SUBSYSTEM" 
        <linux-perf-users@...r.kernel.org>,
        "open list:PERFORMANCE EVENTS SUBSYSTEM" 
        <linux-kernel@...r.kernel.org>,
        "open list:SUSPEND TO RAM" <linux-pm@...r.kernel.org>,
        "open list:ACPI" <linux-acpi@...r.kernel.org>
Subject: Re: [PATCH v2 1/2] x86: Enable x2apic during resume from suspend if
 used previously

On 10/27/2023 16:31, Thomas Gleixner wrote:
> Mario!
> 
> On Thu, Oct 26 2023 at 12:03, Mario Limonciello wrote:
> 
>> If x2apic was enabled during boot with parallel startup
>> it will be needed during resume from suspend to ram as well.
> 
> Lacks an explanation why it is needed.
> 
>> Store whether to enable into the smpboot_control global variable
>> and during startup re-enable it if necessary.
>>
>> This fixes resume from suspend on workstation CPUs with x2apic
>> enabled.
> 
> You completely fail to describe the failure mode.
> 
>> It will also work on systems with one maxcpus=1 but still using
>> x2apic since x2apic is also re-enabled in lapic_resume().
> 
> This sentence makes no sense. What's so special about maxcpus=1?
> 
> Absolutely nothing.
> 
> lapic_resume() is a syscore op and is always invoked on the CPU which
> handles suspend/resume. The point is that __x2apic_enable() which is
> invoked from there becomes a NOOP because X2APIC is already enabled.
> 
> So what are you trying to tell me?
> 
> I really appreciate your enthusiasm of chasing and fixing bugs, but your
> change logs and explanations are really making it hard to keep that
> appreciation up.
> 
>>   	/*
>> -	 * Ensure the CPU knows which one it is when it comes back, if
>> -	 * it isn't in parallel mode and expected to work that out for
>> -	 * itself.
>> +	 * Ensure x2apic is re-enabled if necessary and the CPU knows which
>> +	 * one it is when it comes back, if it isn't in parallel mode and
>> +	 * expected to work that out for itself.
> 
> The x2apic comment is misplaced. It should be above the x2apic
> conditional as it has nothing to do with the initial condition because
> even if X2APIC is enabled then the parallel startup might be disabled.
> 
> Go and read this comment 3 month from now and try to make sense of it.
> 
>>   	 */
>> -	if (!(smpboot_control & STARTUP_PARALLEL_MASK))
>> +	if (smpboot_control & STARTUP_PARALLEL_MASK) {
>> +		if (x2apic_enabled())
>> +			smpboot_control |= STARTUP_ENABLE_X2APIC;
> 
> This bit is sticky after resume, so any subsequent CPU hotplug operation
> will see it as well.
> 
> This lacks an explanation why this is correct and why this flag is not
> set early during boot before the APs are brought up.
> 
>> +	} else {
>>   		smpboot_control = smp_processor_id();
>> +	}
>>   #endif
>>   	initial_code = (unsigned long)wakeup_long64;
>>   	saved_magic = 0x123456789abcdef0L;
>> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
>> index ea6995920b7a..300901af9fa3 100644
>> --- a/arch/x86/kernel/head_64.S
>> +++ b/arch/x86/kernel/head_64.S
>> @@ -237,9 +237,14 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
>>   	 * CPU number is encoded in smpboot_control.
>>   	 *
>>   	 * Bit 31	STARTUP_READ_APICID (Read APICID from APIC)
>> +	 * Bit 30	STARTUP_ENABLE_X2APIC (Enable X2APIC mode)
>>   	 * Bit 0-23	CPU# if STARTUP_xx flags are not set
>>   	 */
>>   	movl	smpboot_control(%rip), %ecx
>> +
>> +	testl	$STARTUP_ENABLE_X2APIC, %ecx
>> +	jnz	.Lenable_x2apic
> 
> Why is this tested here? The test clearly belongs into .Lread_apicid
> 
>> +
>>   	testl	$STARTUP_READ_APICID, %ecx
>>   	jnz	.Lread_apicid
>>   	/*
>> @@ -249,6 +254,16 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
>>   	andl	$(~STARTUP_PARALLEL_MASK), %ecx
>>   	jmp	.Lsetup_cpu
>>   
>> +.Lenable_x2apic:
>> +	/* Enable X2APIC if disabled */
>> +	mov	$MSR_IA32_APICBASE, %ecx
>> +	rdmsr
>> +	testl	$X2APIC_ENABLE, %eax
>> +	jnz	.Lread_apicid_msr
>> +	orl	$X2APIC_ENABLE, %eax
>> +	wrmsr
>> +	jmp	.Lread_apicid_msr
> 
> And this part just moves in front of .Lread_apicid_msr and spares
> the jump at the end.
> 
>>   .Lread_apicid:
>>   	/* Check whether X2APIC mode is already enabled */
>>   	mov	$MSR_IA32_APICBASE, %ecx
> 
> That aside, I'm still failing to see the actual failure scenario due to
> the utter void in the change log.
> 
> The kernel has two mechanisms which end up with X2APIC enabled:
> 
>      1) BIOS has it enabled already, which is required for any machine
>         which has more than 255 CPUs, but BIOS can decide to enable
>         X2APIC even with less than 256 CPUs.
> 
>      2) BIOS has not enabled it, but the kernel enables it because the
>         CPU supports it.
> 
> The major difference is:
> 
>      #1 prevents the MMIO fixmap for the APIC to be installed
> 
>      #2 installs the fixmap but does not use it. It's never torn down.
> 
> Let's look at these two cases in the light of resume:
> 
>      #1 If the BIOS enabled X2APIC mode then the only way how this can
>         fail on resume is that the BIOS did not enable X2APIC mode in the
>         resume path before going back into the kernel and due to the
>         non-existent MMIO mapping there is no way to read the APIC ID.
> 
>      #2 It does not matter whether the BIOS enabled X2APIC mode during
>         resume because the MMIO mapping exists and the APIC ID read via
>         MMIO should be identical to the APIC ID read via the X2APIC MSR.
> 
>         If not, then there is something fundamentally wrong.
> 
> How is this working during the initial bringup of the APs?
> 
>      #1 If the APs do not have X2APIC enabled by the BIOS then they will
>         crash and burn during bringup due to non-existent MMIO mapping.
> 
>      #2 The APs can read their APIC ID just fine via MMIO and it
>         obviously is the same as the APIC ID after the bringup enabled
>         X2APIC mode. Otherwise the kernel would not work at all.
> 
> So the only reasonable explanation for the failure is that the BIOS
> fails to reenable X2APIC mode on resume either on the CPU which handles
> suspend/resume or on the subsequent AP bringups, which is not clear at
> all due to the bit being sticky and the changelog being full of void in
> that aspect.
> 
> That said the sticky bit is wrong for the following case with older CPUs
> where X2APIC requires up to date microcode loaded to work correctly:
> 
>      boot maxcpus=4
>        load microcode    // Same sequence applies for AP (CPU1-3)
>        enable x2apic
>      suspend
>         set X2APIC enable bit in smpboot_control
>      resume
>      bringup CPU4
>         enable X2APIC early --> fail due to lack of microcode fix
> 
> Whether this affects the APIC ID readout or not, which we don't know and
> even if we consider this case to be academic, it's still fundamentally
> wrong from a correctness point of view.
> 
> So without proper information about the failure scenario this "fix" is
> simply going nowhere.
> 
> Please provide the following information:
> 
>    - dmesg of the initial boot up to 'smp: Bringing up secondary CPUs ...'
> 
>    - A proper description of the failure case:
> 
>      - Is the CPU which handles suspend/resume failing?
> 
>      - Is a subsequent AP bringup failing?
> 
>      - Is the failure due to the lack of MMIO mapping ?
> 
>      - Is the failure due to a bogus APIC ID retrieved via MMIO ?
> 
> Thanks for making me do your homework (again),
> 
>          tglx

Hi Thomas,

Thank you for looking this over.  I've reviewed it with the internal 
team and confirmed there was a BIOS bug where the MSR wasn't restored 
after the S3 cycle completed.  The BIOS team has fixed it.

Thanks,

#regzbot invalid: BIOS bug

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ