Message-ID: <02a80f45-496e-41dd-b17e-819ca82f27c5@amd.com>
Date:   Wed, 25 Oct 2023 14:04:27 -0500
From:   Mario Limonciello <mario.limonciello@....com>
To:     Ingo Molnar <mingo@...nel.org>
Cc:     Tom Lendacky <thomas.lendacky@....com>,
        Peter Zijlstra <peterz@...radead.org>,
        Borislav Petkov <bp@...en8.de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Sandipan Das <sandipan.das@....com>,
        "H . Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org,
        x86@...nel.org, linux-pm@...r.kernel.org, rafael@...nel.org,
        pavel@....cz, linux-perf-users@...r.kernel.org,
        Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Ian Rogers <irogers@...gle.com>,
        Adrian Hunter <adrian.hunter@...el.com>
Subject: Re: [PATCH 1/2] x86: Enable x2apic during resume from suspend if used
 previously

On 10/24/2023 12:01, Ingo Molnar wrote:
> 
> * Mario Limonciello <mario.limonciello@....com> wrote:
> 
>> +Tom
>>
>> On 10/24/2023 03:36, Ingo Molnar wrote:
>>>
>>> * Mario Limonciello <mario.limonciello@....com> wrote:
>>>
>>>> If x2apic was enabled during boot with parallel startup
>>>> it will be needed during resume from suspend to ram as well.
>>>>
>>>> Store whether to enable into the smpboot_control global variable
>>>> and during startup re-enable it if necessary.
>>>>
>>>> Cc: stable@...r.kernel.org # 6.5+
>>>> Fixes: 0c7ffa32dbd6 ("x86/smpboot/64: Implement arch_cpuhp_init_parallel_bringup() and enable it")
>>>> Signed-off-by: Mario Limonciello <mario.limonciello@....com>
>>>> ---
>>>>    arch/x86/include/asm/smp.h   |  1 +
>>>>    arch/x86/kernel/acpi/sleep.c | 12 ++++++++----
>>>>    arch/x86/kernel/head_64.S    | 15 +++++++++++++++
>>>>    3 files changed, 24 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
>>>> index c31c633419fe..86584ffaebc3 100644
>>>> --- a/arch/x86/include/asm/smp.h
>>>> +++ b/arch/x86/include/asm/smp.h
>>>> @@ -190,6 +190,7 @@ extern unsigned long apic_mmio_base;
>>>>    #endif /* !__ASSEMBLY__ */
>>>>    /* Control bits for startup_64 */
>>>> +#define STARTUP_ENABLE_X2APIC	0x40000000
>>>>    #define STARTUP_READ_APICID	0x80000000
>>>>    /* Top 8 bits are reserved for control */
>>>> diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c
>>>> index 6dfecb27b846..29734a1299f6 100644
>>>> --- a/arch/x86/kernel/acpi/sleep.c
>>>> +++ b/arch/x86/kernel/acpi/sleep.c
>>>> @@ -11,6 +11,7 @@
>>>>    #include <linux/dmi.h>
>>>>    #include <linux/cpumask.h>
>>>>    #include <linux/pgtable.h>
>>>> +#include <asm/apic.h>
>>>>    #include <asm/segment.h>
>>>>    #include <asm/desc.h>
>>>>    #include <asm/cacheflush.h>
>>>> @@ -129,11 +130,14 @@ int x86_acpi_suspend_lowlevel(void)
>>>>    	 */
>>>>    	current->thread.sp = (unsigned long)temp_stack + sizeof(temp_stack);
>>>>    	/*
>>>> -	 * Ensure the CPU knows which one it is when it comes back, if
>>>> -	 * it isn't in parallel mode and expected to work that out for
>>>> -	 * itself.
>>>> +	 * Ensure x2apic is re-enabled if necessary and the CPU knows which
>>>> +	 * one it is when it comes back, if it isn't in parallel mode and
>>>> +	 * expected to work that out for itself.
>>>>    	 */
>>>> -	if (!(smpboot_control & STARTUP_PARALLEL_MASK))
>>>> +	if (smpboot_control & STARTUP_PARALLEL_MASK) {
>>>> +		if (x2apic_enabled())
>>>> +			smpboot_control |= STARTUP_ENABLE_X2APIC;
>>>> +	} else
>>>>    		smpboot_control = smp_processor_id();
>>>
>>> Yeah, so instead of adding further kludges to the 'parallel bringup is
>>> possible' code path, which is arguably a functional feature that shouldn't
>>> have hardware-management coupled to it, would it be possible to fix
>>> parallel bringup to AMD-SEV systems, so that this code path isn't a
>>> quirk-dependent "parallel boot" codepath, but simply the "x86 SMP boot
>>> codepath", where all SMP x86 systems do a parallel bootup?
>>>
>>> The original commit by Thomas says:
>>>
>>>     0c7ffa32dbd6 ("x86/smpboot/64: Implement arch_cpuhp_init_parallel_bringup() and enable it")
>>>
>>>     | Unfortunately there is no RDMSR GHCB protocol at the moment, so enabling
>>>     | AMD-SEV guests for parallel startup needs some more thought.
>>>
>>> But that was half a year ago, isn't there RDMSR GHCB access code available now?
>>>
>>> This code would all read a lot more natural if it was the regular x86 SMP
>>> bootup path - which it is 'almost' today already, modulo quirk.
>>>
>>> Obviously coupling functional features with hardware quirks is fragile, for
>>> example your patch extending x86 SMP parallel bringup doesn't extend the
>>> AMD-SEV case, which may or may not matter in practice.
>>>
>>> So, if it's possible, it would be nice to fix AMD-SEV systems as well and
>>> remove this artificial coupling.
>>
>> It probably isn't clear since I didn't mention it in the commit message, but
>> this is not a system that supports AMD-SEV.  This is a workstation that
>> supports x2apic.  I'll clarify that for V2.
> 
> Yes, I suspected as much, but that's irrelevant to the arguments I
> outlined, that extending upon this quirk that makes SMP parallel bringup HW
> environment dependent, and then coupling s2ram x2apic re-enablement to that
> functional feature is inviting trouble in the long run.
> 

I spent some more time looking at ways to decouple this, and AFAICT 
thaw_secondary_cpus() doesn't actually bring CPUs back after resume in 
parallel mode.

To be symmetrical with that, another way to solve this, one that
removes the "HW environment" aspect, is to disable parallel startup
for resume from sleep entirely.

Like this:

diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c
index 6dfecb27b846..9265d97f497b 100644
--- a/arch/x86/kernel/acpi/sleep.c
+++ b/arch/x86/kernel/acpi/sleep.c
@@ -128,13 +128,12 @@ int x86_acpi_suspend_lowlevel(void)
          * value is in the actual %rsp register.
          */
         current->thread.sp = (unsigned long)temp_stack + sizeof(temp_stack);
-       /*
-        * Ensure the CPU knows which one it is when it comes back, if
-        * it isn't in parallel mode and expected to work that out for
-        * itself.
+       /*
+        * Don't use parallel startup for resume from sleep. This avoids
+        * hangs that may occur if x2apic was in use but the platform
+        * has not enabled x2apic on its own after resume.
          */
-       if (!(smpboot_control & STARTUP_PARALLEL_MASK))
-               smpboot_control = smp_processor_id();
+       smpboot_control = smp_processor_id();
  #endif
         initial_code = (unsigned long)wakeup_long64;
         saved_magic = 0x123456789abcdef0L;
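
To make the difference between the two approaches concrete, here is a
rough userspace model of what each one stores in smpboot_control before
suspend. It is only a sketch, not kernel code: the two flag values are
the ones quoted from the first patch, the 0xFF000000 mask is my
assumption based on the "Top 8 bits are reserved for control" comment,
and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values as quoted in the first patch. */
#define STARTUP_ENABLE_X2APIC 0x40000000u
#define STARTUP_READ_APICID   0x80000000u
/* Assumption: the mask covers the "top 8 bits reserved for control". */
#define STARTUP_PARALLEL_MASK 0xFF000000u

/*
 * Hypothetical model of the first patch: stay in parallel mode and
 * record that x2apic has to be re-enabled in the startup code.
 */
uint32_t suspend_control_v1(uint32_t ctl, bool x2apic_on, uint32_t cpu)
{
    /* A CPU number must not collide with the control bits. */
    assert((cpu & STARTUP_PARALLEL_MASK) == 0);

    if (ctl & STARTUP_PARALLEL_MASK)
        return x2apic_on ? (ctl | STARTUP_ENABLE_X2APIC) : ctl;
    return cpu;
}

/*
 * Hypothetical model of the alternative: never resume in parallel
 * mode, always store the CPU number, so no control bits survive.
 */
uint32_t suspend_control_alt(uint32_t ctl, uint32_t cpu)
{
    (void)ctl; /* previous control word is deliberately discarded */
    assert((cpu & STARTUP_PARALLEL_MASK) == 0);
    return cpu;
}
```

With the alternative, the value written never has any of the top 8 bits
set, so the startup code would not need to know about x2apic at all.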


> For example, what guarantees that the x2apic will be turned back on after
> suspend if a system is booted with maxcpus=1?

lapic_resume() will do this once the boot CPU is back up.
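
As a simplified illustration of that ordering (a toy model, not the
actual apic.c code; the struct and function names are hypothetical and
the behaviour is reduced to my reading of what lapic_resume() does):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified state of the boot CPU's local APIC. */
struct apic_model {
    bool x2apic_mode;       /* kernel was using x2apic before suspend */
    bool x2apic_hw_enabled; /* x2apic currently enabled in hardware */
};

/* Model of waking up: the platform may or may not restore x2apic. */
void model_wakeup(struct apic_model *a, bool firmware_restored)
{
    assert(a != 0);
    a->x2apic_hw_enabled = firmware_restored;
}

/* Model of the resume step: re-enable x2apic if it was in use before. */
void model_lapic_resume(struct apic_model *a)
{
    assert(a != 0);
    if (a->x2apic_mode)
        a->x2apic_hw_enabled = true;
}
```

In this model the boot CPU ends up with x2apic enabled whenever it was
in use before suspend, independent of how many other CPUs are brought
up afterwards.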

> 
> Obviously something very close to your fix is needed.
> 

Given that lapic_resume() handles this, I think my patch is appropriate
once the style fixups you suggested are applied.

>> I've looped Tom in to comment whether it's possible to improve AMD-SEV as
>> well.
> 
> Thanks!
> 
> 	Ingo
