Message-ID: <DDB94D59-714C-4747-B67F-6C7424D068A1@infradead.org>
Date: Tue, 21 Feb 2023 07:16:48 +0000
From: David Woodhouse <dwmw2@...radead.org>
To: Kim Phillips <kim.phillips@....com>,
Oleksandr Natalenko <oleksandr@...alenko.name>
CC: tglx@...utronix.de, Usama Arif <usama.arif@...edance.com>,
arjan@...ux.intel.com, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, hpa@...or.com, x86@...nel.org,
pbonzini@...hat.com, paulmck@...nel.org,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
rcu@...r.kernel.org, mimoja@...oja.de, hewenliang4@...wei.com,
thomas.lendacky@....com, seanjc@...gle.com, pmenzel@...gen.mpg.de,
fam.zheng@...edance.com, punit.agrawal@...edance.com,
simon.evans@...edance.com, liangma@...ngbit.com,
Piotr Gorski <lucjan.lucjanov@...il.com>,
"Limonciello, Mario" <Mario.Limonciello@....com>
Subject: Re: [PATCH v9 0/8] Parallel CPU bringup for x86_64
On 21 February 2023 04:20:41 GMT, Kim Phillips <kim.phillips@....com> wrote:
>On 2/20/23 5:30 PM, David Woodhouse wrote:
>> On Mon, 2023-02-20 at 17:23 -0600, Kim Phillips wrote:
>>> On 2/20/23 3:39 PM, David Woodhouse wrote:
>>>> On 20 February 2023 21:23:38 GMT, Oleksandr Natalenko <oleksandr@...alenko.name> wrote:
>>>>> On 20.02.2023 21:31, David Woodhouse wrote:
>>>>>> On Mon, 2023-02-20 at 17:40 +0100, Oleksandr Natalenko wrote:
>>>>>>> On Monday, 20 February 2023 17:20:13 CET David Woodhouse wrote:
>>>>>>>> On Mon, 2023-02-20 at 17:08 +0100, Oleksandr Natalenko wrote:
>>>>>>>>>
>>>>>>>>> I've applied this to the v6.2 kernel, and suspend/resume broke on my
>>>>>>>>> Ryzen 5950X desktop. The machine suspends just fine, but on resume
>>>>>>>>> the screen stays blank, and there's no visible disk I/O.
>>>>>>>>>
>>>>>>>>> Reverting the series brings suspend/resume back to working state.
>>>>>>>>
>>>>>>>> Hm, thanks. What if you add 'no_parallel_bringup' on the command line?
>>>>>>>
>>>>>>> If the `no_parallel_bringup` param is added, the suspend/resume works.
>>>>>>
>>>>>> Thanks for the testing. Can I ask you to do one further test: apply the
>>>>>> series only as far as patch 6/8 'x86/smpboot: Support parallel startup
>>>>>> of secondary CPUs'.
>>>>>>
>>>>>> That will do the new startup asm sequence where each CPU finds its own
>>>>>> per-cpu data so it *could* work in parallel, but doesn't actually do
>>>>>> the bringup in parallel yet.
>>>>>
>>>>> With patches 1 to 6 (inclusive) applied and no extra cmdline
>>>>> params added, the resume doesn't work.
>>>>
>>>> Hm. Kim, is there some weirdness with the way AMD CPUs get their
>>>> APIC ID in CPUID 0x1? Especially after resume?
>>>
>>> Not to my knowledge. Mario?
>
>I tested v9-up-to-6/8 on a Ryzen 3000 that passed your between-v6 & v7
>tree commits (ce7e2d1e046a for the parallel-6.2-rc6-part1 tag
>and 17bbd12ee03 for parallel-6.2-rc6), and it, too, fails to resume
>v9-up-to-6/8 after suspend.
>
>> Oleksandr, please could you show the output of 'cpuid' after a
>> successful resume? I'm particularly looking for this part...
>>
>>
>> $ sudo cpuid | grep -A1 1/ebx
>> miscellaneous (1/ebx):
>> process local APIC physical ID = 0x0 (0)
>> --
>> miscellaneous (1/ebx):
>> process local APIC physical ID = 0x2 (2)
>> ...
>
>The Ryzens have a different pattern it seems:
>
>$ sudo cpuid | grep -A1 \(1/ebx
> miscellaneous (1/ebx):
> process local APIC physical ID = 0x0 (0)
>--
> miscellaneous (1/ebx):
> process local APIC physical ID = 0x1 (1)
>--
> miscellaneous (1/ebx):
> process local APIC physical ID = 0x2 (2)
>--
> miscellaneous (1/ebx):
> process local APIC physical ID = 0x3 (3)
>--
> miscellaneous (1/ebx):
> process local APIC physical ID = 0x4 (4)
>--
> miscellaneous (1/ebx):
> process local APIC physical ID = 0x5 (5)
>--
> miscellaneous (1/ebx):
> process local APIC physical ID = 0x6 (6)
>--
> miscellaneous (1/ebx):
> process local APIC physical ID = 0x7 (7)
>
>
>I tested the v7 series on Ryzen and it also fails, so
>Ryzen users were last known good with those two
>aforementioned commits on your tree:
>
>git://git.infradead.org/users/dwmw2/linux.git

That was when it was only using (and validating) CPUID 0xB and never
trusting CPUID 0x1, right?
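
For anyone following along, the difference between the two leaves can be
sketched roughly as below. This is only an illustration in Python, not the
startup asm from the series; the function names and the register values are
made up for the example. CPUID 0x1 reports an 8-bit initial APIC ID in
EBX[31:24], while CPUID 0xB reports the full 32-bit x2APIC ID in EDX, so the
two can only match when the ID fits in 8 bits.

```python
def apic_id_from_leaf1(ebx: int) -> int:
    """Initial APIC ID: bits 31:24 of EBX from CPUID leaf 0x1 (8 bits only)."""
    return (ebx >> 24) & 0xFF


def apic_id_from_leaf_b(edx: int) -> int:
    """x2APIC ID: all 32 bits of EDX from CPUID leaf 0xB."""
    return edx & 0xFFFFFFFF


def leaves_agree(leaf1_ebx: int, leaf_b_edx: int) -> bool:
    """True when both leaves name the same APIC ID.

    They can only agree when the x2APIC ID fits in 8 bits.
    """
    return apic_id_from_leaf1(leaf1_ebx) == apic_id_from_leaf_b(leaf_b_edx)


if __name__ == "__main__":
    # Hypothetical register values, chosen only for illustration.
    print(apic_id_from_leaf1(0x02100800))   # 2
    print(leaves_agree(0x02100800, 0x2))    # True
    print(leaves_agree(0x02100800, 0x102))  # False
```

A mismatch reported by a check like this on real register dumps taken before
and after resume would at least show which leaf is misbehaving.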