Message-ID: <23e2c13f-7ff9-4336-97ae-088ec4401edf@amd.com>
Date: Wed, 8 May 2024 17:09:32 -0500
From: Mario Limonciello <mario.limonciello@....com>
To: Thomas Gleixner <tglx@...utronix.de>, Lyude Paul <lyude@...hat.com>,
Borislav Petkov <bp@...en8.de>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: Early boot regression from f0551af0213 ("x86/topology: Ignore
non-present APIC IDs in a present package")

On 5/8/2024 16:47, Thomas Gleixner wrote:
> Mario!
>
> On Thu, May 02 2024 at 05:33, Mario Limonciello wrote:
>> On 4/25/2024 16:42, Thomas Gleixner wrote:
>>> Right, that's what we saw with the debug patch. The ACPI/MADT table
>>> is clearly bonkers. The effect of it is that it pretends that the system
>>> has 16 possible CPUs:
>>>
>>> [ 0.089381] CPU topo: Allowing 8 present CPUs plus 8 hotplug CPUs
>>>
>>> Which in turn changes the sizing of the per CPU data and affects some
>>> other details which depend on the number of possible CPUs.
>>
>> At least this aspect of this I suspect is caused by commit
>> fed8d8773b8ea68ad99d9eee8c8343bef9da2c2c.
>>
>> If you try reverting that I expect the "hotplug CPUs" disappear.
>
> That does not solve anything.
>
> The topology core already rejects those CPUs and accounts only for 8,
> which in turn causes the boot to fail as also demonstrated by limiting
> the number of possible CPUs to 8.
>
> There is some other problem with this broken BIOS/ACPI.

Something very commonly done in BIOSes on AMD systems is that the MADT
has "entries" for the maximum number of CPUs that can be present. For
example, if the platform can support up to 12 cores, the BIOS will have
the same number of entries (probably 24 considering SMT) whether you buy
an 8 core or a 12 core part. In the case of 8 cores, only 16 would end
up populated.
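
The sizing described above can be sketched as follows. This is only an
illustration of the arithmetic: the function name and the returned
dictionary are made up for the example, and the 12 core / 8 core figures
are the hypothetical ones from the paragraph above, not values read from
any real table.

```python
def madt_cpu_entries(max_cores, installed_cores, threads_per_core=2):
    """Model a table sized for the largest part the platform supports.

    Hypothetical helper: assumes one entry per hardware thread and that
    SMT means two threads per core.
    """
    total = max_cores * threads_per_core
    populated = installed_cores * threads_per_core
    return {
        "total": total,
        "populated": populated,
        "unpopulated": total - populated,
    }


# 12 core platform with an 8 core part installed
print(madt_cpu_entries(12, 8))
# {'total': 24, 'populated': 16, 'unpopulated': 8}
```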

Looking at Lyude's logs, that system predates ACPI 6.3, which is why I
was suggesting that reverting that commit might at least stop the
kernel from claiming that it saw a number of hotplug CPUs.
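
A simplified model of how unpopulated entries can end up counted as
hotplug CPUs is sketched below. It rests on an assumption about the
check after that commit: an entry without the Enabled flag (bit 0) is
still treated as usable when the table predates the Online Capable flag
(bit 1, defined from ACPI 6.3 on) or when that flag is set. All names
are illustrative, this is not the kernel code, and the entry list is
hypothetical, chosen only to match the counts in the quoted log line.

```python
ENABLED = 1 << 0         # Local APIC flag: processor is enabled
ONLINE_CAPABLE = 1 << 1  # Local APIC flag, defined from ACPI 6.3 on


def classify(lapic_flags, table_has_online_capable):
    """Classify one entry under the assumed usability rule."""
    if lapic_flags & ENABLED:
        return "present"
    # Assumption: without Online Capable support every entry is usable.
    if not table_has_online_capable or (lapic_flags & ONLINE_CAPABLE):
        return "hotplug"
    return "ignored"


def count(entries, table_has_online_capable):
    """Tally how many entries fall into each class."""
    counts = {"present": 0, "hotplug": 0, "ignored": 0}
    for flags in entries:
        counts[classify(flags, table_has_online_capable)] += 1
    return counts


# Hypothetical table: 8 enabled entries followed by 8 empty ones
entries = [ENABLED] * 8 + [0] * 8
print(count(entries, table_has_online_capable=False))
print(count(entries, table_has_online_capable=True))
```

Under these assumptions the pre-6.3 case yields 8 present plus 8 hotplug
entries, while a table that supports Online Capable would have the 8
empty entries ignored.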

But yes, I agree it probably won't help with the overall issue that
started this thread.