Message-ID: <23e2c13f-7ff9-4336-97ae-088ec4401edf@amd.com>
Date: Wed, 8 May 2024 17:09:32 -0500
From: Mario Limonciello <mario.limonciello@....com>
To: Thomas Gleixner <tglx@...utronix.de>, Lyude Paul <lyude@...hat.com>,
 Borislav Petkov <bp@...en8.de>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org
Subject: Re: Early boot regression from f0551af0213 ("x86/topology: Ignore
 non-present APIC IDs in a present package")

On 5/8/2024 16:47, Thomas Gleixner wrote:
> Mario!
> 
> On Thu, May 02 2024 at 05:33, Mario Limonciello wrote:
>> On 4/25/2024 16:42, Thomas Gleixner wrote:
>>> Right, that's what we saw with the debug patch. The ACPI/MADT table
>>> is clearly bonkers. The effect of it is that it pretends that the system
>>> has 16 possible CPUs:
>>>
>>>       [    0.089381] CPU topo: Allowing 8 present CPUs plus 8 hotplug CPUs
>>>
>>> Which in turn changes the sizing of the per CPU data and affects some
>>> other details which depend on the number of possible CPUs.
>>
>> At least this aspect of this I suspect is caused by commit
>> fed8d8773b8ea68ad99d9eee8c8343bef9da2c2c.
>>
>> If you try reverting that I expect the "hotplug CPUs" disappear.
> 
> That does not solve anything.
> 
> The topology core already rejects those CPUs and accounts only for 8,
> which in turn causes the boot to fail as also demonstrated by limiting
> the number of possible CPUs to 8.
> 
> There is some other problem with this broken BIOS/ACPI.

Something very commonly done in BIOSes on AMD systems is that the MADT
has "entries" for the maximum number of CPUs that can be present.  For
example, if the system can support up to 12 cores, then whether you buy
the 8 core or the 12 core part, the BIOS will have the same number of
entries (probably 24 considering SMT).  In the 8 core case only 16 of
them would end up populated.
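
To make the arithmetic above concrete, here is a minimal sketch of such
a table.  It is only an illustrative model, not kernel code: the
LapicEntry type, the build_table() and count_entries() helpers, and the
sequential APIC IDs are all assumptions made up for this example.  The
only detail taken from the ACPI spec is that bit 0 of a per-CPU entry's
flags means "Enabled".

```python
from dataclasses import dataclass

FLAG_ENABLED = 1 << 0  # bit 0 of the per-CPU entry flags: "Enabled"


@dataclass
class LapicEntry:
    """Hypothetical, simplified stand-in for one per-CPU table entry."""
    apic_id: int
    flags: int


def build_table(max_threads, populated_threads):
    # Assumption: the BIOS always emits max_threads entries and marks
    # only the first populated_threads of them as enabled.
    return [
        LapicEntry(apic_id=i, flags=FLAG_ENABLED if i < populated_threads else 0)
        for i in range(max_threads)
    ]


def count_entries(entries):
    # Returns (enabled, not_enabled) entry counts.
    enabled = sum(1 for e in entries if e.flags & FLAG_ENABLED)
    return enabled, len(entries) - enabled


# 12-core-capable platform with SMT (24 entries), 8-core part fitted (16 threads)
enabled, not_enabled = count_entries(build_table(24, 16))
print(f"{enabled} enabled entries, {not_enabled} entries left unpopulated")
```

The unpopulated entries are the ones a parser could end up counting as
possible-but-absent CPUs.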

Looking at Lyude's logs, that system predates ACPI 6.3, which is why I
suggested that reverting that commit might at least stop the kernel
from claiming that it saw a number of hotplug CPUs.
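
One way to read that reasoning, sketched under stated assumptions: ACPI
6.3 added an "Online Capable" flag (bit 1) next to the existing
"Enabled" flag (bit 0) in the per-CPU entries, and firmware written
before 6.3 cannot set it.  The classify() function and its
online_capable_supported parameter below are hypothetical names for
this example, and the rule is a simplified model of why a non-enabled
entry might be reported as a hotplug CPU, not the kernel's actual
implementation.

```python
FLAG_ENABLED = 1 << 0         # bit 0: processor is ready for use
FLAG_ONLINE_CAPABLE = 1 << 1  # bit 1: added in ACPI 6.3


def classify(flags, online_capable_supported):
    """Classify one per-CPU entry as 'present', 'hotplug' or 'ignored'.

    online_capable_supported is a hypothetical switch saying whether the
    firmware table is new enough for bit 1 to carry any meaning.
    """
    if flags & FLAG_ENABLED:
        return "present"
    if not online_capable_supported:
        # Assumption: on an older table bit 1 carries no information, so a
        # non-enabled entry cannot be ruled out and counts as hot-pluggable.
        return "hotplug"
    if flags & FLAG_ONLINE_CAPABLE:
        return "hotplug"
    return "ignored"


# A non-enabled entry on a pre-6.3 table versus on a 6.3+ table
print(classify(0, online_capable_supported=False))
print(classify(0, online_capable_supported=True))
```

Under this model the same non-enabled entry is counted as a hotplug CPU
on the older table and dropped on the newer one.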

But yes, I agree it probably won't help the overall issue that started 
this thread.
