Message-ID: <20251023181546.GA771720@yaz-khff2.amd.com>
Date: Thu, 23 Oct 2025 14:15:46 -0400
From: Yazen Ghannam <yazen.ghannam@....com>
To: Michal Pecio <michal.pecio@...il.com>
Cc: Shyam-sundar.S-k@....com, bhelgaas@...gle.com, hdegoede@...hat.com,
	ilpo.jarvinen@...ux.intel.com, jdelvare@...e.com,
	linux-edac@...r.kernel.org, linux-hwmon@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org,
	linux@...ck-us.net, mario.limonciello@....com,
	naveenkrishna.chatradhi@....com,
	platform-driver-x86@...r.kernel.org, suma.hegde@....com,
	tony.luck@...el.com, x86@...nel.org
Subject: Re: [PATCH v3 06/12] x86/amd_nb: Use topology info to get AMD node
 count

On Thu, Oct 23, 2025 at 06:31:54PM +0200, Michal Pecio wrote:
> On Thu, 23 Oct 2025 12:09:06 -0400, Yazen Ghannam wrote:
> > On Thu, Oct 23, 2025 at 05:01:07PM +0200, Michal Pecio wrote:
> > > On Thu, 23 Oct 2025 09:59:35 -0400, Yazen Ghannam wrote:  
> > > > Thanks Michal.
> > > > 
> > > > I don't see anything obviously wrong.  
> > > 
> > > Which code is responsible for setting up those bitmaps which
> > > are counted by topology_init_possible_cpus()?
> > > 
> > > I guess I could add some printks there and reboot.
> > >   
> > 
> > The kernel seems to think there are 6 CPUs on your system:
> > 
> > [    0.072059] CPU topo: Allowing 4 present CPUs plus 2 hotplug CPUs
> 
> I thought this is because I have NR_CPUS set to 6, as this config
> originally came from the X6 machine, but I am not sure.
> 

I'm thinking we should look here: acpi_parse_lapic().

If you add printks in there, I think you'll see the extra CPUs get
registered as "not present" based on the table entries below.
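
For reference, a rough user-space sketch (Python, not kernel code) of how
the Flags field in those entries decodes. The bit positions follow the
ACPI Processor Local APIC structure (bit 0 = Enabled, bit 1 = Online
Capable). The helper name and the labels it returns are made up for
illustration; they are not what acpi_parse_lapic() prints.

```python
# Flag bits of the MADT Processor Local APIC structure (per the ACPI spec).
MADT_ENABLED = 1 << 0
MADT_ONLINE_CAPABLE = 1 << 1


def describe_lapic_flags(flags):
    """Hypothetical helper: label a MADT LAPIC entry from its Flags value."""
    if flags & MADT_ENABLED:
        return "enabled"
    if flags & MADT_ONLINE_CAPABLE:
        return "disabled, online capable"
    return "disabled, not online capable"


# Both entries in the dump below carry Flags = 00000000.
print(describe_lapic_flags(0x00000000))
```

With Flags = 0 neither bit is set, which matches the "Processor Enabled : 0"
and "Runtime Online Capable : 0" lines in the disassembly.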

> > 
> > We don't see them enabled, but they may still get APIC IDs. If so, then
> > the IDs would be beyond the core shift of 2.
> > 
> > APIC IDs b'0 00 -> CPU0 on logical package 0
> > 	 b'0 01 -> CPU1 on logical package 0
> > 	 b'0 10 -> CPU2 on logical package 0
> > 	 b'0 11 -> CPU3 on logical package 0
> > 	 b'1 00 -> CPU0 on logical package 1
> > 	 b'1 01 -> CPU1 on logical package 1
> > 
> > 
> > Please try booting with "possible_cpus=4".
> 
> OK, will try it next time I'm rebooting.
> 
> > The "number of possible CPUs" comes from the ACPI Multiple APIC
> > Description Table (MADT). This has the signature "APIC".
> > 
> > Can you please provide the disassembly of this table?
> 
> Interesting, it looks like there are indeed 6 LAPICs there.
> BIOS bug? Attaching apic.dsl.
> 
> > Can you please share the dmesg output from that system? And the ACPI
> > table too?
> 
> Will try later but I don't recall any anomalies there.
> I remember checking the topology output and it made sense:
> 1 package, 1 die, 6 cores, 6 threads.

Thanks, yeah it's likely just fine since the topology matches.

[...]
> 
> [04Ch 0076   1]                Subtable Type : 00 [Processor Local APIC]
> [04Dh 0077   1]                       Length : 08
> [04Eh 0078   1]                 Processor ID : 05
> [04Fh 0079   1]                Local Apic ID : 84
> [050h 0080   4]        Flags (decoded below) : 00000000
>                            Processor Enabled : 0
>                       Runtime Online Capable : 0
> 
> [054h 0084   1]                Subtable Type : 00 [Processor Local APIC]
> [055h 0085   1]                       Length : 08
> [056h 0086   1]                 Processor ID : 06
> [057h 0087   1]                Local Apic ID : 85
> [058h 0088   4]        Flags (decoded below) : 00000000
>                            Processor Enabled : 0
>                       Runtime Online Capable : 0
> 

These APIC IDs seem bogus too. I'd expect them to be sequential, but
they jump to 84 and 85. It probably doesn't matter, though we could try
to use these as some secondary indicator that the entries should be
totally ignored.

I expect the 6-core will be sequential though.
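
To make the "beyond the core shift" point concrete, here is a small
sketch (Python, not kernel code). It assumes the core shift of 2 from the
quoted text above, and it assumes the "Local Apic ID" values in the dump
are hexadecimal. The function name is hypothetical.

```python
CORE_SHIFT = 2  # assumption: the "core shift of 2" mentioned in the thread


def split_apic_id(apic_id, core_shift=CORE_SHIFT):
    """Hypothetical helper: split an APIC ID at the core shift.

    Returns (bits above the shift, core index within those bits).
    """
    core = apic_id & ((1 << core_shift) - 1)
    above = apic_id >> core_shift
    return above, core


# Four sequential IDs, then the two IDs from the disabled MADT entries
# (assumed to be hex 84 and 85).
for apic_id in (0b000, 0b001, 0b010, 0b011, 0x84, 0x85):
    above, core = split_apic_id(apic_id)
    print(f"APIC ID {apic_id:#04x}: above-shift={above} core={core}")
```

Under those assumptions the four sequential IDs all land in group 0, while
0x84 and 0x85 land in group 33, so they cannot share the first package's
ID space.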

I don't know if this is really a BIOS bug, because those entries are
indeed not enabled. This may have just been an optimization they used,
and it seemed to fit within the ambiguity of the ACPI spec at the time.

A quick solution would be to do a quirk for this. Though maybe we can
come up with a generic solution based on what we see so far.

Thanks,
Yazen
