[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251023190429.GB796848@yaz-khff2.amd.com>
Date: Thu, 23 Oct 2025 15:04:29 -0400
From: Yazen Ghannam <yazen.ghannam@....com>
To: Michal Pecio <michal.pecio@...il.com>
Cc: Shyam-sundar.S-k@....com, bhelgaas@...gle.com, hdegoede@...hat.com,
ilpo.jarvinen@...ux.intel.com, jdelvare@...e.com,
linux-edac@...r.kernel.org, linux-hwmon@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org,
linux@...ck-us.net, mario.limonciello@....com,
naveenkrishna.chatradhi@....com,
platform-driver-x86@...r.kernel.org, suma.hegde@....com,
tony.luck@...el.com, x86@...nel.org
Subject: Re: [PATCH v3 06/12] x86/amd_nb: Use topology info to get AMD node
count
On Thu, Oct 23, 2025 at 08:25:03PM +0200, Michal Pecio wrote:
> On Thu, 23 Oct 2025 12:09:06 -0400, Yazen Ghannam wrote:
> > The kernel seems to think there are 6 CPUs on your system:
> >
> > [ 0.072059] CPU topo: Allowing 4 present CPUs plus 2 hotplug CPUs
>
> I wonder if this code doesn't break systems which actually support
> hotplug, when some sockets aren't populated at boot?
I don't know about other vendors, but we don't do physical CPU hotplug.
CPU hotplug, in this case, is there are physical CPUs already in the
system, but they are not enabled for whatever policy decision. They
could be disabled in BIOS, and so the MADT entries will reflect that. Or
they can be disabled by kernel parameters.
I'm curious though: what happens if you try to 'online' these extra
CPUs?
>
>
> amd_northbridges.num = amd_num_nodes();
>
> This count apparently includes potential hotplug nodes.
Yes, but see below...
>
> for (i = 0; i < amd_northbridges.num; i++) {
> node_to_amd_nb(i)->misc = amd_node_get_func(i, 3);
>
> This is a wrapper around pci_get_domain_bus_and_slot(). Isn't this PCI
> device part of CPU "uncore" and absent until a CPU is hotplugged?
>
Yes, and no. This PCI device is the Northbridge and is inside the
package. These don't get hotplugged. If you had a 2P server, then both
would be there from boot time even if you left all the CPUs on one of
them "disabled". You can do this today if you have a 2P system and boot
with maxcpus=N or something similar.
> /*
> * Each Northbridge must have a 'misc' device.
> * If not, then uninitialize everything.
> */
> if (!node_to_amd_nb(i)->misc) {
>
> And if it's absent, all previously found northbridges are ignored.
Right, it's a bug if part of the Northbridge is missing.
> BTW, kerneldoc seems to suggest pci_dev_put() should be here?
Yes, I think you're right. This should be unwound properly.
Thanks,
Yazen
Powered by blists - more mailing lists