Message-ID: <20251022011610.60d0ba6e.michal.pecio@gmail.com>
Date: Wed, 22 Oct 2025 01:16:10 +0200
From: Michal Pecio <michal.pecio@...il.com>
To: yazen.ghannam@....com
Cc: Shyam-sundar.S-k@....com, bhelgaas@...gle.com, hdegoede@...hat.com,
ilpo.jarvinen@...ux.intel.com, jdelvare@...e.com,
linux-edac@...r.kernel.org, linux-hwmon@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org,
linux@...ck-us.net, mario.limonciello@....com,
naveenkrishna.chatradhi@....com, platform-driver-x86@...r.kernel.org,
suma.hegde@....com, tony.luck@...el.com, x86@...nel.org
Subject: Re: [PATCH v3 06/12] x86/amd_nb: Use topology info to get AMD node count
> Currently, the total AMD node count is determined by searching and
> counting CPU/node devices using PCI IDs.
>
> However, AMD node information is already available through topology
> CPUID/MSRs. The recent topology rework has made this info easier to
> access.
>
> Replace the node counting code with a simple product of topology info.
>
> Every node/northbridge is expected to have a 'misc' device. Clear
> everything out if a 'misc' device isn't found on a node.
Hi,
I have a weird/buggy AM3 machine (Asus M4A88TD-M EVO, Phenom 965) where
the kernel believes there are two packages, so the assumption that every
node has a 'misc' device fails.
[ 0.072051] CPU topo: Max. logical packages: 2
[ 0.072052] CPU topo: Max. logical dies: 2
[ 0.072052] CPU topo: Max. dies per package: 1
[ 0.072057] CPU topo: Max. threads per core: 1
[ 0.072058] CPU topo: Num. cores per package: 4
[ 0.072059] CPU topo: Num. threads per package: 4
It's currently on v6.12 series and working OK, but I remember trying
v6.15 at one point and finding that EDAC and GART IOMMU were broken
because the NB driver failed to initialize. This fixed it:
--- a/arch/x86/kernel/cpu/topology.c
+++ b/arch/x86/kernel/cpu/topology.c
@@ -496,8 +496,8 @@ void __init topology_init_possible_cpus(void)
 	total_cpus = allowed;
 	set_nr_cpu_ids(allowed);
-	cnta = domain_weight(TOPO_PKG_DOMAIN);
-	cntb = domain_weight(TOPO_DIE_DOMAIN);
+	cnta = 1;
+	cntb = 1;
 	__max_logical_packages = cnta;
 	__max_dies_per_package = 1U << (get_count_order(cntb) - get_count_order(cnta));
It was a few weeks ago and the machine is currently back on v6.12,
but I'm almost sure I tracked it down to this exact code:
> +	amd_northbridges.num = amd_num_nodes();
> [...]
> +		/*
> +		 * Each Northbridge must have a 'misc' device.
> +		 * If not, then uninitialize everything.
> +		 */
> +		if (!node_to_amd_nb(i)->misc) {
> +			amd_northbridges.num = 0;
> +			kfree(nb);
> +			return -ENODEV;
> +		}
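To make the suspected failure concrete, here is a minimal userspace sketch,
not the kernel code. It assumes the node count is simply packages times
nodes per package, as the patch description suggests, and that this machine
has only one real northbridge 'misc' device. The names fake_nb,
fake_num_nodes() and fake_nb_init() are made up for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's per-node northbridge state. */
struct fake_nb {
	bool has_misc;	/* true if a 'misc' PCI device was found for this node */
};

/* Node count as a plain product of topology values (assumed shape). */
static unsigned int fake_num_nodes(unsigned int max_packages,
				   unsigned int nodes_per_pkg)
{
	return max_packages * nodes_per_pkg;
}

/*
 * Mirrors the quoted check: every expected node must have a 'misc'
 * device, otherwise initialization is abandoned.
 * Returns the node count on success or -ENODEV on failure.
 */
static int fake_nb_init(const struct fake_nb *nbs, size_t found,
			unsigned int max_packages, unsigned int nodes_per_pkg)
{
	unsigned int expected = fake_num_nodes(max_packages, nodes_per_pkg);
	unsigned int i;

	for (i = 0; i < expected; i++) {
		/* A node beyond what was actually found has no 'misc' device. */
		if (i >= found || !nbs[i].has_misc)
			return -ENODEV;
	}
	return (int)expected;
}
```

With a single real northbridge, fake_nb_init(nbs, 1, 2, 1) returns -ENODEV
(two nodes expected, one found), while fake_nb_init(nbs, 1, 1, 1) returns 1,
which matches the effect of forcing cnta and cntb to 1 above.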
Thanks,
Michal