Message-ID: <20251022133901.GB7243@yaz-khff2.amd.com>
Date: Wed, 22 Oct 2025 09:39:01 -0400
From: Yazen Ghannam <yazen.ghannam@....com>
To: Michal Pecio <michal.pecio@...il.com>
Cc: Shyam-sundar.S-k@....com, bhelgaas@...gle.com, hdegoede@...hat.com,
	ilpo.jarvinen@...ux.intel.com, jdelvare@...e.com,
	linux-edac@...r.kernel.org, linux-hwmon@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org,
	linux@...ck-us.net, mario.limonciello@....com,
	naveenkrishna.chatradhi@....com,
	platform-driver-x86@...r.kernel.org, suma.hegde@....com,
	tony.luck@...el.com, x86@...nel.org
Subject: Re: [PATCH v3 06/12] x86/amd_nb: Use topology info to get AMD node
 count

On Wed, Oct 22, 2025 at 01:16:10AM +0200, Michal Pecio wrote:
> > Currently, the total AMD node count is determined by searching and
> > counting CPU/node devices using PCI IDs.
> > 
> > However, AMD node information is already available through topology
> > CPUID/MSRs. The recent topology rework has made this info easier to
> > access.
> > 
> > Replace the node counting code with a simple product of topology info.
> > 
> > Every node/northbridge is expected to have a 'misc' device. Clear
> > everything out if a 'misc' device isn't found on a node.
> 
> Hi,
> 
> I have a weird/buggy AM3 machine (Asus M4A88TD-M EVO, Phenom 965) where
> the kernel believes there are two packages and this assumption fails.
> 
> [    0.072051] CPU topo: Max. logical packages:   2
> [    0.072052] CPU topo: Max. logical dies:       2
> [    0.072052] CPU topo: Max. dies per package:   1
> [    0.072057] CPU topo: Max. threads per core:   1
> [    0.072058] CPU topo: Num. cores per package:     4
> [    0.072059] CPU topo: Num. threads per package:   4
> 
> It's currently on v6.12 series and working OK, but I remember trying
> v6.15 at one point and finding that EDAC and GART IOMMU were broken
> because the NB driver failed to initialize. This fixed it:
> 
> --- a/arch/x86/kernel/cpu/topology.c
> +++ b/arch/x86/kernel/cpu/topology.c
> @@ -496,8 +496,8 @@ void __init topology_init_possible_cpus(void)
>         total_cpus = allowed;
>         set_nr_cpu_ids(allowed);
>  
> -       cnta = domain_weight(TOPO_PKG_DOMAIN);
> -       cntb = domain_weight(TOPO_DIE_DOMAIN);
> +       cnta = 1;
> +       cntb = 1;
>         __max_logical_packages = cnta;
>         __max_dies_per_package = 1U << (get_count_order(cntb) - get_count_order(cnta));
> 
> It was a few weeks ago and the machine is currently back on v6.12,
> but I'm almost sure I tracked it down to this exact code:
> 
> > +	amd_northbridges.num = amd_num_nodes();
> > [...]
> > +		/*
> > +		 * Each Northbridge must have a 'misc' device.
> > +		 * If not, then uninitialize everything.
> > +		 */
> > +		if (!node_to_amd_nb(i)->misc) {
> > +			amd_northbridges.num = 0;
> > +			kfree(nb);
> > +			return -ENODEV;
> > +		}
> 

Hi Michal,

Thanks for reporting this.

Can you please share the full output from dmesg and lspci?

Also, can you please share the raw CPUID output (cpuid -r)?

Thanks,
Yazen
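
For readers following the thread, the arithmetic behind the report can be sketched in userspace C. This is an illustration under stated assumptions, not the kernel code: count_order() is a stand-in for the kernel's get_count_order(), and expected_nodes() and nodes_missing_misc() are hypothetical helpers. The node count is assumed to be packages times nodes per package, since the message only describes it as "a simple product of topology info".

```c
#include <stddef.h>

/*
 * Userspace stand-in for the kernel's get_count_order():
 * ceil(log2(count)), or -1 when count is 0.
 */
static int count_order(unsigned int count)
{
	int order = 0;

	if (count == 0)
		return -1;
	while ((1ULL << order) < count)
		order++;
	return order;
}

/*
 * Mirrors the expression in the quoted topology.c hunk.
 * Assumes pkgs > 0 and dies >= pkgs so the shift count is not negative.
 */
static unsigned int max_dies_per_package(unsigned int pkgs, unsigned int dies)
{
	return 1U << (count_order(dies) - count_order(pkgs));
}

/*
 * Hypothetical: node count as a product of topology values.
 * The exact factors used by amd_num_nodes() are not shown in this thread.
 */
static unsigned int expected_nodes(unsigned int pkgs, unsigned int nodes_per_pkg)
{
	return pkgs * nodes_per_pkg;
}

/*
 * Hypothetical: how many expected nodes have no 'misc' device.
 * Any non-zero result corresponds to the -ENODEV path in the quoted patch.
 */
static unsigned int nodes_missing_misc(unsigned int expected, unsigned int misc_found)
{
	return expected > misc_found ? expected - misc_found : 0;
}
```

With the values from the quoted boot log (2 packages, 2 dies), max_dies_per_package(2, 2) evaluates to 1, which matches "Max. dies per package: 1". If only one 'misc' device exists on the machine, expected_nodes(2, 1) is 2 and nodes_missing_misc(2, 1) is 1, so the check would fail. Forcing both counts to 1, as in the quoted workaround, gives one expected node and no missing device.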
