Message-ID: <20251104152351.GB452990@yaz-khff2.amd.com>
Date: Tue, 4 Nov 2025 10:23:51 -0500
From: Yazen Ghannam <yazen.ghannam@....com>
To: Michal Pecio <michal.pecio@...il.com>
Cc: x86@...nel.org, regressions@...ts.linux.dev,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Mario Limonciello <mario.limonciello@....com>,
Eric DeVolder <eric.devolder@...cle.com>,
linux-kernel@...r.kernel.org
Subject: Re: AMD topology broken on various 754/AM2+/AM3/AM3+ systems causes
NB/EDAC/GART regression since 6.14

On Mon, Nov 03, 2025 at 06:12:45PM +0100, Michal Pecio wrote:
> On Mon, 3 Nov 2025 09:38:51 -0500, Yazen Ghannam wrote:
> > > I have this AM4 system with some proprietary HP BIOS:
> > >
> > > [02Fh 0047 001h] Local Apic ID : 10
> > > [037h 0055 001h] Local Apic ID : 11
> > > [03Fh 0063 001h] Local Apic ID : 12
> > > [047h 0071 001h] Local Apic ID : 13
> > >
> > > domain: Thread shift: 0 dom_size: 1 max_threads: 1
> > > domain: Core shift: 4 dom_size: 16 max_threads: 16
> > > domain: Module shift: 4 dom_size: 1 max_threads: 16
> > > domain: Tile shift: 4 dom_size: 1 max_threads: 16
> > > domain: Die shift: 4 dom_size: 1 max_threads: 16
> > > domain: DieGrp shift: 4 dom_size: 1 max_threads: 16
> > > domain: Package shift: 4 dom_size: 1 max_threads: 16
> > >
> > > It seems that pkgid is 0x1 here, which is not a problem because
> > > it's single socket, but dunno if HP or somebody else couldn't do
> > > similar things in an 8-socket system and end up with pkgid > 8.
> > >
> >
> > So is this another bogus case?
>
> No, it isn't bogus. It's a quad-core Carrizo APU with its four LAPICs,
> but their numbers start from 0x10 rather than 0x00. And AFAIU, the
> calculated pkgid value will be 1.
>

Oh I see. That's interesting.

Though something still feels off about it, because these systems never
had 16 threads/cores.
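
For reference, the arithmetic behind the pkgid of 1 can be sketched as
below. This is a standalone illustration, not the kernel's topology
code: the helper names pkg_id_from_apicid() and pkg_apicid_span() are
hypothetical, the shift of 4 is taken from the "Package shift: 4" line
in the quoted dump, and it assumes the Local APIC IDs in the ACPI dump
are printed in hex (consistent with "start from 0x10").

```c
/*
 * Hypothetical helper, not a kernel function: derive a package ID by
 * dropping the low "package shift" bits of an APIC ID.
 */
static unsigned int pkg_id_from_apicid(unsigned int apicid,
				       unsigned int pkg_shift)
{
	return apicid >> pkg_shift;
}

/*
 * Hypothetical helper: how many APIC IDs a single package spans at a
 * given shift. A shift of 4 reserves room for 16 IDs per package.
 */
static unsigned int pkg_apicid_span(unsigned int pkg_shift)
{
	return 1u << pkg_shift;
}
```

With the dump above, APIC IDs 0x10 through 0x13 and a shift of 4 all
land in package 1, while a package whose IDs started at 0x00 would be
package 0. The span of 16 is only the size of the ID space at that
shift, not a count of threads actually present.
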
> If HP or another vendor did a similar thing on an 8-socket system,
> the assumption that (pkgid < 8) could no longer hold, even if the CPUs
> are completely real.

The proposed check only applies to older systems. Were there any
8-socket AMD systems in production? I'm only aware of systems up to 4
sockets from those days. Hence the arbitrary value check.

Now there are systems with 8 "AMD nodes". However, before Zen these
would be 2 AMD nodes per socket. With 4 sockets you'd get 8 AMD nodes.
Naples had 4 AMD nodes per socket, but the max system had 2 sockets.

But I think I see your point that this check would break if the pkgid
values can start from an arbitrary number. If such a case is reported,
then this could be a new quirk or rework of the existing quirks.
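
To make the failure mode concrete, here is a minimal sketch of how a
non-zero starting pkgid interacts with a (pkgid < 8) check. It is not
the proposed patch: LEGACY_MAX_PKGID, legacy_pkgid_plausible() and
offset_pkgid() are hypothetical names, the limit of 8 is taken from the
condition quoted above, and it assumes firmware numbers packages
consecutively from some first value.

```c
#include <stdbool.h>

/* Assumed limit, taken from the "(pkgid < 8)" condition in this thread. */
#define LEGACY_MAX_PKGID 8u

/*
 * Hypothetical predicate: true if a package ID looks plausible for an
 * older (pre-Zen) system under the assumed limit.
 */
static bool legacy_pkgid_plausible(unsigned int pkgid)
{
	return pkgid < LEGACY_MAX_PKGID;
}

/*
 * Hypothetical model of firmware numbering: package ID of socket
 * 'socket' (counted from 0) when numbering starts at 'first_pkgid'.
 */
static unsigned int offset_pkgid(unsigned int socket,
				 unsigned int first_pkgid)
{
	return first_pkgid + socket;
}
```

With numbering that starts at 1, a 4-socket system tops out at pkgid 4
and still passes, whereas the last socket of a hypothetical 8-socket
system would get pkgid 8 and be rejected even though the CPU is real.
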
Thanks,
Yazen