linux-kernel - Re: [PATCH v2 00/16] AMD NB and SMN rework

Message-ID: <20250106163104.GA664169@yaz-khff2.amd.com>
Date: Mon, 6 Jan 2025 11:31:04 -0500
From: Yazen Ghannam <yazen.ghannam@....com>
To: Borislav Petkov <bp@...en8.de>
Cc: x86@...nel.org, Tony Luck <tony.luck@...el.com>,
	Mario Limonciello <mario.limonciello@....com>,
	Bjorn Helgaas <bhelgaas@...gle.com>,
	Jean Delvare <jdelvare@...e.com>,
	Guenter Roeck <linux@...ck-us.net>,
	Clemens Ladisch <clemens@...isch.de>,
	Shyam Sundar S K <Shyam-sundar.S-k@....com>,
	Hans de Goede <hdegoede@...hat.com>,
	Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
	Naveen Krishna Chatradhi <naveenkrishna.chatradhi@....com>,
	Suma Hegde <suma.hegde@....com>, linux-kernel@...r.kernel.org,
	linux-edac@...r.kernel.org, linux-pci@...r.kernel.org,
	linux-hwmon@...r.kernel.org, platform-driver-x86@...r.kernel.org
Subject: Re: [PATCH v2 00/16] AMD NB and SMN rework

On Mon, Jan 06, 2025 at 10:38:45AM -0500, Yazen Ghannam wrote:
> On Fri, Jan 03, 2025 at 10:49:25PM +0100, Borislav Petkov wrote:
> > On Fri, Dec 06, 2024 at 04:11:53PM +0000, Yazen Ghannam wrote:
> > > Hi all,
> > > 
> > > The theme of this set is decoupling the "AMD node" concept from the
> > > legacy northbridge support.
> > > 
> > > Additionally, AMD System Management Network (SMN) access code is
> > > decoupled and expanded too.
> > > 
> > > Patches 1-3 begin reducing the scope of AMD_NB.
> > > 
> > > Patches 4-9 begin moving generic AMD node support out of AMD_NB.
> > > 
> > > Patches 10-13 move SMN support out of AMD_NB and do some refactoring.
> > > 
> > > Patch 14 has HSMP reuse SMN functionality.
> > > 
> > > Patches 15-16 address userspace access to SMN.
> > 
> > So I took the first patch. Then, booting the first 13 with the intention of
> > queueing them while the remaining three are still being discussed causes the
> > splat below in my guest.
> > 
> > .config is attached, I've pushed the branch here too, if you wanna test with
> > it:
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=tip-x86-misc
> > 
> > [    0.897060] cirrus 0000:00:01.0: [drm] fb0: cirrusdrmfb frame buffer device
> > [    0.900310] BUG: kernel NULL pointer dereference, address: 00000000000000c4
> > [    0.902551] #PF: supervisor read access in kernel mode
> > [    0.904096] #PF: error_code(0x0000) - not-present page
> > [    0.904268] PGD 0 P4D 0 
> > [    0.904268] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [    0.904268] CPU: 0 UID: 0 PID: 20 Comm: cpuhp/0 Not tainted 6.13.0-rc1+ #1
> > [    0.904268] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2023.11-8 02/21/2024
> > [    0.904268] RIP: 0010:pci_read_config_dword+0x9/0x40
> > [    0.904268] Code: 00 00 e9 8a f9 57 00 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 <8b> 87 c4 00 00 00 48 89 d1 83 f8 03 74 10 8b 47 38 48 8b 7f 10 89
> > [    0.904268] RSP: 0018:ffffc9000012fcd8 EFLAGS: 00010246
> > [    0.904268] RAX: 0000000000000000 RBX: ffff88800d296640 RCX: 000000000000003f
> > [    0.904268] RDX: ffffc9000012fce4 RSI: 00000000000001c4 RDI: 0000000000000000
> > [    0.904268] RBP: ffffc9000012fd60 R08: 0000000000000040 R09: 0000000000000010
> > [    0.904268] R10: ffff88800daa1eb0 R11: fffffffffff8dc6f R12: 0000000040000163
> > [    0.904268] R13: ffffc9000012fd60 R14: 0000000000000000 R15: ffff88807d62fc90
> > [    0.904268] FS:  0000000000000000(0000) GS:ffff88807d600000(0000) knlGS:0000000000000000
> > [    0.904268] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [    0.904268] CR2: 00000000000000c4 CR3: 0000000002c1a000 CR4: 00000000003506f0
> > [    0.904268] Call Trace:
> > [    0.904268]  <TASK>
> > [    0.904268]  ? __die+0x31/0x80
> > [    0.904268]  ? page_fault_oops+0x15d/0x4f0
> > [    0.904268]  ? srso_return_thunk+0x5/0x5f
> > [    0.904268]  ? ttwu_queue_wakelist+0xf7/0x100
> > [    0.904268]  ? exc_page_fault+0x78/0x150
> > [    0.904268]  ? asm_exc_page_fault+0x26/0x30
> > [    0.904268]  ? pci_read_config_dword+0x9/0x40
> > [    0.904268]  ? srso_return_thunk+0x5/0x5f
> > [    0.904268]  amd_init_l3_cache.part.0+0x6a/0x110
> > [    0.904268]  cpuid4_cache_lookup_regs+0xcf/0x2a0
> > [    0.904268]  populate_cache_leaves+0x6f/0x530
> > [    0.904268]  ? srso_return_thunk+0x5/0x5f
> > [    0.904268]  ? dl_server_stop+0x2f/0x40
> > [    0.904268]  ? srso_return_thunk+0x5/0x5f
> > [    0.904268]  detect_cache_attributes+0x97/0x330
> > [    0.904268]  ? __pfx_cacheinfo_cpu_online+0x10/0x10
> > [    0.904268]  cacheinfo_cpu_online+0x22/0x250
> > [    0.904268]  ? srso_return_thunk+0x5/0x5f
> > [    0.904268]  ? __pfx_cacheinfo_cpu_online+0x10/0x10
> > [    0.904268]  cpuhp_invoke_callback+0x10f/0x480
> > [    0.904268]  ? try_to_wake_up+0x23b/0x540
> > [    0.904268]  cpuhp_thread_fun+0xd4/0x160
> > [    0.904268]  smpboot_thread_fn+0xdd/0x1f0
> > [    0.904268]  ? __pfx_smpboot_thread_fn+0x10/0x10
> > [    0.904268]  kthread+0xca/0xf0
> > [    0.904268]  ? __pfx_kthread+0x10/0x10
> > [    0.904268]  ret_from_fork+0x50/0x60
> > [    0.904268]  ? __pfx_kthread+0x10/0x10
> > [    0.904268]  ret_from_fork_asm+0x1a/0x30
> > [    0.904268]  </TASK>
> > [    0.904268] Modules linked in:
> > [    0.904268] CR2: 00000000000000c4
> > [    0.904268] ---[ end trace 0000000000000000 ]---
> > [    0.904268] RIP: 0010:pci_read_config_dword+0x9/0x40
> > [    0.904268] Code: 00 00 e9 8a f9 57 00 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 <8b> 87 c4 00 00 00 48 89 d1 83 f8 03 74 10 8b 47 38 48 8b 7f 10 89
> > [    0.988792] RSP: 0018:ffffc9000012fcd8 EFLAGS: 00010246
> > [    0.988792] RAX: 0000000000000000 RBX: ffff88800d296640 RCX: 000000000000003f
> > [    0.988792] RDX: ffffc9000012fce4 RSI: 00000000000001c4 RDI: 0000000000000000
> > [    0.988792] RBP: ffffc9000012fd60 R08: 0000000000000040 R09: 0000000000000010
> > [    0.992761] R10: ffff88800daa1eb0 R11: fffffffffff8dc6f R12: 0000000040000163
> > [    0.992761] R13: ffffc9000012fd60 R14: 0000000000000000 R15: ffff88807d62fc90
> > [    0.992761] FS:  0000000000000000(0000) GS:ffff88807d600000(0000) knlGS:0000000000000000
> > [    0.996772] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [    0.996772] CR2: 00000000000000c4 CR3: 0000000002c1a000 CR4: 00000000003506f0
> > [    0.996772] note: cpuhp/0[20] exited with irqs disabled
> > [    1.680874] tsc: Refined TSC clocksource calibration: 3700.028 MHz
> > [    1.683128] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x6aaae08e541, max_idle_ns: 881590514464 ns
> > [    1.688137] clocksource: Switched to clocksource tsc
> > 
> > 
> 
> Can you please share the guest parameters?
> 

I was able to reproduce it. The patch below seems to fix the issue.

There's a comment in the function that this code is not for virtualized
environments. Also, the "L3 in Northbridge" design doesn't apply to Zen
systems.

I'll keep looking at this to get a better understanding. My first
thought is that this was silently handled before, because the AMD_NB
code operated on PCI IDs. Those devices wouldn't be exposed to guests,
so the northbridge data structures wouldn't be initialized.

Specifically, I think we now end up with a non-zero number of
northbridges, because the count comes from the topology info rather
than from counting PCI devices.
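
As a side note on the quoted oops, the register dump is consistent with
a NULL device pointer reaching pci_read_config_dword(): RDI is 0, and
the faulting bytes <8b> 87 c4 00 00 00 decode to a 32-bit load from
0xc4(%rdi), which matches CR2 = 0xc4. A minimal userspace sketch of
that arithmetic, assuming a made-up structure (fake_dev and
fault_address() are hypothetical and do not reflect the real
struct pci_dev layout):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a device structure. This is NOT the real
 * struct pci_dev; the only property modelled is a 32-bit field that
 * sits at byte offset 0xc4 from the start of the structure.
 */
struct fake_dev {
	unsigned char pad[0xc4];	/* filler up to the field of interest */
	uint32_t field_at_c4;		/* field read by the faulting load */
};

/*
 * Address that a load of dev->field_at_c4 would touch for a given base
 * pointer value, computed without dereferencing anything.
 */
static uintptr_t fault_address(uintptr_t dev_base)
{
	return dev_base + offsetof(struct fake_dev, field_at_c4);
}
```

With a base of 0 (a NULL pointer), fault_address() returns 0xc4, the
same value reported in CR2.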

In any case, I think it's better to have explicit checks.
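
To illustrate the shape of those checks outside the kernel, here is a
minimal userspace sketch. The struct cpu_model and should_init_nb_l3()
names are hypothetical stand-ins for the cpu_feature_enabled() tests in
the patch below; this is a model of the control flow, not kernel code:

```c
#include <stdbool.h>

/* Hypothetical model of the two CPU properties the patch tests. */
struct cpu_model {
	bool hypervisor;	/* stands in for X86_FEATURE_HYPERVISOR */
	bool zen;		/* stands in for X86_FEATURE_ZEN */
};

/*
 * Returns true only when the legacy "L3 in northbridge" setup should
 * run. Each condition bails out early, in the same order as the patch.
 */
static bool should_init_nb_l3(const struct cpu_model *cpu, int index)
{
	if (index < 3)		/* only cache leaf index 3 and up is L3 */
		return false;

	if (cpu->hypervisor)	/* code is not meant for guests */
		return false;

	if (cpu->zen)		/* design doesn't apply to Zen systems */
		return false;

	return true;
}
```

The point of the early returns is that the northbridge lookup is never
attempted on systems where it can't be valid, instead of relying on the
lookup happening to return nothing.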

Thanks,
Yazen

diff --git a/arch/x86/kernel/cpu/cacheinfo.c b/arch/x86/kernel/cpu/cacheinfo.c
index 392d09c936d6..93d993a6a1df 100644
--- a/arch/x86/kernel/cpu/cacheinfo.c
+++ b/arch/x86/kernel/cpu/cacheinfo.c
@@ -595,6 +595,12 @@ static void amd_init_l3_cache(struct _cpuid4_info_regs *this_leaf, int index)
 	if (index < 3)
 		return;
 
+	if (cpu_feature_enabled(X86_FEATURE_HYPERVISOR))
+		return;
+
+	if (cpu_feature_enabled(X86_FEATURE_ZEN))
+		return;
+
 	node = topology_amd_node_id(smp_processor_id());
 	this_leaf->nb = node_to_amd_nb(node);
 	if (this_leaf->nb && !this_leaf->nb->l3_cache.indices)