Message-ID: <5163ce35-f843-41a3-abfc-5af91b7c68bc@intel.com>
Date: Tue, 14 Oct 2025 16:09:43 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: "Moger, Babu" <bmoger@....com>, <babu.moger@....com>,
<tony.luck@...el.com>, <Dave.Martin@....com>, <james.morse@....com>,
<dave.hansen@...ux.intel.com>, <bp@...en8.de>
CC: <kas@...nel.org>, <rick.p.edgecombe@...el.com>,
<linux-kernel@...r.kernel.org>, <x86@...nel.org>,
<linux-coco@...ts.linux.dev>, <kvm@...r.kernel.org>
Subject: Re: [PATCH] fs/resctrl: Fix MBM events being unconditionally enabled
in mbm_event mode
Hi Babu,
On 10/14/25 3:45 PM, Moger, Babu wrote:
> On 10/14/2025 3:57 PM, Reinette Chatre wrote:
>> On 10/14/25 10:43 AM, Babu Moger wrote:
>>>> Yes. I saw the issues. It fails to mount in my case with panic trace.
>>
>> (Just to ensure that there is not anything else going on) Could you please confirm if the panic is from
>> mon_add_all_files()->mon_event_read()->mon_event_count()->__mon_event_count()->resctrl_arch_reset_rmid()
>> that creates the MBM event files during mount and then does the initial read of RMID to determine the
>> starting count?
>
> It happens just before that (at mbm_cntr_get). We have not allocated d->cntr_cfg for the counters.
> ===================Panic trace =================================
>
> [ 349.330416] BUG: kernel NULL pointer dereference, address: 0000000000000008
> [ 349.338187] #PF: supervisor read access in kernel mode
> [ 349.343914] #PF: error_code(0x0000) - not-present page
> [ 349.349644] PGD 10419f067 P4D 0
> [ 349.353241] Oops: Oops: 0000 [#1] SMP NOPTI
> [ 349.357905] CPU: 45 UID: 0 PID: 3449 Comm: mount Not tainted 6.18.0-rc1+ #120 PREEMPT(voluntary)
> [ 349.367803] Hardware name: AMD Corporation PURICO/PURICO, BIOS RPUT1003E 12/11/2024
> [ 349.376334] RIP: 0010:mbm_cntr_get+0x56/0x90
> [ 349.381096] Code: 45 8d 41 fe 83 f8 01 77 3d 8b 7b 50 85 ff 7e 36 49 8b 84 24 f0 04 00 00 45 31 c0 eb 0d 41 83 c0 01 48 83 c0 10 44 39 c7 74 1c <48> 3b 50 08 75 ed 3b 08 75 e9 48 83 c4 10 44 89 c0 5b 41 5c 41 5d
> [ 349.402037] RSP: 0018:ff56bba58655f958 EFLAGS: 00010246
> [ 349.407861] RAX: 0000000000000000 RBX: ffffffff9525b900 RCX: 0000000000000002
> [ 349.415818] RDX: ffffffff95d526a0 RSI: ff1f5d52517c1800 RDI: 0000000000000020
> [ 349.423774] RBP: ff56bba58655f980 R08: 0000000000000000 R09: 0000000000000001
> [ 349.431730] R10: ff1f5d52c616a6f0 R11: fffc6a2f046c3980 R12: ff1f5d52517c1800
> [ 349.439687] R13: 0000000000000001 R14: ffffffff95d526a0 R15: ffffffff9525b968
> [ 349.447635] FS: 00007f17926b7800(0000) GS:ff1f5d59d45ff000(0000) knlGS:0000000000000000
> [ 349.456659] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 349.463064] CR2: 0000000000000008 CR3: 0000000147afe002 CR4: 0000000000771ef0
> [ 349.471022] PKRU: 55555554
> [ 349.474033] Call Trace:
> [ 349.476755] <TASK>
> [ 349.479091] ? kernfs_add_one+0x114/0x170
> [ 349.483560] rdtgroup_assign_cntr_event+0x9b/0xd0
> [ 349.488795] rdtgroup_assign_cntrs+0xab/0xb0
> [ 349.493553] rdt_get_tree+0x4be/0x770
> [ 349.497623] vfs_get_tree+0x2e/0xf0
> [ 349.501508] fc_mount+0x18/0x90
> [ 349.505007] path_mount+0x360/0xc50
> [ 349.508884] ? putname+0x68/0x80
> [ 349.512479] __x64_sys_mount+0x124/0x150
> [ 349.516848] x64_sys_call+0x2133/0x2190
> [ 349.521123] do_syscall_64+0x74/0x970
>
> ==================================================================
Thank you for capturing this. This is a different trace, but it confirms the same root
cause: the event is enabled only after domain online has run, so the state the event
depends on was not allocated when the domain came online.
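
To illustrate the ordering problem, here is a minimal user-space sketch. It is not
the resctrl code: struct cntr_cfg_model, struct domain_model, domain_online(),
domain_offline() and cntr_lookup() are hypothetical, simplified stand-ins, and the
return codes are made up for the example. The trace shows RAX = 0 and a fault at
address 0x8 in mbm_cntr_get(), which is consistent with reading a field at a small
offset from a NULL array base. The sketch assumes that array is the per-domain
counter state (d->cntr_cfg in the quoted report).

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's definitions. */
struct cntr_cfg_model {
	int evtid;         /* which event the counter tracks */
	const void *group; /* which group owns the counter */
};

struct domain_model {
	struct cntr_cfg_model *cntr_cfg; /* stays NULL if never allocated */
	int num_cntrs;
};

/*
 * Models domain online: the per-counter state is allocated only if the
 * event is already marked enabled at this point (assumed ordering).
 */
static int domain_online(struct domain_model *d, int num_cntrs, int event_enabled)
{
	d->num_cntrs = num_cntrs;
	d->cntr_cfg = NULL;
	if (!event_enabled)
		return 0;
	d->cntr_cfg = calloc((size_t)num_cntrs, sizeof(*d->cntr_cfg));
	return d->cntr_cfg ? 0 : -1;
}

/*
 * Models the counter lookup. The NULL check exists only so that the sketch
 * reports the missing state instead of faulting.
 * Returns the counter index, -1 if no counter matches, -2 if the state
 * was never allocated.
 */
static int cntr_lookup(const struct domain_model *d, const void *group, int evtid)
{
	int i;

	if (!d->cntr_cfg)
		return -2;
	for (i = 0; i < d->num_cntrs; i++) {
		if (d->cntr_cfg[i].group == group && d->cntr_cfg[i].evtid == evtid)
			return i;
	}
	return -1;
}

/* Models domain offline: release the per-counter state. */
static void domain_offline(struct domain_model *d)
{
	free(d->cntr_cfg);
	d->cntr_cfg = NULL;
}
```

Calling domain_online() with event_enabled == 0 and then cntr_lookup() models the
failing order (the lookup finds no state to walk). Calling domain_online() with
event_enabled == 1 first models the order in which the state exists before any
lookup.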
Reinette