Message-ID: <5645dec8-e344-44d3-82f7-327259a53906@intel.com>
Date: Wed, 15 Oct 2025 12:56:15 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: "Moger, Babu" <bmoger@....com>, <babu.moger@....com>,
<tony.luck@...el.com>, <Dave.Martin@....com>, <james.morse@....com>,
<dave.hansen@...ux.intel.com>, <bp@...en8.de>
CC: <kas@...nel.org>, <rick.p.edgecombe@...el.com>,
<linux-kernel@...r.kernel.org>, <x86@...nel.org>,
<linux-coco@...ts.linux.dev>, <kvm@...r.kernel.org>
Subject: Re: [PATCH] fs/resctrl: Fix MBM events being unconditionally enabled
in mbm_event mode

Hi Babu,

On 10/15/25 7:55 AM, Moger, Babu wrote:
> Hi Reinette,
>
> On 10/14/2025 6:09 PM, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 10/14/25 3:45 PM, Moger, Babu wrote:
>>> On 10/14/2025 3:57 PM, Reinette Chatre wrote:
>>>> On 10/14/25 10:43 AM, Babu Moger wrote:
>>
>>
>>>>>> Yes. I saw the issues. It fails to mount in my case with panic trace.
>>>>
>>>> (Just to ensure that there is not anything else going on) Could you please confirm if the panic is from
>>>> mon_add_all_files()->mon_event_read()->mon_event_count()->__mon_event_count()->resctrl_arch_reset_rmid()
>>>> that creates the MBM event files during mount and then does the initial read of RMID to determine the
>>>> starting count?
>>>
>>> It happens just before that (at mbm_cntr_get). We have not allocated d->cntr_cfg for the counters.
>>> ===================Panic trace =================================
>>>
>>> [ 349.330416] BUG: kernel NULL pointer dereference, address: 0000000000000008
>>> [ 349.338187] #PF: supervisor read access in kernel mode
>>> [ 349.343914] #PF: error_code(0x0000) - not-present page
>>> [ 349.349644] PGD 10419f067 P4D 0
>>> [ 349.353241] Oops: Oops: 0000 [#1] SMP NOPTI
>>> [ 349.357905] CPU: 45 UID: 0 PID: 3449 Comm: mount Not tainted 6.18.0-rc1+ #120 PREEMPT(voluntary)
>>> [ 349.367803] Hardware name: AMD Corporation PURICO/PURICO, BIOS RPUT1003E 12/11/2024
>>> [ 349.376334] RIP: 0010:mbm_cntr_get+0x56/0x90
>>> [ 349.381096] Code: 45 8d 41 fe 83 f8 01 77 3d 8b 7b 50 85 ff 7e 36 49 8b 84 24 f0 04 00 00 45 31 c0 eb 0d 41 83 c0 01 48 83 c0 10 44 39 c7 74 1c <48> 3b 50 08 75 ed 3b 08 75 e9 48 83 c4 10 44 89 c0 5b 41 5c 41 5d
>>> [ 349.402037] RSP: 0018:ff56bba58655f958 EFLAGS: 00010246
>>> [ 349.407861] RAX: 0000000000000000 RBX: ffffffff9525b900 RCX: 0000000000000002
>>> [ 349.415818] RDX: ffffffff95d526a0 RSI: ff1f5d52517c1800 RDI: 0000000000000020
>>> [ 349.423774] RBP: ff56bba58655f980 R08: 0000000000000000 R09: 0000000000000001
>>> [ 349.431730] R10: ff1f5d52c616a6f0 R11: fffc6a2f046c3980 R12: ff1f5d52517c1800
>>> [ 349.439687] R13: 0000000000000001 R14: ffffffff95d526a0 R15: ffffffff9525b968
>>> [ 349.447635] FS: 00007f17926b7800(0000) GS:ff1f5d59d45ff000(0000) knlGS:0000000000000000
>>> [ 349.456659] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [ 349.463064] CR2: 0000000000000008 CR3: 0000000147afe002 CR4: 0000000000771ef0
>>> [ 349.471022] PKRU: 55555554
>>> [ 349.474033] Call Trace:
>>> [ 349.476755] <TASK>
>>> [ 349.479091] ? kernfs_add_one+0x114/0x170
>>> [ 349.483560] rdtgroup_assign_cntr_event+0x9b/0xd0
>>> [ 349.488795] rdtgroup_assign_cntrs+0xab/0xb0
>>> [ 349.493553] rdt_get_tree+0x4be/0x770
>>> [ 349.497623] vfs_get_tree+0x2e/0xf0
>>> [ 349.501508] fc_mount+0x18/0x90
>>> [ 349.505007] path_mount+0x360/0xc50
>>> [ 349.508884] ? putname+0x68/0x80
>>> [ 349.512479] __x64_sys_mount+0x124/0x150
>>> [ 349.516848] x64_sys_call+0x2133/0x2190
>>> [ 349.521123] do_syscall_64+0x74/0x970
>>>
>>> ==================================================================
>>
>> Thank you for capturing this. This is a different trace but it confirms that it is the
>> same root cause. Specifically, event is enabled after the state it depends on is (not) allocated
>> during domain online.
>>
>
> Yes. Thanks
>
> Here is the changelog.
>
> x86,fs/resctrl: Fix BUG with mbm_event mode when MBM events are disabled
>
> The following BUG is encountered when mounting the resctrl filesystem after booting a system with X86_FEATURE_ABMC support and the kernel parameter 'rdt=!mbmtotal,!mbmlocal'.

"booting a system with X86_FEATURE_ABMC" sounds like this is a feature enabled
during boot?

>
> ===========================================================================
> [ 349.330416] BUG: kernel NULL pointer dereference, address: 0000000000000008
> [ 349.338187] #PF: supervisor read access in kernel mode
> [ 349.343914] #PF: error_code(0x0000) - not-present page
> [ 349.349644] PGD 10419f067 P4D 0
> [ 349.353241] Oops: Oops: 0000 [#1] SMP NOPTI
> [ 349.357905] CPU: 45 UID: 0 PID: 3449 Comm: mount Not tainted
> 6.18.0-rc1+ #120 PREEMPT(voluntary)
> [ 349.367803] Hardware name: AMD Corporation

This backtrace needs to be trimmed. See "Backtraces in commit messages" in
Documentation/process/submitting-patches.rst.

> [ 349.376334] RIP: 0010:mbm_cntr_get+0x56/0x90
> [ 349.381096] Code: 45 8d 41 fe 83 f8 01 77 3d 8b 7b 50 85 ff 7e 36 49 8b 84 24 f0 04 00 00 45 31 c0 eb 0d 41 83 c0 01 48 83 c0 10 44 39 c7 74 1c <48> 3b 50 08 75 ed 3b 08 75 e9 48 83 c4 10 44 89 c0 5b 41 5c 41 5d
> [ 349.402037] RSP: 0018:ff56bba58655f958 EFLAGS: 00010246
> [ 349.407861] RAX: 0000000000000000 RBX: ffffffff9525b900 RCX: 0000000000000002
> [ 349.415818] RDX: ffffffff95d526a0 RSI: ff1f5d52517c1800 RDI: 0000000000000020
> [ 349.423774] RBP: ff56bba58655f980 R08: 0000000000000000 R09: 0000000000000001
> [ 349.431730] R10: ff1f5d52c616a6f0 R11: fffc6a2f046c3980 R12: ff1f5d52517c1800
> [ 349.439687] R13: 0000000000000001 R14: ffffffff95d526a0 R15: ffffffff9525b968
> [ 349.447635] FS: 00007f17926b7800(0000) GS:ff1f5d59d45ff000(0000)
> knlGS:0000000000000000
> [ 349.456659] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 349.463064] CR2: 0000000000000008 CR3: 0000000147afe002 CR4: 0000000000771ef0
> [ 349.471022] PKRU: 55555554
> [ 349.474033] Call Trace:
> [ 349.476755] <TASK>
> [ 349.479091] ? kernfs_add_one+0x114/0x170
> [ 349.483560] rdtgroup_assign_cntr_event+0x9b/0xd0
> [ 349.488795] rdtgroup_assign_cntrs+0xab/0xb0
> [ 349.493553] rdt_get_tree+0x4be/0x770
> [ 349.497623] vfs_get_tree+0x2e/0xf0
> [ 349.501508] fc_mount+0x18/0x90
> [ 349.505007] path_mount+0x360/0xc50
> [ 349.508884] ? putname+0x68/0x80
> [ 349.512479] __x64_sys_mount+0x124/0x150
>
> When mbm_event mode is enabled, it implicitly enables both MBM total and
> local events. However, specifying the kernel parameter
> "rdt=!mbmtotal,!mbmlocal" disables these events during resctrl initialization. As a result, related data structures, such as rdt_mon_domain::mbm_states, cntr_cfg, and rdt_hw_mon_domain::arch_mbm_states are not allocated. This

This may be a bit confusing with the jumps from "enabled" to "disabled" without noting the
contexts (arch vs fs, early init vs late init).

> leads to a BUG when the user attempts to mount the resctrl filesystem,
> which tries to access these un-allocated structures.
>
>
> Fix the issue by adding a dependency on X86_FEATURE_CQM_MBM_TOTAL and
> X86_FEATURE_CQM_MBM_LOCAL for X86_FEATURE_ABMC to be enabled. This is
> acceptable for now, as X86_FEATURE_ABMC currently implies support for MBM total and local events. However, this dependency should be revisited and removed in the future to decouple feature handling more cleanly.

If I understand correctly, the fix for the NULL pointer access is to remove
the late event enabling from resctrl fs. The new dependency fixes a related but different
issue: it limits the scenarios in which mbm_event mode is enabled and when it may be possible
to switch between modes.

I think the changelog can be made more specific with some adjustments. Here is an attempt
at doing so, but I think it can still be improved for flow.

x86,fs/resctrl: Fix NULL pointer dereference when events force disabled while in mbm_event mode

The following NULL pointer dereference is encountered on mount of resctrl fs after booting
a system that supports assignable counters with the "rdt=!mbmtotal,!mbmlocal" kernel parameter:

  BUG: kernel NULL pointer dereference, address: 0000000000000008
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  RIP: 0010:mbm_cntr_get
  Call Trace:
   rdtgroup_assign_cntr_event
   rdtgroup_assign_cntrs
   rdt_get_tree

Specifying the kernel parameter "rdt=!mbmtotal,!mbmlocal" effectively disables the legacy
X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features and thus the MBM events
they represent. As a result, the per-domain MBM event related data structures are not
allocated during resctrl early initialization.

resctrl fs initialization follows by implicitly enabling both MBM total and local
events on a system that supports assignable counters (mbm_event mode), but this enabling
occurs after the per-domain data structures have been created.

During runtime resctrl fs assumes that an enabled event can access all its state.
This results in a NULL pointer dereference when resctrl attempts to access the
un-allocated structures of an enabled event.

Remove the late MBM event enabling from resctrl fs.

This leaves a problem where the X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL
features may be disabled while assignable counter (mbm_event) mode is enabled without
any events to support. Switching between the "default" and "mbm_event" mode without
any events is not practical.

Create a dependency between the X86_FEATURE_CQM_MBM_TOTAL/X86_FEATURE_CQM_MBM_LOCAL
and X86_FEATURE_ABMC (assignable counter) hardware features. An x86 system that supports
assignable counters now requires support of X86_FEATURE_CQM_MBM_TOTAL or X86_FEATURE_CQM_MBM_LOCAL.
This ensures all needed MBM related data structures are created before use and that it is
only possible to switch between "default" and "mbm_event" mode when the same events are
available in both modes. This dependency does not exist in the hardware, but this usage of
these feature settings works for known systems.
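
For readers following the trace: the failure mode described in the changelog, an enabled event whose per-domain counter configuration was never allocated, can be modeled in a few lines of userspace C. This is only a sketch under stated assumptions. `struct group`, `struct cntr_cfg`, `struct domain` and `cntr_get()` are hypothetical stand-ins that echo names from the thread, not the kernel's definitions, and the NULL check exists only so that the model is safe to run. The fix discussed in this thread is to remove the late event enabling, not to add such a check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for this model only; not kernel definitions. */
struct group { int id; };

struct cntr_cfg {
	int evtid;          /* event the counter is assigned to */
	struct group *grp;  /* owning group; at offset 8 on a typical LP64 ABI */
};

struct domain {
	struct cntr_cfg *cntr_cfg; /* stays NULL if events were disabled at domain online */
	int num_cntrs;
};

/* Return the index of the counter assigned to (grp, evtid), or -1 if none. */
static int cntr_get(const struct domain *d, const struct group *grp, int evtid)
{
	assert(d != NULL);

	/*
	 * Model-only guard. Without it the loop below would read
	 * cntr_cfg[0].grp through a NULL base pointer, which is
	 * consistent with the faulting address 0x8 in the trace.
	 */
	if (d->cntr_cfg == NULL)
		return -1;

	for (int i = 0; i < d->num_cntrs; i++) {
		if (d->cntr_cfg[i].grp == grp && d->cntr_cfg[i].evtid == evtid)
			return i;
	}
	return -1;
}
```

Calling `cntr_get()` on a domain whose `cntr_cfg` is NULL while `num_cntrs` is non-zero corresponds to the "event enabled, state not allocated" case: the caller believes counters exist, but there is no array behind them.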
>
> Fixes: 13390861b426e ("x86,fs/resctrl: Detect Assignable Bandwidth Monitoring feature details")
> Co-developed-by: Reinette Chatre <reinette.chatre@...el.com>
> Signed-off-by: Reinette Chatre <reinette.chatre@...el.com>
> Signed-off-by: Babu Moger <babu.moger@....com>
>
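
The effect of the proposed feature dependency can be illustrated in the same way. The sketch below is a small userspace model of a "feature depends on feature" table where clearing a feature also clears everything that depends on it. The enum values, `deps[]`, `clear_feature()` and `has_feature()` are hypothetical names for illustration and do not reproduce the kernel's mechanism. The model also assumes the stricter reading in which assignable counters are cleared when either legacy MBM event feature is cleared; the wording in this thread ("and" in one place, "or" in another) leaves that detail open.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature bits for this model only. */
enum feature { F_MBM_TOTAL, F_MBM_LOCAL, F_ABMC, F_COUNT };

struct dep {
	enum feature feature; /* the dependent feature */
	enum feature depends; /* the feature it requires */
};

/* Assumption: assignable counters require both legacy MBM event features. */
static const struct dep deps[] = {
	{ F_ABMC, F_MBM_TOTAL },
	{ F_ABMC, F_MBM_LOCAL },
};

/* Clear feature f in mask and, transitively, every feature depending on it. */
static unsigned int clear_feature(unsigned int mask, enum feature f)
{
	assert(f < F_COUNT);
	mask &= ~(1u << f);

	for (size_t i = 0; i < sizeof(deps) / sizeof(deps[0]); i++) {
		/* Recurse only while the dependent is still set, so this terminates. */
		if (deps[i].depends == f && (mask & (1u << deps[i].feature)))
			mask = clear_feature(mask, deps[i].feature);
	}
	return mask;
}

/* Return 1 if feature f is set in mask, 0 otherwise. */
static int has_feature(unsigned int mask, enum feature f)
{
	return (mask >> f) & 1u;
}
```

In this model, clearing both `F_MBM_TOTAL` and `F_MBM_LOCAL` (the analogue of "rdt=!mbmtotal,!mbmlocal") also clears `F_ABMC` under either reading, so a mode without any events to support is never entered.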
Reinette