Message-ID: <b86dca12-bccc-46b1-8466-998357deae69@amd.com>
Date: Mon, 6 Oct 2025 15:38:59 -0500
From: "Moger, Babu" <babu.moger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, tony.luck@...el.com,
Dave.Martin@....com, james.morse@....com, dave.hansen@...ux.intel.com,
bp@...en8.de
Cc: kas@...nel.org, rick.p.edgecombe@...el.com, linux-kernel@...r.kernel.org,
x86@...nel.org, linux-coco@...ts.linux.dev, kvm@...r.kernel.org
Subject: Re: [PATCH] fs/resctrl: Fix MBM events being unconditionally enabled in mbm_event mode
Hi Reinette,
On 10/6/25 12:56, Reinette Chatre wrote:
> Hi Babu,
>
> On 9/30/25 1:26 PM, Babu Moger wrote:
>> resctrl features can be enabled or disabled using boot-time kernel
>> parameters. To turn off the memory bandwidth events (mbmtotal and
>> mbmlocal), users need to pass the following parameter to the kernel:
>> "rdt=!mbmtotal,!mbmlocal".
>
> ah, indeed ... although, the intention behind the mbmtotal and mbmlocal kernel
> parameters was to connect them to the actual hardware features identified
> by X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL respectively.
>
>
>> Found that memory bandwidth events (mbmtotal and mbmlocal) cannot be
>> disabled when mbm_event mode is enabled. resctrl_mon_resource_init()
>> unconditionally enables these events without checking if the underlying
>> hardware supports them.
>
> Technically this is correct since if hardware supports ABMC then the
> hardware is no longer required to support X86_FEATURE_CQM_MBM_TOTAL and
> X86_FEATURE_CQM_MBM_LOCAL in order to provide mbm_total_bytes
> and mbm_local_bytes.
>
> I can see how this may be confusing to user space though ...
>
>>
>> Remove the unconditional enablement of MBM features in
>> resctrl_mon_resource_init() to fix the problem. The hardware support
>> verification is already done in get_rdt_mon_resources().
>
> I believe by "hardware support" you mean hardware support for
> X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL. Wouldn't a fix like
> this then require any system that supports ABMC to also support
> X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL to be able to
> support mbm_total_bytes and mbm_local_bytes?
Yes, that is correct. Right now, ABMC and X86_FEATURE_CQM_MBM_TOTAL/
X86_FEATURE_CQM_MBM_LOCAL are tightly coupled. We have not clearly
separated them.
>
> This problem seems to be similar to the one solved by [1] since
> by supporting ABMC there is no "hardware does not support mbmtotal/mbmlocal"
> but instead there only needs to be a check if the feature has been disabled
> by command line. That is, add a rdt_is_feature_enabled() check to the
> existing "!resctrl_is_mon_event_enabled()" check?
The enable/disable decision needs to be made in get_rdt_mon_resources(). It
has to happen early in initialization, before domain_add_cpu() is called,
because that is where the event data structures (mbm_states,
arch_mbm_states) are allocated.
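
To illustrate the ordering constraint, here is a rough user-space sketch. It
is not kernel code: every sketch_* name is hypothetical, and the
"hw_supported && !cmdline_disabled" rule only models the current coupling
described above, not a settled design. The point is that the enabled flag
must be final before per-domain state is allocated, since allocation is
skipped for disabled events.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these are real resctrl symbols. */
enum sketch_event { SKETCH_MBM_TOTAL, SKETCH_MBM_LOCAL, SKETCH_NUM_EVENTS };

struct sketch_event_cfg {
	bool hw_supported;     /* models X86_FEATURE_CQM_MBM_* detection */
	bool cmdline_disabled; /* models "rdt=!mbmtotal,!mbmlocal" */
	bool enabled;          /* final decision, fixed before allocation */
};

struct sketch_domain {
	unsigned long *state[SKETCH_NUM_EVENTS]; /* per-event state or NULL */
};

/* Step 1: decide enablement once, early (models get_rdt_mon_resources()). */
static void sketch_decide_events(struct sketch_event_cfg *cfg)
{
	for (int i = 0; i < SKETCH_NUM_EVENTS; i++)
		cfg[i].enabled = cfg[i].hw_supported && !cfg[i].cmdline_disabled;
}

/* Release whatever state the domain currently holds. */
static void sketch_domain_free(struct sketch_domain *d)
{
	for (int i = 0; i < SKETCH_NUM_EVENTS; i++) {
		free(d->state[i]);
		d->state[i] = NULL;
	}
}

/*
 * Step 2: allocate state only for enabled events (models the allocation
 * done from domain_add_cpu()). Returns 0 on success, -1 on failure.
 */
static int sketch_domain_alloc(struct sketch_domain *d,
			       const struct sketch_event_cfg *cfg,
			       size_t num_rmid)
{
	assert(d != NULL && cfg != NULL && num_rmid > 0);

	for (int i = 0; i < SKETCH_NUM_EVENTS; i++)
		d->state[i] = NULL;

	for (int i = 0; i < SKETCH_NUM_EVENTS; i++) {
		if (!cfg[i].enabled)
			continue;
		d->state[i] = calloc(num_rmid, sizeof(*d->state[i]));
		if (!d->state[i]) {
			sketch_domain_free(d);
			return -1;
		}
	}
	return 0;
}
```

If an event were enabled only after sketch_domain_alloc() had run, its
state pointer would still be NULL, which is the situation the early check
is meant to avoid.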
>
> But wait ... I think there may be a bigger problem when considering systems
> that support ABMC but not X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL.
> Shouldn't resctrl prevent such a system from switching to "default"
> mbm_assign_mode? Otherwise resctrl will happily let such a system switch
> to default mode and when user attempts to read an event file resctrl will
> attempt to read it via MSRs that are not supported.
> Looks like ABMC may need something similar to CONFIG_RESCTRL_ASSIGN_FIXED
> to handle this case in show() while preventing user space from switching to
> "default" mode on write()?
This may not be an issue right now. When X86_FEATURE_CQM_MBM_TOTAL and
X86_FEATURE_CQM_MBM_LOCAL are not supported, the mon_data files for these
events are not created.
>
> Reinette
>
> [1] https://lore.kernel.org/lkml/20250925200328.64155-23-tony.luck@intel.com/
>
>
>
--
Thanks
Babu Moger