Message-ID: <9bd6b7d1-957b-47ef-8edb-88f9749fcb09@intel.com>
Date: Thu, 6 Feb 2025 10:42:43 -0800
From: Reinette Chatre <reinette.chatre@...el.com>
To: "Moger, Babu" <bmoger@....com>, Babu Moger <babu.moger@....com>,
	<corbet@....net>, <tglx@...utronix.de>, <mingo@...hat.com>, <bp@...en8.de>,
	<dave.hansen@...ux.intel.com>, <tony.luck@...el.com>,
	<peternewman@...gle.com>
CC: <x86@...nel.org>, <hpa@...or.com>, <paulmck@...nel.org>,
	<akpm@...ux-foundation.org>, <thuth@...hat.com>, <rostedt@...dmis.org>,
	<xiongwei.song@...driver.com>, <pawan.kumar.gupta@...ux.intel.com>,
	<daniel.sneddon@...ux.intel.com>, <jpoimboe@...nel.org>,
	<perry.yuan@....com>, <sandipan.das@....com>, <kai.huang@...el.com>,
	<xiaoyao.li@...el.com>, <seanjc@...gle.com>, <xin3.li@...el.com>,
	<andrew.cooper3@...rix.com>, <ebiggers@...gle.com>,
	<mario.limonciello@....com>, <james.morse@....com>,
	<tan.shaopeng@...itsu.com>, <linux-doc@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <maciej.wieczor-retman@...el.com>,
	<eranian@...gle.com>
Subject: Re: [PATCH v11 06/23] x86/resctrl: Add support to enable/disable AMD
 ABMC feature

Hi Babu,

On 2/6/25 8:15 AM, Moger, Babu wrote:
> Hi Reinette,
> 
> On 2/5/2025 4:49 PM, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 1/22/25 12:20 PM, Babu Moger wrote:
>>> Add the functionality to enable/disable AMD ABMC feature.
>>>
>>> AMD ABMC feature is enabled by setting enabled bit(0) in MSR
>>> L3_QOS_EXT_CFG. When the state of ABMC is changed, the MSR needs
>>> to be updated on all the logical processors in the QOS Domain.
>>>
>>> Hardware counters will reset when ABMC state is changed.
>>
>> I find that the state management in this series is organized better
>> and easier to understand. I do think that it can be simplified more
>> and a hint to this is that it is mentioned here but not done in the
>> code introduced here but instead required from the caller. It seems
>> simpler to me that the architectural state can just be reset at the
>> same time as enable/disable of ABMC?
> 
> Right now, it is done from mbm_cntr_reset(). It does both arch and non-arch state reset for all the RMIDs in all the domains. It is called in two places.
> 
> 1 rdtgroup.c resctrl_mbm_assign_mode_write -> mbm_cntr_reset();
Please see my response to this usage in the related patch:
https://lore.kernel.org/lkml/b60b4f72-6245-46db-a126-428fb13b6310@intel.com/
In summary, I find mbm_cntr_reset() ended up being a catch-all for random
cleanup and creates confusion with the other mbm_cntr_*() calls.

> 2 rdtgroup.c rdt_kill_sb()-> mbm_cntr_reset();
Please see my response to this usage in the related patch:
https://lore.kernel.org/lkml/8d04f824-d1cc-461c-9c57-0f26c6aa96e0@intel.com/
In summary, it does not solve the problem it originally set out to solve
and it can be eliminated.

> 
> I will have to introduce another function to reset RMIDs in all the domains. Also, make sure it is called from both these places.
> 
> list_for_each_entry(dom, &r->mon_domains, hdr.list)
>             resctrl_arch_reset_rmid_all(r, dom);

I do not see a need for new functions, except the one I mention in
https://lore.kernel.org/lkml/b60b4f72-6245-46db-a126-428fb13b6310@intel.com/
that suggests a new helper for reset of architectural state that does not
exist and ends up being open coded in two places in this series.

With only one place (resctrl_mbm_assign_mode_write()) remaining that needs
all state reset I think it will be easier to understand if the state reset
is open coded within it, replacing mbm_cntr_reset() with:

	list_for_each_entry(dom, &r->mon_domains, hdr.list) {
		mbm_cntr_free_all()
		resctrl_reset_rmid_all() // Just for architectural state
	}

I would not insist on reset of architectural state within the
architectural helper. I find that it is best for architecture to
maintain its state but I also see there are many precedents for
resctrl explicitly managing the state.
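
As a rough illustration only (a user-space model, not kernel code), the
open-coded reset could look like the sketch below. The struct layouts, the
array standing in for the r->mon_domains list, the NUM_* constants, the
helper bodies and the reset_all_state() wrapper are all assumptions made
for this example; only the names mbm_cntr_free_all() and
resctrl_reset_rmid_all() are taken from the suggestion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NUM_DOMAINS 2	/* hypothetical: two monitoring domains */
#define NUM_RMIDS   4	/* hypothetical: RMIDs per domain */
#define NUM_CNTRS   2	/* hypothetical: assignable counters per domain */

/* Simplified stand-in for struct rdt_mon_domain. */
struct mon_domain {
	bool cntr_assigned[NUM_CNTRS];		/* non-architectural state */
	unsigned long long arch_prev[NUM_RMIDS];	/* architectural state */
};

/* Simplified stand-in for struct rdt_resource; an array replaces the list. */
struct resource {
	struct mon_domain doms[NUM_DOMAINS];
};

/* Release every counter assignment in one domain (hypothetical body). */
static void mbm_cntr_free_all(struct mon_domain *d)
{
	memset(d->cntr_assigned, 0, sizeof(d->cntr_assigned));
}

/* Clear only the architectural per-RMID state (hypothetical body). */
static void resctrl_reset_rmid_all(struct mon_domain *d)
{
	memset(d->arch_prev, 0, sizeof(d->arch_prev));
}

/* Both resets open coded in a single walk over the domains. */
static void reset_all_state(struct resource *r)
{
	assert(r != NULL);
	for (size_t i = 0; i < NUM_DOMAINS; i++) {
		mbm_cntr_free_all(&r->doms[i]);
		resctrl_reset_rmid_all(&r->doms[i]);
	}
}
```

The point of the sketch is only that one loop covers both kinds of state,
so no separate catch-all helper is needed.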

> I feel current code is much more cleaner.  What do you think?

It is better than previous versions, yes.

> 
>>
>>>
>>> The ABMC feature details are documented in APM listed below [1].
>>> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
>>> Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth
>>> Monitoring (ABMC).
>>>
>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
>>> Signed-off-by: Babu Moger <babu.moger@....com>
>>> ---
>>
>> ...
>>

...

>>> + */
>>> +static void _resctrl_abmc_enable(struct rdt_resource *r, bool enable)
>>> +{
>>> +    struct rdt_mon_domain *d;
>>> +
>>> +    list_for_each_entry(d, &r->mon_domains, hdr.list)
>>> +        on_each_cpu_mask(&d->hdr.cpu_mask,
>>> +                 resctrl_abmc_set_one_amd, &enable, 1);
>>> +}
>>> +
>>> +int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable)
>>> +{
>>> +    struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>>> +
>>> +    if (r->mon.mbm_cntr_assignable &&
>>> +        hw_res->mbm_cntr_assign_enabled != enable) {
>>> +        _resctrl_abmc_enable(r, enable);
>>> +        hw_res->mbm_cntr_assign_enabled = enable;
>>
>> Added benefit of resetting architectural state within this if statement
>> (perhaps simpler to be done within _resctrl_abmc_enable()) is that it will
>> not be done unnecessarily if ABMC is already in requested state.
> 
> It will be
>       list_for_each_entry(dom, &r->mon_domains, hdr.list)
>             resctrl_arch_reset_rmid_all(r, dom);

I am not sure if you are actually planning a new loop here ... as
I suggested above this can be added to _resctrl_abmc_enable() where
there is already a loop over all monitor domains and all that is
needed is to add a call to resctrl_arch_reset_rmid_all(r, dom). 
Even so, as I mentioned above, if after fixing automatic counter
unassignment you still find that resetting architectural and
non-architectural state together is cleaner, then we can go with that to
match the other flows (e.g. mbm_config_write_domain()).
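
For illustration, a user-space model (not kernel code) of folding the
architectural reset into the existing per-domain loop might look like
this. Everything here is simplified and assumed for the example: the
struct fields, the abmc_on flag standing in for the MSR write done via
on_each_cpu_mask(), the resets counter, the abmc_enable() name used in
place of _resctrl_abmc_enable(), and the unconditional return of 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_DOMAINS 2	/* hypothetical: two monitoring domains */

/* Simplified stand-in for struct rdt_mon_domain. */
struct mon_domain {
	bool abmc_on;		/* models the enable bit in L3_QOS_EXT_CFG */
	unsigned int resets;	/* counts architectural-state resets */
};

/* Simplified stand-in combining rdt_resource and rdt_hw_resource. */
struct hw_resource {
	bool mbm_cntr_assignable;
	bool mbm_cntr_assign_enabled;
	struct mon_domain doms[NUM_DOMAINS];
};

/* Hypothetical body: only records that a reset happened. */
static void resctrl_arch_reset_rmid_all(struct mon_domain *d)
{
	d->resets++;
}

/* Models _resctrl_abmc_enable(): one loop toggles ABMC and resets state. */
static void abmc_enable(struct hw_resource *hw, bool enable)
{
	for (size_t i = 0; i < NUM_DOMAINS; i++) {
		hw->doms[i].abmc_on = enable;	/* stands in for the MSR update */
		resctrl_arch_reset_rmid_all(&hw->doms[i]);
	}
}

static int resctrl_arch_mbm_cntr_assign_set(struct hw_resource *hw, bool enable)
{
	assert(hw != NULL);
	/* The reset is skipped when ABMC is already in the requested state. */
	if (hw->mbm_cntr_assignable && hw->mbm_cntr_assign_enabled != enable) {
		abmc_enable(hw, enable);
		hw->mbm_cntr_assign_enabled = enable;
	}
	return 0;
}
```

In this model a repeated request for the current state leaves the reset
count unchanged, which is the benefit described above.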

Reinette

