lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <dd3fcfff-4e8b-4a71-b6c3-1e188d61a4a6@intel.com>
Date: Tue, 20 Aug 2024 13:37:03 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: <babu.moger@....com>, <corbet@....net>, <fenghua.yu@...el.com>,
	<tglx@...utronix.de>, <mingo@...hat.com>, <bp@...en8.de>,
	<dave.hansen@...ux.intel.com>
CC: <x86@...nel.org>, <hpa@...or.com>, <paulmck@...nel.org>,
	<rdunlap@...radead.org>, <tj@...nel.org>, <peterz@...radead.org>,
	<yanjiewtw@...il.com>, <kim.phillips@....com>, <lukas.bulwahn@...il.com>,
	<seanjc@...gle.com>, <jmattson@...gle.com>, <leitao@...ian.org>,
	<jpoimboe@...nel.org>, <rick.p.edgecombe@...el.com>,
	<kirill.shutemov@...ux.intel.com>, <jithu.joseph@...el.com>,
	<kai.huang@...el.com>, <kan.liang@...ux.intel.com>,
	<daniel.sneddon@...ux.intel.com>, <pbonzini@...hat.com>,
	<sandipan.das@....com>, <ilpo.jarvinen@...ux.intel.com>,
	<peternewman@...gle.com>, <maciej.wieczor-retman@...el.com>,
	<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<eranian@...gle.com>, <james.morse@....com>
Subject: Re: [PATCH v6 20/22] x86/resctrl: Enable AMD ABMC feature by default
 when supported

Hi Babu,

On 8/20/24 1:18 PM, Moger, Babu wrote:
> On 8/20/24 15:04, Moger, Babu wrote:
>> On 8/20/24 13:12, Reinette Chatre wrote:
>>> On 8/19/24 11:18 AM, Moger, Babu wrote:
>>>> On 8/16/24 17:33, Reinette Chatre wrote:
>>>>> On 8/6/24 3:00 PM, Babu Moger wrote:
>>>
>>>>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>>> index 66febff2a3d3..d15fd1bde5f4 100644
>>>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>>> @@ -2756,6 +2756,23 @@ void resctrl_arch_mbm_cntr_assign_disable(void)
>>>>>>         }
>>>>>>     }
>>>>>>     +void resctrl_arch_mbm_cntr_assign_configure(void)
>>>>>> +{
>>>>>> +    struct rdt_resource *r =
>>>>>> &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>>>>>> +    struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>>>>>> +    bool enable = true;
>>>>>> +
>>>>>> +    mutex_lock(&rdtgroup_mutex);
>>>>>> +
>>>>>> +    if (r->mon.mbm_cntr_assignable) {
>>>>>> +        if (!hw_res->mbm_cntr_assign_enabled)
>>>>>> +            hw_res->mbm_cntr_assign_enabled = true;
>>>>>> +        resctrl_abmc_set_one_amd(&enable);
>>>>>
>>>>> Earlier changelogs mentioned that counters are reset when ABMC is enabled.
>>>>> How does that behave here when one CPU comes online? Consider the scenario
>>>>> where
>>>>> a system is booted without all CPUs online. ABMC is initially enabled on
>>>>> all online
>>>>> CPUs with this flow ... user space could start using resctrl fs and create
>>>>> monitor groups that start accumulating architectural state. If the
>>>>> remaining
>>>>> CPUs come online at this point and this snippet enables ABMC, would it
>>>>> reset
>>>>> all counters? Should the architectural state be cleared?
>>>>
>>>> When new cpu comes online, it should inherit the abmc state which is set
>>>> already. it should not force it either way. In that case, it is not
>>>> required to reset the architectural state.
>>>>
>>>> Responded to your earlier comment.
>>>> https://lore.kernel.org/lkml/0256b457-175d-4923-aa49-00e8e52b865b@amd.com/
>>>>
>>>>
>>>>>
>>>>> Also, it still does not look right that the architecture decides the
>>>>> policy.
>>>>> Could this enabling be moved to resctrl_online_cpu() for resctrl fs to
>>>>> request architecture to enable assignable counters if it is supported?
>>>>
>>>> Sure. Will move the resctrl_arch_mbm_cntr_assign_configure() here with
>>>> changes just to update the abmc state which is set during the init.
>>>>
>>>
>>> I do not think we are seeing it the same way. In your earlier comment you
>>> mention:
>>>
>>>> We need to set abmc state to "enabled" during the init when abmc is
>>>> detected.  resctrl_late_init -> .. -> rdt_get_mon_l3_config
>>>>
>>>> This only happens once during the init.
>>>
>>>
>>> I do not think that the ABMC state can be set during init since that runs
>>> before the fs code and thus the arch code cannot be aware of the fs policy
>>> that "mbm_assign_mode" is the default. This may become clear when you move
>>> resctrl_arch_mbm_cntr_assign_configure() to resctrl_online_cpu() though
>>> since I expect that the r->mon.mbm_cntr_assignable check will move
>>> into the fs resctrl_online_cpu() that will call the arch helper to
>>> set the state to enabled.
>>
>> There are couple of problems here.
>>
>> 1. Hotplug with ABMC enabled.
>>
>>    System is running with ABMC enabled. Now, new cpu cames online.
>>    The function resctrl_arch_mbm_cntr_assign_configure() will set the MSR
>> MSR_IA32_L3_QOS_EXT_CFG to enable ABMC on the new CPU. This scenario works
>> fine.
>>
>>
>> 2. Hotplug with ABMC disabled.
>>    Current code will force the system to enable ABMC on the new CPU.
>>    That is not correct.
>>
>>
>> We need to address both these cases.
>>
>>
>> I was thinking of separating the functionality in
>> resctrl_arch_mbm_cntr_assign_configure() into two.
>>
>> a. Just set the mbm_cntr_assign_enabled to true during the init.
>>     if (r->mon.mbm_cntr_assignable)
>>         hw_res->mbm_cntr_assign_enabled = true;
>>
>>     This is similar to rdtgroup_setup_default().  Isn't it?

Right. I (mis)understood from your earlier comment that this was planned to
be done from "resctrl_late_init -> .. -> rdt_get_mon_l3_config", but
handling that within rdtgroup_init() (after MPAM it will be resctrl_init())
seem appropriate to me since it is where the fs code can set its policy.

> 
> I just noticed that I cannot access rdt_hw_resource here. I may have to
> write another function resctrl_arch_mbm_cntr_assign_set_default() to do
> this. What do you think?

Sounds good.

> 
> 
>>
>>
>> b. Change the functionality in resctrl_arch_mbm_cntr_assign_configure()
>>     to update the MSR MSR_IA32_L3_QOS_EXT_CFG based on
>> hw_res->mbm_cntr_assign_enabled.  Something like this.
>>
>>
>>     void resctrl_arch_mbm_cntr_assign_configure(void)
>> {
>>      ---
>>      if (r->mon.mbm_cntr_assignable &&  hw_res->mbm_cntr_assign_enabled)
>>              abmc_set_one_amd(&enable);
>>     ---
>> }
>>
>>
>> Yes.  The function resctrl_arch_mbm_cntr_assign_configure() will be called
>> from resctrl_online_cpu().

Checking r->mon.mbm_cntr_assignable may be redundant since I expect that
resctrl_online_cpu() will also do the check. Would do no harm though.

>>
>> Does it make sense?  Any other idea?
> 

Yes, this makes sense to me. Thank you.

Reinette


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ