[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f36f5459-cf5a-4401-8a00-cca83c293fdc@amd.com>
Date: Tue, 20 Aug 2024 15:18:15 -0500
From: "Moger, Babu" <babu.moger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, corbet@....net,
fenghua.yu@...el.com, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com
Cc: x86@...nel.org, hpa@...or.com, paulmck@...nel.org, rdunlap@...radead.org,
tj@...nel.org, peterz@...radead.org, yanjiewtw@...il.com,
kim.phillips@....com, lukas.bulwahn@...il.com, seanjc@...gle.com,
jmattson@...gle.com, leitao@...ian.org, jpoimboe@...nel.org,
rick.p.edgecombe@...el.com, kirill.shutemov@...ux.intel.com,
jithu.joseph@...el.com, kai.huang@...el.com, kan.liang@...ux.intel.com,
daniel.sneddon@...ux.intel.com, pbonzini@...hat.com, sandipan.das@....com,
ilpo.jarvinen@...ux.intel.com, peternewman@...gle.com,
maciej.wieczor-retman@...el.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, eranian@...gle.com, james.morse@....com
Subject: Re: [PATCH v6 20/22] x86/resctrl: Enable AMD ABMC feature by default
when supported
On 8/20/24 15:04, Moger, Babu wrote:
> Hi Reinette,
>
> On 8/20/24 13:12, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 8/19/24 11:18 AM, Moger, Babu wrote:
>>> On 8/16/24 17:33, Reinette Chatre wrote:
>>>> On 8/6/24 3:00 PM, Babu Moger wrote:
>>
>>>>> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>> b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>> index 66febff2a3d3..d15fd1bde5f4 100644
>>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
>>>>> @@ -2756,6 +2756,23 @@ void resctrl_arch_mbm_cntr_assign_disable(void)
>>>>> }
>>>>> }
>>>>> +void resctrl_arch_mbm_cntr_assign_configure(void)
>>>>> +{
>>>>> + struct rdt_resource *r =
>>>>> &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
>>>>> + struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
>>>>> + bool enable = true;
>>>>> +
>>>>> + mutex_lock(&rdtgroup_mutex);
>>>>> +
>>>>> + if (r->mon.mbm_cntr_assignable) {
>>>>> + if (!hw_res->mbm_cntr_assign_enabled)
>>>>> + hw_res->mbm_cntr_assign_enabled = true;
>>>>> + resctrl_abmc_set_one_amd(&enable);
>>>>
>>>> Earlier changelogs mentioned that counters are reset when ABMC is enabled.
>>>> How does that behave here when one CPU comes online? Consider the scenario
>>>> where
>>>> a system is booted without all CPUs online. ABMC is initially enabled on
>>>> all online
>>>> CPUs with this flow ... user space could start using resctrl fs and create
>>>> monitor groups that start accumulating architectural state. If the
>>>> remaining
>>>> CPUs come online at this point and this snippet enables ABMC, would it
>>>> reset
>>>> all counters? Should the architectural state be cleared?
>>>
>>> When new cpu comes online, it should inherit the abmc state which is set
>>> already. it should not force it either way. In that case, it is not
>>> required to reset the architectural state.
>>>
>>> Responded to your earlier comment.
>>> https://lore.kernel.org/lkml/0256b457-175d-4923-aa49-00e8e52b865b@amd.com/
>>>
>>>
>>>>
>>>> Also, it still does not look right that the architecture decides the
>>>> policy.
>>>> Could this enabling be moved to resctrl_online_cpu() for resctrl fs to
>>>> request architecture to enable assignable counters if it is supported?
>>>
>>> Sure. Will move the resctrl_arch_mbm_cntr_assign_configure() here with
>>> changes just to update the abmc state which is set during the init.
>>>
>>
>> I do not think we are seeing it the same way. In your earlier comment you
>> mention:
>>
>>> We need to set abmc state to "enabled" during the init when abmc is
>>> detected. resctrl_late_init -> .. -> rdt_get_mon_l3_config
>>>
>>> This only happens once during the init.
>>
>>
>> I do not think that the ABMC state can be set during init since that runs
>> before the fs code and thus the arch code cannot be aware of the fs policy
>> that "mbm_assign_mode" is the default. This may become clear when you move
>> resctrl_arch_mbm_cntr_assign_configure() to resctrl_online_cpu() though
>> since I expect that the r->mon.mbm_cntr_assignable check will move
>> into the fs resctrl_online_cpu() that will call the arch helper to
>> set the state to enabled.
>
> There are couple of problems here.
>
> 1. Hotplug with ABMC enabled.
>
> System is running with ABMC enabled. Now, new cpu cames online.
> The function resctrl_arch_mbm_cntr_assign_configure() will set the MSR
> MSR_IA32_L3_QOS_EXT_CFG to enable ABMC on the new CPU. This scenario works
> fine.
>
>
> 2. Hotplug with ABMC disabled.
> Current code will force the system to enable ABMC on the new CPU.
> That is not correct.
>
>
> We need to address both these cases.
>
>
> I was thinking of separating the functionality in
> resctrl_arch_mbm_cntr_assign_configure() into two.
>
> a. Just set the mbm_cntr_assign_enabled to true during the init.
> if (r->mon.mbm_cntr_assignable)
> hw_res->mbm_cntr_assign_enabled = true;
>
> This is similar to rdtgroup_setup_default(). Isn't it?
I just noticed that I cannot access rdt_hw_resource here. I may have to
write another function resctrl_arch_mbm_cntr_assign_set_default() to do
this. What do you think?
>
>
> b. Change the functionality in resctrl_arch_mbm_cntr_assign_configure()
> to update the MSR MSR_IA32_L3_QOS_EXT_CFG based on
> hw_res->mbm_cntr_assign_enabled. Something like this.
>
>
> void resctrl_arch_mbm_cntr_assign_configure(void)
> {
> ---
> if (r->mon.mbm_cntr_assignable && hw_res->mbm_cntr_assign_enabled)
> abmc_set_one_amd(&enable);
> ---
> }
>
>
> Yes. The function resctrl_arch_mbm_cntr_assign_configure() will be called
> from resctrl_online_cpu().
>
> Does it make sense? Any other idea?
--
Thanks
Babu Moger
Powered by blists - more mailing lists