Message-ID: <33c56f32-4e56-47b5-890c-fbf1d45d7213@intel.com>
Date: Thu, 3 Oct 2024 19:17:22 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: <babu.moger@....com>, <corbet@....net>, <fenghua.yu@...el.com>,
<tglx@...utronix.de>, <mingo@...hat.com>, <bp@...en8.de>,
<dave.hansen@...ux.intel.com>
CC: <x86@...nel.org>, <hpa@...or.com>, <paulmck@...nel.org>,
<rdunlap@...radead.org>, <tj@...nel.org>, <peterz@...radead.org>,
<yanjiewtw@...il.com>, <kim.phillips@....com>, <lukas.bulwahn@...il.com>,
<seanjc@...gle.com>, <jmattson@...gle.com>, <leitao@...ian.org>,
<jpoimboe@...nel.org>, <rick.p.edgecombe@...el.com>,
<kirill.shutemov@...ux.intel.com>, <jithu.joseph@...el.com>,
<kai.huang@...el.com>, <kan.liang@...ux.intel.com>,
<daniel.sneddon@...ux.intel.com>, <pbonzini@...hat.com>,
<sandipan.das@....com>, <ilpo.jarvinen@...ux.intel.com>,
<peternewman@...gle.com>, <maciej.wieczor-retman@...el.com>,
<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<eranian@...gle.com>, <james.morse@....com>
Subject: Re: [PATCH v7 22/24] x86/resctrl: Update assignments on event
configuration changes
Hi Babu,
On 10/3/24 5:51 PM, Moger, Babu wrote:
> Hi Reinette,
>
> On 10/2/2024 1:20 PM, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 9/27/24 9:22 AM, Moger, Babu wrote:
>>> Hi Reinette,
>>>
>>> On 9/19/2024 12:45 PM, Reinette Chatre wrote:
>>>> On 9/4/24 3:21 PM, Babu Moger wrote:
>>
>> ...
>>
>>>>> +}
>>>>> +
>>>>> static int rdtgroup_mbm_assign_mode_show(struct kernfs_open_file *of,
>>>>> struct seq_file *s, void *v)
>>>>> {
>>>>> @@ -1793,12 +1802,48 @@ static int mbm_local_bytes_config_show(struct kernfs_open_file *of,
>>>>> return 0;
>>>>> }
>>>>> +static int resctrl_mbm_event_update_assign(struct rdt_resource *r,
>>>>> + struct rdt_mon_domain *d, u32 evtid)
>>>>> +{
>>>>> + struct rdt_mon_domain *dom;
>>>>> + struct rdtgroup *rdtg;
>>>>> + int ret = 0;
>>>>> +
>>>>> + if (!resctrl_arch_mbm_cntr_assign_enabled(r))
>>>>> + return ret;
>>>>> +
>>>>> + list_for_each_entry(rdtg, &rdt_all_groups, rdtgroup_list) {
>>>>> + struct rdtgroup *crg;
>>>>> +
>>>>> + list_for_each_entry(dom, &r->mon_domains, hdr.list) {
>>>>> + if (d == dom && resctrl_mbm_event_assigned(rdtg, dom, evtid)) {
>>>>> + ret = rdtgroup_assign_cntr(r, rdtg, dom, evtid);
>>>>> + if (ret)
>>>>> + goto out_done;
>>>>> + }
>>>>> + }
>>>>> +
>>>>> + list_for_each_entry(crg, &rdtg->mon.crdtgrp_list, mon.crdtgrp_list) {
>>>>> + list_for_each_entry(dom, &r->mon_domains, hdr.list) {
>>>>> + if (d == dom && resctrl_mbm_event_assigned(crg, dom, evtid)) {
>>>>> + ret = rdtgroup_assign_cntr(r, crg, dom, evtid);
>>>>> + if (ret)
>>>>> + goto out_done;
>>>>> + }
>>>>> + }
>>>>> + }
>>>>> + }
>>>>> +
>>>>> +out_done:
>>>>> + return ret;
>>>>> +}
>>>>> static void mbm_config_write_domain(struct rdt_resource *r,
>>>>> struct rdt_mon_domain *d, u32 evtid, u32 val)
>>>>> {
>>>>> struct mon_config_info mon_info = {0};
>>>>> u32 config_val;
>>>>> + int ret;
>>>>> /*
>>>>> * Check the current config value first. If both are the same then
>>>>> @@ -1822,6 +1867,14 @@ static void mbm_config_write_domain(struct rdt_resource *r,
>>>>> resctrl_arch_event_config_set,
>>>>> &mon_info, 1);
>>>>> + /*
>>>>> + * Counter assignments need to be updated to match the event
>>>>> + * configuration.
>>>>> + */
>>>>> + ret = resctrl_mbm_event_update_assign(r, d, evtid);
>>>>> + if (ret)
>>>>> + rdt_last_cmd_puts("Assign failed, event will be Unavailable\n");
>>>>> +
>>>>
>>>> This does not look right. This function _just_ returned from an IPI on appropriate CPU and then
>>>> starts flow to do _another_ IPI to run code that could have just been run during previous IPI.
>>>> The whole flow to call rdtgroup_assign_cntr() also seems like an obfuscated way to call resctrl_arch_assign_cntr()
>>>> to just reconfigure the counter (not actually assign it).
>>>> Could it perhaps call some resctrl fs code via single IPI that in turn calls the appropriate arch code to set the new
>>>> mon event config and re-configures the counter?
>>>>
>>>
>>> I think we can simplify this. We don't have to go through all the rdtgroups to figure out if the counter is assigned or not.
>>>
>>> I can move the code inside mon_config_write() after the call to mbm_config_write_domain().
>>
>> mbm_config_write_domain() already does an IPI so if I understand correctly this will still
>> result in a second IPI that seems unnecessary to me. Why can the reconfigure not be done
>> with a single IPI?
>
> I think we can try updating the counter configuration in the same IPI. Let me try that.
>
Thank you very much.
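
For illustration only, here is a minimal userspace sketch of the single-IPI idea discussed above: one callback that sets the new event configuration and refreshes the assigned counters in the same cross-CPU call. Everything in it is hypothetical and not taken from the patch: `struct dom_model`, `run_on_domain_cpu()` (a stand-in that only counts calls instead of sending an IPI) and `cfg_set_and_reconfigure()`. It also assumes that every assigned counter in the domain tracks the event being reconfigured, which the real code would need to check.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CNTRS 32	/* hypothetical number of counters per domain */

/* Hypothetical per-domain state, loosely standing in for rdt_mon_domain. */
struct dom_model {
	uint32_t evt_cfg;			/* current event configuration */
	uint32_t cntr_map;			/* bit i set: counter i is assigned */
	uint32_t cntr_bw_type[NUM_CNTRS];	/* per-counter configuration */
	unsigned int ipi_count;			/* number of cross-CPU calls made */
};

struct cfg_update {
	struct dom_model *d;
	uint32_t val;
};

/* Stand-in for a cross-CPU call: it only counts and runs fn locally. */
static void run_on_domain_cpu(struct dom_model *d, void (*fn)(void *), void *info)
{
	d->ipi_count++;
	fn(info);
}

/* Single callback: set the event config and refresh assigned counters. */
static void cfg_set_and_reconfigure(void *info)
{
	struct cfg_update *u = info;
	struct dom_model *d = u->d;

	d->evt_cfg = u->val;
	for (unsigned int i = 0; i < NUM_CNTRS; i++) {
		if (!(d->cntr_map & (1u << i)))
			continue;	/* counter i is not assigned */
		if (d->cntr_bw_type[i] != u->val)
			d->cntr_bw_type[i] = u->val;
	}
}

static void config_write_domain(struct dom_model *d, uint32_t val)
{
	struct cfg_update u = { .d = d, .val = val };

	/* Nothing to do if the configuration is unchanged. */
	if (d->evt_cfg == val)
		return;

	run_on_domain_cpu(d, cfg_set_and_reconfigure, &u);
}
```

In this model a configuration change costs exactly one cross-CPU call per domain, and writing an unchanged value costs none.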
>>
>>>
>>> Using the domain bitmap we can figure out which of the counters are assigned in the domain. I can use the hardware help to update the assignment for each counter. This has to be done via IPI.
>>> Something like this.
>>>
>>> static void rdtgroup_abmc_dom_cfg(void *info)
>>> {
>>> union l3_qos_abmc_cfg *abmc_cfg = info;
>>> u32 val = abmc_cfg->bw_type;
>>>
>>> /* Get the counter configuration */
>>> wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, *abmc_cfg);
>>> rdmsrl(MSR_IA32_L3_QOS_ABMC_DSC, *abmc_cfg);
>>>
>>
>> This is not clear to me. I expected MSR_IA32_L3_QOS_ABMC_DSC
>> to return the bw_type that was just written to
>> MSR_IA32_L3_QOS_ABMC_CFG.
>>
>> It is also not clear to me how these registers can be
>> used without a valid counter ID. I seem to miss
>> the context of this call.
>
> Event configuration changes are domain specific. We have the domain data structure and we have the bitmap(mbm_cntr_map) in rdt_mon_domain. This bitmap tells us which of the counters in the domain are configured. So, we can get the counter id from this bitmap. Using the counter id, we can query the hardware to get the current configuration by this sequence.
>
> /* Get the counter configuration */
> for (i=0; i< r->mon.num_mbm_cntrs; i++) {
> if (test_bit(i, d->mbm_cntr_map)) {
> abmc_cfg->cntr_id = i;
> abmc_cfg->split.cfg_en = 0;
> wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, *abmc_cfg);
>
> /* Reading L3_QOS_ABMC_DSC returns the configuration of the
> * counter id specified in L3_QOS_ABMC_CFG.cntr_id, with RMID (bw_src)
> * and event configuration (bw_type). Get the counter configuration.
> */
> rdmsrl(MSR_IA32_L3_QOS_ABMC_DSC, *abmc_cfg);
>
Apologies but I do still have the same question as before ... wouldn't
MSR_IA32_L3_QOS_ABMC_DSC return the value that was just written to
MSR_IA32_L3_QOS_ABMC_CFG? If so, the previous wrmsrl() would set the
counter's bw_type to what is set in *abmc_cfg provided as parameter. It
thus still seems unclear why reading it back is necessary.
> /*
> * We know the new bandwidth to be updated.
> * Update the counter by writing to QOS_ABMC_CFG with the new configuration
> */
>
> if (abmc_cfg->bw_type != val) {
> abmc_cfg->bw_type = val;
> abmc_cfg->split.cfg_en = 1;
> wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, *abmc_cfg);
> }
> }
> }
>
> Hope this helps. I need to pass some information to the IPI to make this work. Let me know if this is not clear. I will code this tomorrow, then it will be much clearer.
>
ok, it does seem as though I am not able to follow these snippets and seeing the
full solution should solve that. Thank you.
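
For reference while waiting for the full solution, here is a small userspace model of the sequence as Babu describes it. It rests on the assumption, which is exactly what is still in question above, that a write to the CFG register with cfg_en = 0 only selects a counter and that a following read of the DSC register returns that counter's stored bw_src and bw_type. The types and helpers (`struct abmc_cfg_model`, `struct abmc_hw_model`, `model_write_cfg()`, `model_read_dsc()`, `model_update_bw_type()`) are hypothetical and do not reflect the real register layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NUM_CNTRS 32	/* hypothetical number of counters */

/* Hypothetical, simplified view of one counter configuration. */
struct abmc_cfg_model {
	uint32_t bw_type;	/* event configuration */
	uint16_t bw_src;	/* RMID */
	uint8_t cntr_id;
	bool cfg_en;		/* true: apply config, false: select only */
};

/* Hypothetical model of the hardware state behind the CFG/DSC pair. */
struct abmc_hw_model {
	struct abmc_cfg_model cntr[MODEL_NUM_CNTRS];
	uint8_t selected;	/* counter chosen by the last CFG write */
};

/* Models a write to the CFG register. */
static void model_write_cfg(struct abmc_hw_model *hw,
			    const struct abmc_cfg_model *cfg)
{
	assert(cfg->cntr_id < MODEL_NUM_CNTRS);
	hw->selected = cfg->cntr_id;
	if (cfg->cfg_en) {
		hw->cntr[cfg->cntr_id] = *cfg;
		hw->cntr[cfg->cntr_id].cfg_en = false;
	}
}

/* Models a read of the DSC register: config of the selected counter. */
static struct abmc_cfg_model model_read_dsc(const struct abmc_hw_model *hw)
{
	return hw->cntr[hw->selected];
}

/* Update bw_type of every counter set in cntr_map; returns writes done. */
static unsigned int model_update_bw_type(struct abmc_hw_model *hw,
					 uint32_t cntr_map, uint32_t val)
{
	unsigned int updated = 0;

	for (unsigned int i = 0; i < MODEL_NUM_CNTRS; i++) {
		struct abmc_cfg_model cfg = { .cntr_id = (uint8_t)i, .cfg_en = false };

		if (!(cntr_map & (1u << i)))
			continue;

		model_write_cfg(hw, &cfg);	/* select counter i only */
		cfg = model_read_dsc(hw);	/* recover stored bw_src and bw_type */
		cfg.cntr_id = (uint8_t)i;

		if (cfg.bw_type == val)
			continue;

		cfg.bw_type = val;
		cfg.cfg_en = true;
		model_write_cfg(hw, &cfg);	/* apply the new configuration */
		updated++;
	}

	return updated;
}
```

Under that assumption the read-back is what recovers bw_src, since the bitmap only provides counter ids. If the DSC register instead simply echoes the last value written to CFG, the read-back adds nothing and the model does not apply.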
>
>>
>>> /* update the counter configuration */
>>> if (abmc_cfg->bw_type != val) {
>>> abmc_cfg->bw_type = val;
>>> abmc_cfg->split.cfg_en = 1;
>>> wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, *abmc_cfg);
>>> }
>>> }
>>>
Reinette