Message-ID: <7e71b969-e0e4-51c7-da84-ab111e9be9e3@amd.com>
Date: Mon, 7 Oct 2024 19:01:14 -0500
From: "Moger, Babu" <bmoger@....com>
To: Reinette Chatre <reinette.chatre@...el.com>, babu.moger@....com,
corbet@....net, fenghua.yu@...el.com, tglx@...utronix.de, mingo@...hat.com,
bp@...en8.de, dave.hansen@...ux.intel.com
Cc: x86@...nel.org, hpa@...or.com, paulmck@...nel.org, rdunlap@...radead.org,
tj@...nel.org, peterz@...radead.org, yanjiewtw@...il.com,
kim.phillips@....com, lukas.bulwahn@...il.com, seanjc@...gle.com,
jmattson@...gle.com, leitao@...ian.org, jpoimboe@...nel.org,
rick.p.edgecombe@...el.com, kirill.shutemov@...ux.intel.com,
jithu.joseph@...el.com, kai.huang@...el.com, kan.liang@...ux.intel.com,
daniel.sneddon@...ux.intel.com, pbonzini@...hat.com, sandipan.das@....com,
ilpo.jarvinen@...ux.intel.com, peternewman@...gle.com,
maciej.wieczor-retman@...el.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, eranian@...gle.com, james.morse@....com
Subject: Re: [PATCH v7 22/24] x86/resctrl: Update assignments on event
configuration changes
Hi Reinette,
On 10/4/2024 10:02 AM, Moger, Babu wrote:
> Hi Reinette,
>
> On 10/3/2024 9:17 PM, Reinette Chatre wrote:
>> Hi Babu,
>>
>> On 10/3/24 5:51 PM, Moger, Babu wrote:
>>> Hi Reinette,
>>>
>>> On 10/2/2024 1:20 PM, Reinette Chatre wrote:
>>>> Hi Babu,
>>>>
>>>> On 9/27/24 9:22 AM, Moger, Babu wrote:
>>>>> Hi Reinette,
>>>>>
>>>>> On 9/19/2024 12:45 PM, Reinette Chatre wrote:
>>>>>> On 9/4/24 3:21 PM, Babu Moger wrote:
>>>>
>>>> ...
>>>>
>>>>>>> +}
>>>>>>> +
>>>>>>> static int rdtgroup_mbm_assign_mode_show(struct
>>>>>>> kernfs_open_file *of,
>>>>>>> struct seq_file *s, void *v)
>>>>>>> {
>>>>>>> @@ -1793,12 +1802,48 @@ static int
>>>>>>> mbm_local_bytes_config_show(struct kernfs_open_file *of,
>>>>>>> return 0;
>>>>>>> }
>>>>>>> +static int resctrl_mbm_event_update_assign(struct
>>>>>>> rdt_resource *r,
>>>>>>> + struct rdt_mon_domain *d, u32 evtid)
>>>>>>> +{
>>>>>>> + struct rdt_mon_domain *dom;
>>>>>>> + struct rdtgroup *rdtg;
>>>>>>> + int ret = 0;
>>>>>>> +
>>>>>>> + if (!resctrl_arch_mbm_cntr_assign_enabled(r))
>>>>>>> + return ret;
>>>>>>> +
>>>>>>> + list_for_each_entry(rdtg, &rdt_all_groups, rdtgroup_list) {
>>>>>>> + struct rdtgroup *crg;
>>>>>>> +
>>>>>>> + list_for_each_entry(dom, &r->mon_domains, hdr.list) {
>>>>>>> + if (d == dom && resctrl_mbm_event_assigned(rdtg,
>>>>>>> dom, evtid)) {
>>>>>>> + ret = rdtgroup_assign_cntr(r, rdtg, dom, evtid);
>>>>>>> + if (ret)
>>>>>>> + goto out_done;
>>>>>>> + }
>>>>>>> + }
>>>>>>> +
>>>>>>> + list_for_each_entry(crg, &rdtg->mon.crdtgrp_list,
>>>>>>> mon.crdtgrp_list) {
>>>>>>> + list_for_each_entry(dom, &r->mon_domains, hdr.list) {
>>>>>>> + if (d == dom && resctrl_mbm_event_assigned(crg,
>>>>>>> dom, evtid)) {
>>>>>>> + ret = rdtgroup_assign_cntr(r, crg, dom, evtid);
>>>>>>> + if (ret)
>>>>>>> + goto out_done;
>>>>>>> + }
>>>>>>> + }
>>>>>>> + }
>>>>>>> + }
>>>>>>> +
>>>>>>> +out_done:
>>>>>>> + return ret;
>>>>>>> +}
>>>>>>> static void mbm_config_write_domain(struct rdt_resource *r,
>>>>>>> struct rdt_mon_domain *d, u32 evtid, u32
>>>>>>> val)
>>>>>>> {
>>>>>>> struct mon_config_info mon_info = {0};
>>>>>>> u32 config_val;
>>>>>>> + int ret;
>>>>>>> /*
>>>>>>> * Check the current config value first. If both are the
>>>>>>> same then
>>>>>>> @@ -1822,6 +1867,14 @@ static void mbm_config_write_domain(struct
>>>>>>> rdt_resource *r,
>>>>>>> resctrl_arch_event_config_set,
>>>>>>> &mon_info, 1);
>>>>>>> + /*
>>>>>>> + * Counter assignments needs to be updated to match the event
>>>>>>> + * configuration.
>>>>>>> + */
>>>>>>> + ret = resctrl_mbm_event_update_assign(r, d, evtid);
>>>>>>> + if (ret)
>>>>>>> + rdt_last_cmd_puts("Assign failed, event will be
>>>>>>> Unavailable\n");
>>>>>>> +
>>>>>>
>>>>>> This does not look right. This function _just_ returned from an
>>>>>> IPI on appropriate CPU and then
>>>>>> starts flow to do _another_ IPI to run code that could have just
>>>>>> been run during previous IPI.
>>>>>> The whole flow to call rdtgroup_assign_cntr() also seems like an
>>>>>> obfuscated way to call resctrl_arch_assign_cntr()
>>>>>> to just reconfigure the counter (not actually assign it).
>>>>>> Could it perhaps call some resctrl fs code via single IPI that in
>>>>>> turn calls the appropriate arch code to set the new
>>>>>> mon event config and re-configures the counter?
>>>>>>
>>>>>
>>>>> I think we can simplify this. We don't have to go through all the
>>>>> rdtgroups to figure out if the counter is assigned or not.
>>>>>
>>>>> I can move the code inside mon_config_write() after the call to
>>>>> mbm_config_write_domain().
>>>>
>>>> mbm_config_write_domain() already does an IPI so if I understand
>>>> correctly this will still
>>>> result in a second IPI that seems unnecessary to me. Why can the
>>>> reconfigure not be done
>>>> with a single IPI?
>>>
>>> I think we can try updating the counter configuration in the same
>>> IPI. Let me try that.
>>>
>>
>> Thank you very much.
>>
>>>>
>>>>>
>>>>> Using the domain bitmap we can figure out which of the counters are
>>>>> assigned in the domain. I can use the hardware help to update the
>>>>> assignment for each counter. This has to be done via IPI.
>>>>> Something like this.
>>>>>
>>>>> static void rdtgroup_abmc_dom_cfg(void *info)
>>>>> {
>>>>> union l3_qos_abmc_cfg *abmc_cfg = info;
>>>>> u32 val = abmc_cfg->split.bw_type;
>>>>>
>>>>> /* Get the counter configuration */
>>>>> wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, abmc_cfg->full);
>>>>> rdmsrl(MSR_IA32_L3_QOS_ABMC_DSC, abmc_cfg->full);
>>>>>
>>>>
>>>> This is not clear to me. I expected MSR_IA32_L3_QOS_ABMC_DSC
>>>> to return the bw_type that was just written to
>>>> MSR_IA32_L3_QOS_ABMC_CFG.
>>>>
>>>> It is also not clear to me how these registers can be
>>>> used without a valid counter ID. I seem to miss
>>>> the context of this call.
>>>
>>> Event configuration changes are domain specific. We have the domain
>>> data structure and we have the bitmap(mbm_cntr_map) in
>>> rdt_mon_domain. This bitmap tells us which of the counters in the
>>> domain are configured. So, we can get the counter id from this
>>> bitmap. Using the counter id, we can query the hardware to get the
>>> current configuration by this sequence.
>>>
>>> /* Get the counter configuration */
>>> for (i = 0; i < r->mon.num_mbm_cntrs; i++) {
>>> if (test_bit(i, d->mbm_cntr_map)) {
>>> abmc_cfg->split.cntr_id = i;
>>> abmc_cfg->split.cfg_en = 0;
>>> wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, abmc_cfg->full);
>>>
>>> /*
>>> * Reading L3_QOS_ABMC_DSC returns the configuration of the
>>> * counter id specified in L3_QOS_ABMC_CFG.cntr_id, with RMID (bw_src)
>>> * and event configuration (bw_type).
>>> */
>>> rdmsrl(MSR_IA32_L3_QOS_ABMC_DSC, abmc_cfg->full);
>>>
>>
>> Apologies but I do still have the same question as before ... wouldn't
>> MSR_IA32_L3_QOS_ABMC_DSC return the value that was just written to
>> MSR_IA32_L3_QOS_ABMC_CFG? If so, the previous wrmsrl() would set the
>> counter's bw_type to what is set in *abmc_cfg provided as parameter. It
>> thus still seems unclear why reading it back is necessary.
>
> Yes. It is not clear in the spec.
>
> QOS_ABMC_DSC is a read-only MSR and is used only to get the
> configuration of the selected counter id.
>
> The configuration is only updated when QOS_ABMC_CFG.cfg_en = 1.
>
> When you write QOS_ABMC_CFG with cntr_id = n and abmc_cfg.split.cfg_en = 0,
> reading QOS_ABMC_DSC back returns the configuration of that cntr_id.
> Note that with abmc_cfg.split.cfg_en = 0 the write does not change
> the counter's configuration. Reading QOS_ABMC_DSC back here gives us
> bw_type (event config), bw_src (RMID), etc.
>
> union l3_qos_abmc_cfg {
> struct {
> unsigned long bw_type :32,
> bw_src :12,
> reserved1: 3,
> is_clos : 1,
> cntr_id : 5,
> reserved : 9,
> cntr_en : 1,
> cfg_en : 1;
> } split;
> unsigned long full;
> };
>
> We need to update bw_type (event config). When we write QOS_ABMC_CFG
> back with abmc_cfg.split.cfg_en = 1, the configuration will be updated.
>
> if (abmc_cfg->split.bw_type != val) {
> abmc_cfg->split.bw_type = val;
> abmc_cfg->split.cfg_en = 1;
> wrmsrl(MSR_IA32_L3_QOS_ABMC_CFG, abmc_cfg->full);
> }
>
> I will send you the code later today.
>
I found out that we cannot do it the way we discussed above.
An event configuration update applies to either the local event or the
total event. We need to update only the counters that are assigned to
that event type (total or local). That information is not available in
the domain or by querying the hardware, so we need to search the resctrl
groups for it.
I have updated the patch accordingly. All of the updates are done in the
same IPI.
I will send the series later this week.
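
For illustration only, here is a minimal user-space sketch of the
read-modify-write sequence discussed above for a single counter. It is
not the code from the patch. The fake_* helpers, fake_cntr_cfg[] and
abmc_update_bw_type() are hypothetical names that stand in for the MSR
accesses, and the sketch assumes the behaviour described in this thread:
a QOS_ABMC_CFG write with cfg_en = 0 only selects the counter, a write
with cfg_en = 1 updates it, and QOS_ABMC_DSC returns the configuration
of the selected counter. It also assumes a compiler that accepts 64-bit
bitfields (GCC and Clang do).

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CNTRS 32	/* cntr_id is a 5-bit field */

union l3_qos_abmc_cfg {
	struct {
		uint64_t bw_type   : 32,
			 bw_src    : 12,
			 reserved1 :  3,
			 is_clos   :  1,
			 cntr_id   :  5,
			 reserved  :  9,
			 cntr_en   :  1,
			 cfg_en    :  1;
	} split;
	uint64_t full;
};

static_assert(sizeof(union l3_qos_abmc_cfg) == 8, "layout must be 64 bits");

/* Hypothetical model of the hardware state, not real MSRs. */
static uint64_t fake_cntr_cfg[NUM_CNTRS];
static unsigned int fake_selected;

/* Models a write to QOS_ABMC_CFG. */
static void fake_wrmsr_abmc_cfg(uint64_t val)
{
	union l3_qos_abmc_cfg cfg = { .full = val };

	fake_selected = cfg.split.cntr_id;
	if (cfg.split.cfg_en) {
		/* Only a write with cfg_en = 1 changes the configuration. */
		cfg.split.cfg_en = 0;
		fake_cntr_cfg[fake_selected] = cfg.full;
	}
}

/* Models a read of QOS_ABMC_DSC for the currently selected counter. */
static uint64_t fake_rdmsr_abmc_dsc(void)
{
	union l3_qos_abmc_cfg cfg = { .full = fake_cntr_cfg[fake_selected] };

	cfg.split.cntr_id = fake_selected;
	return cfg.full;
}

/* Returns 1 if the counter's bw_type was rewritten, 0 if it already matched. */
static int abmc_update_bw_type(unsigned int cntr_id, uint32_t val)
{
	union l3_qos_abmc_cfg cfg = { .full = 0 };

	/* Select the counter without modifying it, then read it back. */
	cfg.split.cntr_id = cntr_id;
	cfg.split.cfg_en = 0;
	fake_wrmsr_abmc_cfg(cfg.full);
	cfg.full = fake_rdmsr_abmc_dsc();

	if (cfg.split.bw_type == val)
		return 0;

	/* Write the new event configuration back with cfg_en = 1. */
	cfg.split.bw_type = val;
	cfg.split.cfg_en = 1;
	fake_wrmsr_abmc_cfg(cfg.full);
	return 1;
}
```

The sketch deliberately leaves out how the caller decides which counter
ids to pass in. As noted above, that depends on whether a counter is
assigned to the total or the local event, which has to come from the
resctrl groups.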
Thanks
Babu