[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <18adb251-b340-4820-a808-e1583b44480a@amd.com>
Date: Fri, 29 Nov 2024 11:06:02 -0600
From: "Moger, Babu" <bmoger@....com>
To: Peter Newman <peternewman@...gle.com>, babu.moger@....com
Cc: Reinette Chatre <reinette.chatre@...el.com>, corbet@....net,
tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, fenghua.yu@...el.com, x86@...nel.org,
hpa@...or.com, thuth@...hat.com, paulmck@...nel.org, rostedt@...dmis.org,
akpm@...ux-foundation.org, xiongwei.song@...driver.com,
pawan.kumar.gupta@...ux.intel.com, daniel.sneddon@...ux.intel.com,
perry.yuan@....com, sandipan.das@....com, kai.huang@...el.com,
xiaoyao.li@...el.com, seanjc@...gle.com, jithu.joseph@...el.com,
brijesh.singh@....com, xin3.li@...el.com, ebiggers@...gle.com,
andrew.cooper3@...rix.com, mario.limonciello@....com, james.morse@....com,
tan.shaopeng@...itsu.com, tony.luck@...el.com, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, maciej.wieczor-retman@...el.com,
eranian@...gle.com, jpoimboe@...nel.org, thomas.lendacky@....com
Subject: Re: [PATCH v9 14/26] x86/resctrl: Introduce interface to display
number of free counters
Hi Peter, Reinette,
On 11/29/2024 3:59 AM, Peter Newman wrote:
> Hi Babu,
>
> On Thu, Nov 28, 2024 at 8:35 PM Moger, Babu <bmoger@....com> wrote:
>>
>> Hi Peter,
>>
>> On 11/28/2024 5:10 AM, Peter Newman wrote:
>>> Hi Babu, Reinette,
>>>
>>> On Wed, Nov 27, 2024 at 8:05 PM Reinette Chatre
>>> <reinette.chatre@...el.com> wrote:
>>>>
>>>> Hi Babu,
>>>>
>>>> On 11/27/24 6:57 AM, Moger, Babu wrote:
>
>>>>> 1. Each group needs to remember counter ids in each domain for each event.
>>>>> For example:
>>>>> Resctrl group mon1
>>>>> Total event
>>>>> dom 0 cntr_id 1,
>>>>> dom 1 cntr_id 10
>>>>> dom 2 cntr_id 11
>>>>>
>>>>> Local event
>>>>> dom 0 cntr_id 2,
>>>>> dom 1 cntr_id 15
>>>>> dom 2 cntr_id 10
>>>>
>>>> Indeed. The challenge here is that domains may come and go so it cannot be a simple
>>>> static array. As an alternative it can be an xarray indexed by the domain ID with
>>>> pointers to a struct like below to contain the counters associated with the monitor
>>>> group:
>>>> struct cntr_id {
>>>> u32 mbm_total;
>>>> u32 mbm_local;
>>>> }
>>>>
>>>>
>>>> Thinking more about how this array needs to be managed made me wonder how the
>>>> current implementation deals with domains that come and go. I do not think
>>>> this is currently handled. For example, if a new domain comes online and
>>>> monitoring groups had counters dynamically assigned, then these counters are
>>>> not configured to the newly online domain.
>>
>> I am trying to understand the details of your approach here.
>>>
>>> In my prototype, I allocated a counter id-indexed array to each
>>> monitoring domain structure for tracking the counter allocations,
>>> because the hardware counters are all domain-scoped. That way the
>>> tracking data goes away when the hardware does.
>>>
>>> I was focused on allowing all pending counter updates to a domain
>>> resulting from a single mbm_assign_control write to be batched and
>>> processed in a single IPI, so I structured the counter tracker
>>> something like this:
>>
>> Not sure what you meant here. How are you batching two IPIs for two domains?
>>
>> #echo "//0=t;1=t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>
>> This is still a single write. Two IPIs are sent separately, one for each
>> domain.
>>
>> Are you doing something different?
>
> I said "all pending counter updates to a domain", whereby I meant
> targeting a single domain.
>
> Depending on the CPU of the caller, your example write requires 1 or 2 IPIs.
>
> What is important is that the following write also requires 1 or 2 IPIs:
>
> (assuming /sys/fs/resctrl/mon_groups/[g1-g31] exist, line breaks added
> for readability)
>
> echo $'//0=t;1=t\n
> /g1/0=t;1=t\n
> /g2/0=t;1=t\n
> /g3/0=t;1=t\n
> /g4/0=t;1=t\n
> /g5/0=t;1=t\n
> /g6/0=t;1=t\n
> /g7/0=t;1=t\n
> /g8/0=t;1=t\n
> /g9/0=t;1=t\n
> /g10/0=t;1=t\n
> /g11/0=t;1=t\n
> /g12/0=t;1=t\n
> /g13/0=t;1=t\n
> /g14/0=t;1=t\n
> /g15/0=t;1=t\n
> /g16/0=t;1=t\n
> /g17/0=t;1=t\n
> /g18/0=t;1=t\n
> /g19/0=t;1=t\n
> /g20/0=t;1=t\n
> /g21/0=t;1=t\n
> /g22/0=t;1=t\n
> /g23/0=t;1=t\n
> /g24/0=t;1=t\n
> /g25/0=t;1=t\n
> /g26/0=t;1=t\n
> /g27/0=t;1=t\n
> /g28/0=t;1=t\n
> /g29/0=t;1=t\n
> /g30/0=t;1=t\n
> /g31/0=t;1=t\n'
>
> My ultimate goal is for a thread bound to a particular domain to be
> able to unassign and reassign the local domain's 32 counters in a
> single write() with no IPIs at all. And when IPIs are required, then
> no more than one per domain, regardless of the number of groups
> updated.
>
Yes. I think I got the idea. Thanks.
>
>>
>>>
>>> struct resctrl_monitor_cfg {
>>> int closid;
>>> int rmid;
>>> int evtid;
>>> bool dirty;
>>> };
>>>
>>> This mirrors the info needed in whatever register configures the
>>> counter, plus a dirty flag to skip over the ones that don't need to be
>>> updated.
>>
>> This is what my understanding of your implementation.
>>
>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>> index d94abba1c716..9cebf065cc97 100644
>> --- a/include/linux/resctrl.h
>> +++ b/include/linux/resctrl.h
>> @@ -94,6 +94,13 @@ struct rdt_ctrl_domain {
>> u32 *mbps_val;
>> };
>>
>> +struct resctrl_monitor_cfg {
>> + int closid;
>> + int rmid;
>> + int evtid;
>> + bool dirty;
>> +};
>> +
>> /**
>> * struct rdt_mon_domain - group of CPUs sharing a resctrl monitor
>> resource
>> * @hdr: common header for different domain types
>> @@ -116,6 +123,7 @@ struct rdt_mon_domain {
>> struct delayed_work cqm_limbo;
>> int mbm_work_cpu;
>> int cqm_work_cpu;
>> + /* Allocate num_mbm_cntrs entries in each domain */
>> + struct resctrl_monitor_cfg *mon_cfg;
>> };
>>
>>
>> When a user requests an assignment for total event to the default group
>> for domain 0, you go search in rdt_mon_domain(dom 0) for empty mon_cfg
>> entry.
>>
>> If there is an empty entry, then use that entry for assignment and
>> update closid, rmid, evtid and dirty = 1. We can get all these
>> information from default group here.
>>
>> Does this make sense?
>
> Yes, sounds correct.
I will probably add cntr_id in resctrl_monitor_cfg structure and
initialize during the allocation. And rename the field 'dirty' to
'active'(or something similar) to hold the assign state for that entry.
That way we have all the information required for assignment at one
place. We don't need to update the rdtgroup structure.
Reinette, What do you think about this approach?
>
>>
>>>
>>> For the benefit of displaying mbm_assign_control, I put a pointer back
>>> to any counter array entry allocated in the mbm_state struct only
>>> because it's an existing structure that exists for every rmid-domain
>>> combination.
>>
>> Pointer in mbm_state may not be required here.
>>
>> We are going to loop over resctrl groups. We can search the
>> rdt_mon_domain to see if specific closid, rmid, evtid is already
>> assigned or not in that domain.
>
> No, not required I guess. High-performance CPUs can probably search a
> 32-entry array very quickly.
Ok.
--
- Babu Moger
Powered by blists - more mailing lists