lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <31b0966e-a795-488b-8277-b5bc0f3558d0@amd.com>
Date: Mon, 2 Dec 2024 09:02:43 -0600
From: "Moger, Babu" <babu.moger@....com>
To: Peter Newman <peternewman@...gle.com>
Cc: Reinette Chatre <reinette.chatre@...el.com>, corbet@....net,
 tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
 dave.hansen@...ux.intel.com, fenghua.yu@...el.com, x86@...nel.org,
 hpa@...or.com, thuth@...hat.com, paulmck@...nel.org, rostedt@...dmis.org,
 akpm@...ux-foundation.org, xiongwei.song@...driver.com,
 pawan.kumar.gupta@...ux.intel.com, daniel.sneddon@...ux.intel.com,
 perry.yuan@....com, sandipan.das@....com, kai.huang@...el.com,
 xiaoyao.li@...el.com, seanjc@...gle.com, jithu.joseph@...el.com,
 brijesh.singh@....com, xin3.li@...el.com, ebiggers@...gle.com,
 andrew.cooper3@...rix.com, mario.limonciello@....com, james.morse@....com,
 tan.shaopeng@...itsu.com, tony.luck@...el.com, linux-doc@...r.kernel.org,
 linux-kernel@...r.kernel.org, maciej.wieczor-retman@...el.com,
 eranian@...gle.com, jpoimboe@...nel.org, thomas.lendacky@....com
Subject: Re: [PATCH v9 14/26] x86/resctrl: Introduce interface to display
 number of free counters

Hi Peter,

On 12/2/24 04:43, Peter Newman wrote:
> Hi Babu,
> 
> On Fri, Nov 29, 2024 at 6:06 PM Moger, Babu <bmoger@....com> wrote:
>>
>> Hi Peter, Reinette,
>>
>> On 11/29/2024 3:59 AM, Peter Newman wrote:
>>> Hi Babu,
>>>
>>> On Thu, Nov 28, 2024 at 8:35 PM Moger, Babu <bmoger@....com> wrote:
>>>>
>>>> Hi Peter,
>>>>
>>>> On 11/28/2024 5:10 AM, Peter Newman wrote:
> 
>>>>> In my prototype, I allocated a counter id-indexed array to each
>>>>> monitoring domain structure for tracking the counter allocations,
>>>>> because the hardware counters are all domain-scoped. That way the
>>>>> tracking data goes away when the hardware does.
>>>>>
>>>>> I was focused on allowing all pending counter updates to a domain
>>>>> resulting from a single mbm_assign_control write to be batched and
>>>>> processed in a single IPI, so I structured the counter tracker
>>>>> something like this:
>>>>
>>>> Not sure what you meant here. How are you batching two IPIs for two domains?
>>>>
>>>> #echo "//0=t;1=t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
>>>>
>>>> This is still a single write. Two IPIs are sent separately, one for each
>>>> domain.
>>>>
>>>> Are you doing something different?
>>>
>>> I said "all pending counter updates to a domain", whereby I meant
>>> targeting a single domain.
>>>
>>> Depending on the CPU of the caller, your example write requires 1 or 2 IPIs.
>>>
>>> What is important is that the following write also requires 1 or 2 IPIs:
>>>
>>> (assuming /sys/fs/resctrl/mon_groups/[g1-g31] exist, line breaks added
>>> for readability)
>>>
>>> echo $'//0=t;1=t\n
>>> /g1/0=t;1=t\n
>>> /g2/0=t;1=t\n
>>> /g3/0=t;1=t\n
>>> /g4/0=t;1=t\n
>>> /g5/0=t;1=t\n
>>> /g6/0=t;1=t\n
>>> /g7/0=t;1=t\n
>>> /g8/0=t;1=t\n
>>> /g9/0=t;1=t\n
>>> /g10/0=t;1=t\n
>>> /g11/0=t;1=t\n
>>> /g12/0=t;1=t\n
>>> /g13/0=t;1=t\n
>>> /g14/0=t;1=t\n
>>> /g15/0=t;1=t\n
>>> /g16/0=t;1=t\n
>>> /g17/0=t;1=t\n
>>> /g18/0=t;1=t\n
>>> /g19/0=t;1=t\n
>>> /g20/0=t;1=t\n
>>> /g21/0=t;1=t\n
>>> /g22/0=t;1=t\n
>>> /g23/0=t;1=t\n
>>> /g24/0=t;1=t\n
>>> /g25/0=t;1=t\n
>>> /g26/0=t;1=t\n
>>> /g27/0=t;1=t\n
>>> /g28/0=t;1=t\n
>>> /g29/0=t;1=t\n
>>> /g30/0=t;1=t\n
>>> /g31/0=t;1=t\n'
>>>
>>> My ultimate goal is for a thread bound to a particular domain to be
>>> able to unassign and reassign the local domain's 32 counters in a
>>> single write() with no IPIs at all. And when IPIs are required, then
>>> no more than one per domain, regardless of the number of groups
>>> updated.
>>>
>>
>> Yes. I think I got the idea. Thanks.
>>
>>>
>>>>
>>>>>
>>>>> struct resctrl_monitor_cfg {
>>>>>       int closid;
>>>>>       int rmid;
>>>>>       int evtid;
>>>>>       bool dirty;
>>>>> };
>>>>>
>>>>> This mirrors the info needed in whatever register configures the
>>>>> counter, plus a dirty flag to skip over the ones that don't need to be
>>>>> updated.
>>>>
>>>> This is what my understanding of your implementation.
>>>>
>>>> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
>>>> index d94abba1c716..9cebf065cc97 100644
>>>> --- a/include/linux/resctrl.h
>>>> +++ b/include/linux/resctrl.h
>>>> @@ -94,6 +94,13 @@ struct rdt_ctrl_domain {
>>>>           u32                             *mbps_val;
>>>>    };
>>>>
>>>> +struct resctrl_monitor_cfg {
>>>> +    int closid;
>>>> +    int rmid;
>>>> +    int evtid;
>>>> +    bool dirty;
>>>> +};
>>>> +
>>>>    /**
>>>>     * struct rdt_mon_domain - group of CPUs sharing a resctrl monitor
>>>> resource
>>>>     * @hdr:               common header for different domain types
>>>> @@ -116,6 +123,7 @@ struct rdt_mon_domain {
>>>>           struct delayed_work             cqm_limbo;
>>>>           int                             mbm_work_cpu;
>>>>           int                             cqm_work_cpu;
>>>> +     /* Allocate num_mbm_cntrs entries in each domain */
>>>> +       struct resctrl_monitor_cfg      *mon_cfg;
>>>>    };
>>>>
>>>>
>>>> When a user requests an assignment for total event to the default group
>>>> for domain 0, you go search in rdt_mon_domain(dom 0) for empty mon_cfg
>>>> entry.
>>>>
>>>> If there is an empty entry, then use that entry for assignment and
>>>> update closid, rmid, evtid and dirty = 1. We can get all these
>>>> information from default group here.
>>>>
>>>> Does this make sense?
>>>
>>> Yes, sounds correct.
>>
>> I will probably add cntr_id in resctrl_monitor_cfg structure and
>> initialize during the allocation. And rename the field 'dirty' to
>> 'active'(or something similar) to hold the assign state for that entry.
>> That way we have all the information required for assignment at one
>> place. We don't need to update the rdtgroup structure.
> 
> It concerns me that you want to say "active" instead of "dirty". What
> I'm proposing is a write-back cache of the config values so that a
> series of remote updates to many groups can be written back to
> hardware all at once.
> 
> Therefore we want to track which entries are "dirty", implying that
> they differ from what was last written to the registers and therefore
> need to be flushed by the remote domain. Whether the counter is
> enabled or not is already implicit in the configuration values (evtid
> != 0).
> 

That is correct. But I wanted to add the "state" explicitly. Makes it easy
to search. We can overload it if you want both.

int state;

#define ASSIGN_STATE_ACTIVE  BIT(0)
#define ASSIGN_STATE_DIRTY   BIT(1)
-- 
Thanks
Babu Moger

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ