[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALPaoCjvpp3eKKV9L_UJ5Axp1u8XBDtgsuHrhOJhmDrFVphrzA@mail.gmail.com>
Date: Mon, 2 Dec 2024 11:43:26 +0100
From: Peter Newman <peternewman@...gle.com>
To: babu.moger@....com
Cc: Reinette Chatre <reinette.chatre@...el.com>, corbet@....net, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
fenghua.yu@...el.com, x86@...nel.org, hpa@...or.com, thuth@...hat.com,
paulmck@...nel.org, rostedt@...dmis.org, akpm@...ux-foundation.org,
xiongwei.song@...driver.com, pawan.kumar.gupta@...ux.intel.com,
daniel.sneddon@...ux.intel.com, perry.yuan@....com, sandipan.das@....com,
kai.huang@...el.com, xiaoyao.li@...el.com, seanjc@...gle.com,
jithu.joseph@...el.com, brijesh.singh@....com, xin3.li@...el.com,
ebiggers@...gle.com, andrew.cooper3@...rix.com, mario.limonciello@....com,
james.morse@....com, tan.shaopeng@...itsu.com, tony.luck@...el.com,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
maciej.wieczor-retman@...el.com, eranian@...gle.com, jpoimboe@...nel.org,
thomas.lendacky@....com
Subject: Re: [PATCH v9 14/26] x86/resctrl: Introduce interface to display
number of free counters
Hi Babu,
On Fri, Nov 29, 2024 at 6:06 PM Moger, Babu <bmoger@....com> wrote:
>
> Hi Peter, Reinette,
>
> On 11/29/2024 3:59 AM, Peter Newman wrote:
> > Hi Babu,
> >
> > On Thu, Nov 28, 2024 at 8:35 PM Moger, Babu <bmoger@....com> wrote:
> >>
> >> Hi Peter,
> >>
> >> On 11/28/2024 5:10 AM, Peter Newman wrote:
> >>> In my prototype, I allocated a counter id-indexed array to each
> >>> monitoring domain structure for tracking the counter allocations,
> >>> because the hardware counters are all domain-scoped. That way the
> >>> tracking data goes away when the hardware does.
> >>>
> >>> I was focused on allowing all pending counter updates to a domain
> >>> resulting from a single mbm_assign_control write to be batched and
> >>> processed in a single IPI, so I structured the counter tracker
> >>> something like this:
> >>
> >> Not sure what you meant here. How are you batching two IPIs for two domains?
> >>
> >> #echo "//0=t;1=t" > /sys/fs/resctrl/info/L3_MON/mbm_assign_control
> >>
> >> This is still a single write. Two IPIs are sent separately, one for each
> >> domain.
> >>
> >> Are you doing something different?
> >
> > I said "all pending counter updates to a domain", whereby I meant
> > targeting a single domain.
> >
> > Depending on the CPU of the caller, your example write requires 1 or 2 IPIs.
> >
> > What is important is that the following write also requires 1 or 2 IPIs:
> >
> > (assuming /sys/fs/resctrl/mon_groups/[g1-g31] exist, line breaks added
> > for readability)
> >
> > echo $'//0=t;1=t\n
> > /g1/0=t;1=t\n
> > /g2/0=t;1=t\n
> > /g3/0=t;1=t\n
> > /g4/0=t;1=t\n
> > /g5/0=t;1=t\n
> > /g6/0=t;1=t\n
> > /g7/0=t;1=t\n
> > /g8/0=t;1=t\n
> > /g9/0=t;1=t\n
> > /g10/0=t;1=t\n
> > /g11/0=t;1=t\n
> > /g12/0=t;1=t\n
> > /g13/0=t;1=t\n
> > /g14/0=t;1=t\n
> > /g15/0=t;1=t\n
> > /g16/0=t;1=t\n
> > /g17/0=t;1=t\n
> > /g18/0=t;1=t\n
> > /g19/0=t;1=t\n
> > /g20/0=t;1=t\n
> > /g21/0=t;1=t\n
> > /g22/0=t;1=t\n
> > /g23/0=t;1=t\n
> > /g24/0=t;1=t\n
> > /g25/0=t;1=t\n
> > /g26/0=t;1=t\n
> > /g27/0=t;1=t\n
> > /g28/0=t;1=t\n
> > /g29/0=t;1=t\n
> > /g30/0=t;1=t\n
> > /g31/0=t;1=t\n'
> >
> > My ultimate goal is for a thread bound to a particular domain to be
> > able to unassign and reassign the local domain's 32 counters in a
> > single write() with no IPIs at all. And when IPIs are required, then
> > no more than one per domain, regardless of the number of groups
> > updated.
> >
>
> Yes. I think I got the idea. Thanks.
>
> >
> >>
> >>>
> >>> struct resctrl_monitor_cfg {
> >>> int closid;
> >>> int rmid;
> >>> int evtid;
> >>> bool dirty;
> >>> };
> >>>
> >>> This mirrors the info needed in whatever register configures the
> >>> counter, plus a dirty flag to skip over the ones that don't need to be
> >>> updated.
> >>
> >> This is what my understanding of your implementation.
> >>
> >> diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
> >> index d94abba1c716..9cebf065cc97 100644
> >> --- a/include/linux/resctrl.h
> >> +++ b/include/linux/resctrl.h
> >> @@ -94,6 +94,13 @@ struct rdt_ctrl_domain {
> >> u32 *mbps_val;
> >> };
> >>
> >> +struct resctrl_monitor_cfg {
> >> + int closid;
> >> + int rmid;
> >> + int evtid;
> >> + bool dirty;
> >> +};
> >> +
> >> /**
> >> * struct rdt_mon_domain - group of CPUs sharing a resctrl monitor
> >> resource
> >> * @hdr: common header for different domain types
> >> @@ -116,6 +123,7 @@ struct rdt_mon_domain {
> >> struct delayed_work cqm_limbo;
> >> int mbm_work_cpu;
> >> int cqm_work_cpu;
> >> + /* Allocate num_mbm_cntrs entries in each domain */
> >> + struct resctrl_monitor_cfg *mon_cfg;
> >> };
> >>
> >>
> >> When a user requests an assignment for total event to the default group
> >> for domain 0, you go search in rdt_mon_domain(dom 0) for empty mon_cfg
> >> entry.
> >>
> >> If there is an empty entry, then use that entry for assignment and
> >> update closid, rmid, evtid and dirty = 1. We can get all these
> >> information from default group here.
> >>
> >> Does this make sense?
> >
> > Yes, sounds correct.
>
> I will probably add cntr_id in resctrl_monitor_cfg structure and
> initialize during the allocation. And rename the field 'dirty' to
> 'active'(or something similar) to hold the assign state for that entry.
> That way we have all the information required for assignment at one
> place. We don't need to update the rdtgroup structure.
It concerns me that you want to say "active" instead of "dirty". What
I'm proposing is a write-back cache of the config values so that a
series of remote updates to many groups can be written back to
hardware all at once.
Therefore we want to track which entries are "dirty", implying that
they differ from what was last written to the registers and therefore
need to be flushed by the remote domain. Whether the counter is
enabled or not is already implicit in the configuration values (evtid
!= 0).
-Peter
Powered by blists - more mailing lists