[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <19565646-074d-4c53-aa30-2b6145429666@intel.com>
Date: Thu, 20 Jun 2024 14:31:13 -0700
From: Reinette Chatre <reinette.chatre@...el.com>
To: Tony Luck <tony.luck@...el.com>, Fenghua Yu <fenghua.yu@...el.com>,
"Maciej Wieczor-Retman" <maciej.wieczor-retman@...el.com>, Peter Newman
<peternewman@...gle.com>, James Morse <james.morse@....com>, Babu Moger
<babu.moger@....com>, Drew Fustini <dfustini@...libre.com>, Dave Martin
<Dave.Martin@....com>
CC: <x86@...nel.org>, <linux-kernel@...r.kernel.org>,
<patches@...ts.linux.dev>
Subject: Re: [PATCH v20 14/18] x86/resctrl: Fill out rmid_read structure for
smp_call*() to read a counter
Hi Tony,
On 6/10/24 11:35 AM, Tony Luck wrote:
> mon_event_read() fills out most fields of the struct rmid_read that is
> passed via an smp_call*() function to a CPU that is part of the correct
> domain to read the monitor counters.
>
> With Sub-NUMA Cluster (SNC) mode there are now two cases to handle:
>
> 1) Reading a file that returns a value for a single domain.
> + Choose the CPU to execute from the domain cpu_mask
>
> 2) Reading a file that must sum across domains sharing an L3 cache
> instance.
> + Indicate to called code that a sum is needed by passing a NULL
> rdt_mon_domain pointer.
> + Choose the CPU from the L3 shared_cpu_map.
>
> Signed-off-by: Tony Luck <tony.luck@...el.com>
The kerneldoc in patch #9 introduces new requirements related to
struct rmid_read wrt how NULL values are interpreted and
used. This makes it essential that struct rmid_read is always initialized
correctly and should no longer consist of whatever is on the stack. I
mentioned in response to v19 that static checkers found issues here.
I understand that mbm_update() always sets the domain in
struct rmid_read, but I do not find it acceptable that it
passes garbage as the cacheinfo pointer based on subtle assumptions on
when/how __mon_event_count() uses this field.
> ---
> arch/x86/kernel/cpu/resctrl/internal.h | 2 +-
> arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 39 ++++++++++++++++++-----
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +-
> 3 files changed, 33 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 77da29ced7eb..75bb1afc4842 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -627,7 +627,7 @@ void mon_event_count(void *info);
> int rdtgroup_mondata_show(struct seq_file *m, void *arg);
> void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
> struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
> - int evtid, int first);
> + cpumask_t *cpumask, int evtid, int first);
> void mbm_setup_overflow_handler(struct rdt_mon_domain *dom,
> unsigned long delay_ms,
> int exclude_cpu);
> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> index 3b9383612c35..5a43931fd423 100644
> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> @@ -515,7 +515,7 @@ static int smp_mon_event_count(void *arg)
>
> void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
> struct rdt_mon_domain *d, struct rdtgroup *rdtgrp,
> - int evtid, int first)
> + cpumask_t *cpumask, int evtid, int first)
> {
> int cpu;
>
> @@ -537,7 +537,7 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
> return;
> }
>
> - cpu = cpumask_any_housekeeping(&d->hdr.cpu_mask, RESCTRL_PICK_ANY_CPU);
> + cpu = cpumask_any_housekeeping(cpumask, RESCTRL_PICK_ANY_CPU);
>
> /*
> * cpumask_any_housekeeping() prefers housekeeping CPUs, but
> @@ -546,7 +546,7 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
> * counters on some platforms if its called in IRQ context.
> */
> if (tick_nohz_full_cpu(cpu))
> - smp_call_function_any(&d->hdr.cpu_mask, mon_event_count, rr, 1);
> + smp_call_function_any(cpumask, mon_event_count, rr, 1);
> else
> smp_call_on_cpu(cpu, smp_mon_event_count, rr, false);
>
> @@ -575,16 +575,39 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
> resid = md.u.rid;
> domid = md.u.domid;
> evtid = md.u.evtid;
> -
> r = &rdt_resources_all[resid].r_resctrl;
> - hdr = rdt_find_domain(&r->mon_domains, domid, NULL);
> - if (!hdr || WARN_ON_ONCE(hdr->type != RESCTRL_MON_DOMAIN)) {
> +
> + if (md.u.sum) {
> + /*
> + * This file requires summing across all SNC domains that share
> + * the L3 cache id that was provided in the "domid" field of the
> + * mon_data_bits union. Search all domains in the resource for
> + * one that matches this cache id.
> + */
> + list_for_each_entry(d, &r->mon_domains, hdr.list) {
> + if (d->ci->id == domid) {
> + rr.ci = d->ci;
> + mon_event_read(&rr, r, NULL, rdtgrp, &d->ci->shared_cpu_map, evtid, false);
Please split this line, it is over 100
Reinette
Powered by blists - more mailing lists