Message-ID: <70403b1c-d81f-4c5f-936e-f3cf3308822f@amd.com>
Date: Tue, 22 Apr 2025 12:06:12 -0500
From: "Moger, Babu" <bmoger@....com>
To: James Morse <james.morse@....com>, x86@...nel.org,
linux-kernel@...r.kernel.org
Cc: Reinette Chatre <reinette.chatre@...el.com>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>, H Peter Anvin <hpa@...or.com>,
Babu Moger <Babu.Moger@....com>, shameerali.kolothum.thodi@...wei.com,
D Scott Phillips OS <scott@...amperecomputing.com>,
carl@...amperecomputing.com, lcherian@...vell.com,
bobo.shaobowang@...wei.com, tan.shaopeng@...itsu.com,
baolin.wang@...ux.alibaba.com, Jamie Iles <quic_jiles@...cinc.com>,
Xin Hao <xhao@...ux.alibaba.com>, peternewman@...gle.com,
dfustini@...libre.com, amitsinght@...vell.com,
David Hildenbrand <david@...hat.com>, Rex Nie <rex.nie@...uarmicro.com>,
Dave Martin <dave.martin@....com>, Koba Ko <kobak@...dia.com>,
Shanker Donthineni <sdonthineni@...dia.com>, fenghuay@...dia.com,
Tony Luck <tony.luck@...el.com>
Subject: Re: [PATCH v8 08/21] x86/resctrl: Expand the width of dom_id by
replacing mon_data_bits
Hi James,
On 4/11/2025 11:42 AM, James Morse wrote:
> MPAM platforms retrieve the cache-id property from the ACPI PPTT table.
> The cache-id field is 32 bits wide. Under resctrl, the cache-id becomes
> the domain-id, and is packed into the mon_data_bits union bitfield.
> The width of cache-id in this field is 14 bits.
>
> Expanding the union would break 32bit x86 platforms as this union is
> stored as the kernfs kn->priv pointer. This saved allocating memory
> for the priv data storage.
>
> The firmware on MPAM platforms have used the PPTT cache-id field to
> expose the interconnect's id for the cache, which is sparse and uses
> more than 14 bits. Use of this id is to enable PCIe direct cache
> injection hints. Using this feature with VFIO means the value provided
> by the ACPI table should be exposed to user-space.
>
> To support cache-id values greater than 14 bits, convert the
> mon_data_bits union to a structure. These are shared between control
> and monitor groups, and are allocated on first use. The list of
> allocated struct mon_data is free'd when the filesystem is umount()ed.
>
> Co-developed-by: Tony Luck <tony.luck@...el.com>
> Signed-off-by: Tony Luck <tony.luck@...el.com>
> Signed-off-by: James Morse <james.morse@....com>
> ---
> Previously the MPAM tree repainted the cache-id to compact them,
> argue-ing there was no other user. With VFIO use of this PCIe feature,
> this is no longer an option.
>
> Changes since v7:
> * Replaced with Tony Luck's list based version.
>
> Changes since v6:
> * Added the get/put helpers.
> * Special case the creation of the mondata files for the default control
> group.
> * Removed wording about files living longer than expected, the corresponding
> error handling is wrapped in WARN_ON_ONCE() as this indicates a bug.
> ---
> arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 19 ++++--
> arch/x86/kernel/cpu/resctrl/internal.h | 39 ++++++------
> arch/x86/kernel/cpu/resctrl/rdtgroup.c | 78 +++++++++++++++++++++--
> 3 files changed, 102 insertions(+), 34 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> index 0a0ac5f6112e..159972c3fe73 100644
> --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
> @@ -667,7 +667,7 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
> u32 resid, evtid, domid;
> struct rdtgroup *rdtgrp;
> struct rdt_resource *r;
> - union mon_data_bits md;
> + struct mon_data *md;
> int ret = 0;
>
> rdtgrp = rdtgroup_kn_lock_live(of->kn);
> @@ -676,17 +676,22 @@ int rdtgroup_mondata_show(struct seq_file *m, void *arg)
> goto out;
> }
>
> - md.priv = of->kn->priv;
> - resid = md.u.rid;
> - domid = md.u.domid;
> - evtid = md.u.evtid;
> + md = of->kn->priv;
> + if (WARN_ON_ONCE(!md)) {
> + ret = -EIO;
> + goto out;
> + }
> +
> + resid = md->rid;
> + domid = md->domid;
> + evtid = md->evtid;
> r = resctrl_arch_get_resource(resid);
>
> - if (md.u.sum) {
> + if (md->sum) {
> /*
> * This file requires summing across all domains that share
> * the L3 cache id that was provided in the "domid" field of the
> - * mon_data_bits union. Search all domains in the resource for
> + * struct mon_data. Search all domains in the resource for
> * one that matches this cache id.
> */
> list_for_each_entry(d, &r->mon_domains, hdr.list) {
> diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
> index 36a862a4832f..d932dd1eaa74 100644
> --- a/arch/x86/kernel/cpu/resctrl/internal.h
> +++ b/arch/x86/kernel/cpu/resctrl/internal.h
> @@ -103,27 +103,26 @@ struct mon_evt {
> };
>
> /**
> - * union mon_data_bits - Monitoring details for each event file.
> - * @priv: Used to store monitoring event data in @u
> - * as kernfs private data.
> - * @u.rid: Resource id associated with the event file.
> - * @u.evtid: Event id associated with the event file.
> - * @u.sum: Set when event must be summed across multiple
> - * domains.
> - * @u.domid: When @u.sum is zero this is the domain to which
> - * the event file belongs. When @sum is one this
> - * is the id of the L3 cache that all domains to be
> - * summed share.
> - * @u: Name of the bit fields struct.
> + * struct mon_data - Monitoring details for each event file.
> + * @list: Member of list of all allocated structures.
> + * @rid: Resource id associated with the event file.
> + * @evtid: Event id associated with the event file.
> + * @sum: Set when event must be summed across multiple
> + * domains.
> + * @domid: When @sum is zero this is the domain to which
> + * the event file belongs. When @sum is one this
> + * is the id of the L3 cache that all domains to be
> + * summed share.
> + *
> + * Stored in the kernfs kn->priv field, readers and writers must hold
> + * rdtgroup_mutex.
> */
> -union mon_data_bits {
> - void *priv;
> - struct {
> - unsigned int rid : 10;
> - enum resctrl_event_id evtid : 7;
> - unsigned int sum : 1;
> - unsigned int domid : 14;
> - } u;
> +struct mon_data {
> + struct list_head list;
> + unsigned int rid;
> + enum resctrl_event_id evtid;
> + unsigned int sum;
> + unsigned int domid;
> };
>
> /**
> diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> index c69ed978aa50..aa0bc57e1c7f 100644
> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
> @@ -45,6 +45,12 @@ LIST_HEAD(rdt_all_groups);
> /* list of entries for the schemata file */
> LIST_HEAD(resctrl_schema_all);
>
> +/*
> + * List of struct mon_data 'priv' structures for rdtgroup_mondata_show().
> + * Protected by rdtgroup_mutex.
> + */
> +static LIST_HEAD(kn_priv_list);
> +
Do we really need to maintain a separate list for all the private pointers?

Here is my understanding of the patch; please correct me if I am missing
anything:

Patch Requirements:
1. Expand the width of dom_id beyond the current 14 bits.
2. Pack all necessary data (dom_id, event_id, resid) into the
of->kn->priv pointer when creating event files in the mon_data
directory for each domain.
3. Dynamically allocate the priv structure during event file creation
for each domain.
4. Free the priv structure when the mon_data directory is deleted in
each domain.
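
Requirement 1 can be illustrated with a small userspace model. This is only a sketch: the struct and function names below are hypothetical stand-ins, evtid is modelled as a plain unsigned int rather than the enum, and only the field widths (10 + 7 + 1 + 14) are taken from the quoted diff.

```c
#include <stdint.h>

/*
 * Hypothetical model of the bitfield layout in the removed
 * union mon_data_bits: 10 + 7 + 1 + 14 = 32 bits in total.
 */
struct packed_bits_model {
	unsigned int rid   : 10;
	unsigned int evtid : 7;
	unsigned int sum   : 1;
	unsigned int domid : 14;
};

/* Hypothetical model of the replacement struct with full-width fields. */
struct wide_model {
	unsigned int rid;
	unsigned int evtid;
	unsigned int sum;
	unsigned int domid;
};

/* Store a 32-bit cache id in the 14-bit field and read it back. */
static uint32_t domid_via_bitfield(uint32_t cache_id)
{
	struct packed_bits_model p = { 0 };

	p.domid = cache_id;	/* value is reduced modulo 2^14 */
	return p.domid;
}

/* Store the same cache id in the full-width field and read it back. */
static uint32_t domid_via_struct(uint32_t cache_id)
{
	struct wide_model w = { 0 };

	w.domid = cache_id;
	return w.domid;
}
```

A cache id that needs more than 14 bits loses its upper bits in the first helper and survives intact in the second, which is the motivation for moving away from the packed union.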

From what I can see, the global list "kn_priv_list" seems relevant only
for step 4.

Wouldn’t it be possible to handle this directly in rdtgroup_rmdir_mon()
and rdtgroup_rmdir_ctrl()?

We could retrieve the kernfs_node for each file using kernfs_find_and_get().

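
The alternative raised here, looking each event file up by name and freeing its priv data at removal time instead of walking a global list, can be modelled in userspace as below. This is a sketch under stated assumptions, not kernel code: dir_model, node_model, mon_data_model and the three helpers are hypothetical stand-ins, with dir_find() playing the role of a by-name lookup such as kernfs_find_and_get(). It also assumes each file owns its priv data exclusively; the quoted commit message says the structures are shared between control and monitor groups, so a real implementation would need to account for that sharing.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-ins; these are not the kernel's types. */
struct mon_data_model {
	unsigned int rid;
	unsigned int evtid;
	unsigned int sum;
	unsigned int domid;
};

struct node_model {
	const char *name;	/* event file name */
	void *priv;		/* models kn->priv */
};

#define MAX_FILES 4

struct dir_model {
	struct node_model files[MAX_FILES];
	int nr_files;
};

/* Stand-in for a by-name lookup such as kernfs_find_and_get(). */
static struct node_model *dir_find(struct dir_model *dir, const char *name)
{
	for (int i = 0; i < dir->nr_files; i++) {
		if (strcmp(dir->files[i].name, name) == 0)
			return &dir->files[i];
	}
	return NULL;
}

/* Create an event file and allocate its priv data (step 3). */
static int dir_add_event(struct dir_model *dir, const char *name,
			 unsigned int rid, unsigned int evtid,
			 unsigned int domid)
{
	struct mon_data_model *md;

	if (dir->nr_files >= MAX_FILES)
		return -1;

	md = calloc(1, sizeof(*md));
	if (!md)
		return -1;

	md->rid = rid;
	md->evtid = evtid;
	md->domid = domid;

	dir->files[dir->nr_files].name = name;
	dir->files[dir->nr_files].priv = md;
	dir->nr_files++;
	return 0;
}

/*
 * Free the priv data of each named file before the directory goes away
 * (step 4). Returns the number of priv structures freed.
 */
static int dir_free_privs(struct dir_model *dir, const char *const *names,
			  int nr_names)
{
	int freed = 0;

	for (int i = 0; i < nr_names; i++) {
		struct node_model *kn = dir_find(dir, names[i]);

		if (!kn || !kn->priv)
			continue;
		free(kn->priv);
		kn->priv = NULL;
		freed++;
	}
	return freed;
}
```

In this model no global list is needed because the directory itself is the index of what was allocated. The cost is one lookup per event file at removal time.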
Thanks,
Babu