linux-kernel - Re: [PATCH 1/5] cacheinfo: Set cache 'id' based on DT data

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAL_Jsq+vA1AR+P0se3njU-_P0cjBM1h-4YGAShFZ-gy53SkZfw@mail.gmail.com>
Date: Mon, 23 Jun 2025 09:18:59 -0500
From: Rob Herring <robh@...nel.org>
To: Jonathan Cameron <Jonathan.Cameron@...wei.com>
Cc: James Morse <james.morse@....com>, linux-kernel@...r.kernel.org, 
	linux-arm-kernel@...ts.infradead.org, 
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>, "Rafael J . Wysocki" <rafael@...nel.org>, sudeep.holla@....com, 
	Ben Horgan <ben.horgan@....com>
Subject: Re: [PATCH 1/5] cacheinfo: Set cache 'id' based on DT data

On Tue, Jun 17, 2025 at 11:03 AM Jonathan Cameron
<Jonathan.Cameron@...wei.com> wrote:
>
> On Fri, 13 Jun 2025 13:03:52 +0000
> James Morse <james.morse@....com> wrote:
>
> > From: Rob Herring <robh@...nel.org>
> >
> > Use the minimum CPU h/w id of the CPUs associated with the cache for the
> > cache 'id'. This will provide a stable id value for a given system. As
> > we need to check all possible CPUs, we can't use the shared_cpu_map
> > which is just online CPUs. As there's not a cache to CPUs mapping in DT,
> > we have to walk all CPU nodes and then walk cache levels.
>
> Is it ok for these to match for different levels?  I've no idea for
> these use cases.

Yes. The 'id' is per level, not globally unique.

> >
> > The cache_id exposed to user-space has historically been 32 bits, and
> > is too late to change. Give up on assigning cache-id's if a CPU h/w
> > id greater than 32 bits is found.
> >
> > Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> > Cc: "Rafael J. Wysocki" <rafael@...nel.org>
> > Signed-off-by: Rob Herring <robh@...nel.org>
> > [ ben: converted to use the __free cleanup idiom ]
> > Signed-off-by: Ben Horgan <ben.horgan@....com>
> > [ morse: Add checks to give up if a value larger than 32 bits is seen. ]
> > Signed-off-by: James Morse <james.morse@....com>
>
> Hi James, Rob,
>
> Mainly a couple of questions for Rob on the fun of scoped cleanup being
> used for some of the iterators in a similar fashion to already
> done for looping over child nodes etc.
>
> > ---
> > Use as a 32bit value has been seen in DPDK patches here:
> > http://inbox.dpdk.org/dev/20241021015246.304431-2-wathsala.vithanage@arm.com/
> > ---
> >  drivers/base/cacheinfo.c | 33 +++++++++++++++++++++++++++++++++
> >  1 file changed, 33 insertions(+)
> >
> > diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> > index cf0d455209d7..9888d87840a2 100644
> > --- a/drivers/base/cacheinfo.c
> > +++ b/drivers/base/cacheinfo.c
> > @@ -8,6 +8,7 @@
> >  #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> >
> >  #include <linux/acpi.h>
> > +#include <linux/bitfield.h>
> >  #include <linux/bitops.h>
> >  #include <linux/cacheinfo.h>
> >  #include <linux/compiler.h>
> > @@ -183,6 +184,37 @@ static bool cache_node_is_unified(struct cacheinfo *this_leaf,
> >       return of_property_read_bool(np, "cache-unified");
> >  }
> >
> > +static void cache_of_set_id(struct cacheinfo *this_leaf, struct device_node *np)
> > +{
> > +     struct device_node *cpu;
> > +     u32 min_id = ~0;
> > +
> > +     for_each_of_cpu_node(cpu) {
>
> Rob is it worth a scoped variant of this one?  I've come across
> this a few times recently and it irritates me but I didn't feel
> there were necessarily enough cases to bother.  With one more
> maybe it is time to do it (maybe 10+ from a quick look)_.

My question on all of these (though more so for drivers), is why are
we parsing CPU nodes again? Ideally, we'd parse the CPU and cache
nodes only once and the kernel would provide the necessary info.

Take drivers/clk/mvebu/ap-cpu-clk.c for example. The code there really
just needs to know if there are 2 or 4 possible CPUs or what is the
max physical CPU id (If CPU #1 could be not present).

> > +             struct device_node *cache_node __free(device_node) = of_find_next_cache_node(cpu);
> > +             u64 id = of_get_cpu_hwid(cpu, 0);
> > +
> > +             if (FIELD_GET(GENMASK_ULL(63, 32), id)) {
> > +                     of_node_put(cpu);
> > +                     return;
> > +             }
> > +             while (1) {
>
> for_each_of_cache_node_scoped() perhaps?  With the find already defined this would end
> up something like the following.  Modeled on for_each_child_of_node_scoped.

That seems like an invitation for someone to parse the cache nodes
themselves rather than use cacheinfo. Plus, there are multiple ways we
could iterate over cache nodes. Is it just ones associated with a CPU
or all cache nodes or all cache nodes at a level?

That being said, I do find the current loop a bit odd with setting
'prev' pointer which is then never explicitly used. We're still having
to worry about refcounting, but handling it in a less obvious way.

>         #define for_each_of_cache_node_scoped(cpu, cache) \
>                 for (struct device_node *cache __free(device_node) = \
>                      of_find_next_cache_node(cpu); cache != NULL; \
>                      cache = of_find_next_cache_node(cache))
>
>         for_each_of_cpu_node_scoped(cpu) {
>                 u64 id = of_get_cpu_hwid(cpu, 0);
>
>                 if (FIELD_GET(GENMASK_ULL(63, 32), id))
>                         return;
>                 for_each_of_cache_node_scoped(cpu, cache_node) {
>                         if (cache_node == np) {
>                                 min_id = min(min_id, id);
>                                 break;
>                         }
>                 }
>         }
>
> > +                     if (!cache_node)
> > +                             break;
> > +                     if (cache_node == np && id < min_id) {
>
> Why do you carry on if id >= min_id?  Id isn't changing.  For that
> matter why not do this check before iterating the caches at all?

You're right, no need. There's no need to handle the id in the loop at
all, we just need to match the cache node. So perhaps just a helper:

static bool match_cache_node(struct device_node *cpu, const struct
device_node *cache_node)
{
  for (struct device_node *cache __free(device_node) =
        of_find_next_cache_node(cpu); cache != NULL;
        cache = of_find_next_cache_node(cache)) {
    if (cache == cache_node)
      return true;
  }
  return false;
}

And then the cpu loop would have:

if (match_cache_node(cpu, cache_node))
  min_id = min(min_id, id);

Rob