Message-ID: <e21d05c8-771a-4a2b-8573-76f7d5668a3d@arm.com>
Date: Fri, 27 Jun 2025 17:38:55 +0100
From: James Morse <james.morse@....com>
To: Rob Herring <robh@...nel.org>,
 Jonathan Cameron <Jonathan.Cameron@...wei.com>
Cc: linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
 Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
 "Rafael J . Wysocki" <rafael@...nel.org>, sudeep.holla@....com,
 Ben Horgan <ben.horgan@....com>
Subject: Re: [PATCH 1/5] cacheinfo: Set cache 'id' based on DT data

Hi Jonathan, Rob,

On 23/06/2025 15:18, Rob Herring wrote:
> On Tue, Jun 17, 2025 at 11:03 AM Jonathan Cameron
> <Jonathan.Cameron@...wei.com> wrote:
>> On Fri, 13 Jun 2025 13:03:52 +0000
>> James Morse <james.morse@....com> wrote:
>>> From: Rob Herring <robh@...nel.org>
>>>
>>> Use the minimum CPU h/w id of the CPUs associated with the cache for the
>>> cache 'id'. This will provide a stable id value for a given system. As
>>> we need to check all possible CPUs, we can't use the shared_cpu_map,
>>> which contains just the online CPUs. As there's no cache-to-CPUs mapping
>>> in DT, we have to walk all CPU nodes and then walk their cache levels.
>>
>> Is it ok for these to match for different levels?  I've no idea for
>> these use cases.

> Yes. The 'id' is per level, not globally unique.

Documented here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/resctrl.rst#n482

The values are only required to be unique per level, but they are also sparse,
so they may end up unique system-wide. That is what happens on arm64 ACPI
platforms, where the PPTT-derived value is exposed directly.
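For example (made-up h/w ids): with CPU h/w ids {0,1} in one cluster and
{256,257} in another, the minimum-h/w-id scheme in this patch gives the two
L2s ids 0 and 256 (sparse, rather than 0 and 1), while an L3 shared by all
four CPUs also gets id 0, so a value can repeat across levels but not within
one.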


>>> The cache_id exposed to user-space has historically been 32 bits, and
>>> it is too late to change that. Give up on assigning cache ids if a CPU
>>> h/w id greater than 32 bits is found.

>> Mainly a couple of questions for Rob on the fun of scoped cleanup being
>> used for some of the iterators, in a similar fashion to what's already
>> done for looping over child nodes etc.

This is mostly over my head!


>>> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
>>> index cf0d455209d7..9888d87840a2 100644
>>> --- a/drivers/base/cacheinfo.c
>>> +++ b/drivers/base/cacheinfo.c
>>> @@ -183,6 +184,37 @@ static bool cache_node_is_unified(struct cacheinfo *this_leaf,
>>>       return of_property_read_bool(np, "cache-unified");
>>>  }
>>>
>>> +static void cache_of_set_id(struct cacheinfo *this_leaf, struct device_node *np)
>>> +{
>>> +     struct device_node *cpu;
>>> +     u32 min_id = ~0;
>>> +
>>> +     for_each_of_cpu_node(cpu) {
>>
>> Rob, is it worth a scoped variant of this one?  I've come across
>> this a few times recently and it irritates me, but I didn't feel
>> there were necessarily enough cases to bother.  With one more,
>> maybe it is time to do it (maybe 10+ from a quick look).
> 
> My question on all of these (though more so for drivers), is why are
> we parsing CPU nodes again? Ideally, we'd parse the CPU and cache
> nodes only once and the kernel would provide the necessary info.
> 
> Take drivers/clk/mvebu/ap-cpu-clk.c for example. The code there really
> just needs to know whether there are 2 or 4 possible CPUs, or what the
> max physical CPU id is (if CPU #1 might not be present).
> 
>>> +             struct device_node *cache_node __free(device_node) = of_find_next_cache_node(cpu);
>>> +             u64 id = of_get_cpu_hwid(cpu, 0);
>>> +
>>> +             if (FIELD_GET(GENMASK_ULL(63, 32), id)) {
>>> +                     of_node_put(cpu);
>>> +                     return;
>>> +             }
>>> +             while (1) {
>>
>> for_each_of_cache_node_scoped() perhaps?  With the find helper already
>> defined, this would end up something like the following, modeled on
>> for_each_child_of_node_scoped().
> 
> That seems like an invitation for someone to parse the cache nodes
> themselves rather than use cacheinfo. Plus, there are multiple ways we
> could iterate over cache nodes. Is it just ones associated with a CPU
> or all cache nodes or all cache nodes at a level?
> 
> That being said, I do find the current loop a bit odd: it sets a 'prev'
> pointer which is then never explicitly used. We're still having to worry
> about refcounting, but it's handled in a less obvious way.
> 
>>         #define for_each_of_cache_node_scoped(cpu, cache) \
>>                 for (struct device_node *cache __free(device_node) = \
>>                      of_find_next_cache_node(cpu); cache != NULL; \
>>                      cache = of_find_next_cache_node(cache))
>>
>>         for_each_of_cpu_node_scoped(cpu) {
>>                 u64 id = of_get_cpu_hwid(cpu, 0);
>>
>>                 if (FIELD_GET(GENMASK_ULL(63, 32), id))
>>                         return;
>>                 for_each_of_cache_node_scoped(cpu, cache_node) {
>>                         if (cache_node == np) {
>>                                 min_id = min(min_id, id);
>>                                 break;
>>                         }
>>                 }
>>         }
>>
>>> +                     if (!cache_node)
>>> +                             break;
>>> +                     if (cache_node == np && id < min_id) {
>>
>> Why do you carry on if id >= min_id?  'id' isn't changing.  For that
>> matter, why not do this check before iterating over the caches at all?
> 
> You're right, no need. There's no need to handle the id in the loop at
> all, we just need to match the cache node. So perhaps just a helper:
> 
> static bool match_cache_node(struct device_node *cpu,
>                              const struct device_node *cache_node)
> {
>   for (struct device_node *cache __free(device_node) =
>         of_find_next_cache_node(cpu); cache != NULL;
>         cache = of_find_next_cache_node(cache)) {
>     if (cache == cache_node)
>       return true;
>   }
>   return false;
> }
> 
> And then the cpu loop would have:
> 
> if (match_cache_node(cpu, cache_node))
>   min_id = min(min_id, id);
Done,
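
For reference, folding that in gives something like the below (just a sketch:
min() wants matching types here so the h/w id gets cast, and the assignment of
min_id to this_leaf->id at the end of the function is outside the hunk quoted
above):

static void cache_of_set_id(struct cacheinfo *this_leaf, struct device_node *np)
{
        struct device_node *cpu;
        u32 min_id = ~0;

        for_each_of_cpu_node(cpu) {
                u64 id = of_get_cpu_hwid(cpu, 0);

                /* A CPU h/w id wider than 32 bits can't be exposed as a cache id */
                if (FIELD_GET(GENMASK_ULL(63, 32), id)) {
                        of_node_put(cpu);
                        return;
                }

                /* match_cache_node() is the helper suggested above */
                if (match_cache_node(cpu, np))
                        min_id = min(min_id, (u32)id);
        }

        /* ... then min_id becomes this_leaf->id, as in the rest of the patch */
}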


Thanks!

James
