linux-kernel - Re: [PATCH] arm64: cacheinfo: Report cache sets, ways, and line size

Message-ID: <96b21aa0-97d8-415d-9fbf-529b0434b25f@linux.dev>
Date: Mon, 19 May 2025 16:50:57 -0400
From: Sean Anderson <sean.anderson@...ux.dev>
To: Will Deacon <will@...nel.org>
Cc: Mark Rutland <mark.rutland@....com>, Sudeep Holla <sudeep.holla@....com>,
 Catalin Marinas <catalin.marinas@....com>,
 linux-arm-kernel@...ts.infradead.org, Radu Rendec <rrendec@...hat.com>,
 Thomas Weißschuh <thomas.weissschuh@...utronix.de>,
 Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] arm64: cacheinfo: Report cache sets, ways, and line size

On 5/14/25 08:38, Will Deacon wrote:
> On Mon, May 12, 2025 at 11:56:28AM -0400, Sean Anderson wrote:
>> On 5/12/25 11:36, Mark Rutland wrote:
>> > On Mon, May 12, 2025 at 11:28:36AM -0400, Sean Anderson wrote:
>> >> On 5/10/25 03:04, Sudeep Holla wrote:
>> >> > On Fri, May 09, 2025 at 07:37:35PM -0400, Sean Anderson wrote:
>> >> >> Cache geometry is exposed through the Cache Size ID register. There is
>> >> >> one register for each cache, and they are selected through the Cache
>> >> >> Size Selection register. If FEAT_CCIDX is implemented, the layout of
>> >> >> CCSIDR changes to allow a larger number of sets and ways.
>> >> >> 
>> >> > 
>> >> > Please refer to
>> >> > commit a8d4636f96ad ("arm64: cacheinfo: Remove CCSIDR-based cache information probing")
>> >> > 
>> >> 
>> >> | The CCSIDR_EL1.{NumSets,Associativity,LineSize} fields are only for use
>> >> | in conjunction with set/way cache maintenance and are not guaranteed to
>> >> | represent the actual microarchitectural features of a design.
>> >> | 
>> >> | The architecture explicitly states:
>> >> | 
>> >> | | You cannot make any inference about the actual sizes of caches based
>> >> | | on these parameters.
>> >> 
>> >> However, on many cores (A53, A72, and surely others that I haven't
>> >> checked) these *do* expose the actual microarchitectural features of the
>> >> design. Maybe a whitelist would be suitable.
>> > 
>> > Then we have to maintain a whitelist forever,
>> 
>> There's no maintenance involved. The silicon is already fabbed, so it's
>> not like it's going to change any time soon.
> 
> The list is going to change though and it introduces divergent behaviour
> that I'd much rather avoid. The mechanism is there for firmware to
> provide the information and it's hardly onerous compared with other
> (critical) things that it's tasked to provide such as interrupt routing
> and GPIOs.

The mechanism is also there for us to discover the cache sizes without
requiring any devicetree involvement.
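
As a rough illustration of that mechanism, here is a minimal sketch of
decoding a raw CCSIDR_EL1 value into sets, ways, and line size. The
function name and the sample value are hypothetical; the field positions
follow the architectural layout with and without FEAT_CCIDX. The register
itself can only be read at EL1, so this only decodes a value obtained
elsewhere, and per the architecture text quoted above the result is not
guaranteed to match the real cache.

```python
def decode_ccsidr(value, ccidx=False):
    """Decode a raw CCSIDR_EL1 value (hypothetical helper, not kernel code)."""
    # LineSize is bits [2:0] in both layouts: log2(bytes per line) - 4.
    line_size = 1 << ((value & 0x7) + 4)
    if ccidx:
        # FEAT_CCIDX layout: Associativity [23:3], NumSets [55:32].
        ways = ((value >> 3) & 0x1FFFFF) + 1
        sets = ((value >> 32) & 0xFFFFFF) + 1
    else:
        # Original layout: Associativity [12:3], NumSets [27:13].
        ways = ((value >> 3) & 0x3FF) + 1
        sets = ((value >> 13) & 0x7FFF) + 1
    return {
        "line_size": line_size,
        "ways": ways,
        "sets": sets,
        "size": line_size * ways * sets,
    }


# Sample value encoding 128 sets, 4 ways, 64-byte lines (32 KiB total).
print(decode_ccsidr(0xFE01A))
# {'line_size': 64, 'ways': 4, 'sets': 128, 'size': 32768}
```

Both fields are stored minus one, hence the "+ 1"; bits above the
NumSets field are masked off in the non-CCIDX layout.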

>> > and running an old/distro
>> > kernel on new HW won't give you useful values unless you provide
>> > equivalent values in DT, in which case the kernel doesn't need to read
>> > the registers anyway.
>> 
>> Conversely (and far more likely IMO), running an old/distro devicetree
>> on a new kernel won't give you useful values. Bootloaders tend not to be
>> updated very often (if ever), whereas kernels can (ideally) be
>> updated without changing userspace.
> 
> Updating the device-tree shouldn't require updating the bootloader.

Very often the release cycle for the devicetree is tied to the
bootloader's, so the devicetree may not be updated very often either.

>> > The architecture explicitly tells us not to use the values in this way,
>> > and it's possible to place the values into DT when you know they're
>> > meaningful.
>> 
>> Well, maybe we can just use these registers for the hundreds of existing
>> devicetrees that lack values.
> 
> The fact that the device-tree files tend to omit this information is
> quite telling as to how useful it actually is. What would you like to
> use it for?

Say you have a program that works on batches of data. You may want to
adjust the size of the batch to fit in the L1 (or L2) cache. One way to
do this is to benchmark various batch sizes and select an appropriate
size. But it would be more convenient for the user if the program picked
a batch size automatically, without having to run a benchmark, just by
reading the cache size from sysfs.
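
A minimal sketch of that use case, assuming the usual cacheinfo layout
under /sys/devices/system/cpu/cpuN/cache/indexM/ with "level", "type",
and "size" files. The helper names, the 50% cache fraction, and the
fallback batch size are all hypothetical choices for illustration. When
the "size" file is missing (which is the situation this thread is
about), the code falls back to a fixed default.

```python
from pathlib import Path


def parse_size(text):
    """Convert a sysfs cache size string such as '32K' to bytes."""
    text = text.strip()
    units = {"K": 1024, "M": 1024**2, "G": 1024**3}
    if text and text[-1].upper() in units:
        return int(text[:-1]) * units[text[-1].upper()]
    return int(text)


def l1d_cache_size(cpu=0, root="/sys/devices/system/cpu"):
    """Return the L1 data (or unified) cache size in bytes, or None."""
    base = Path(root) / f"cpu{cpu}" / "cache"
    for index in sorted(base.glob("index*")):
        try:
            level = (index / "level").read_text().strip()
            ctype = (index / "type").read_text().strip()
            if level == "1" and ctype in ("Data", "Unified"):
                return parse_size((index / "size").read_text())
        except (OSError, ValueError):
            # Attribute missing or unparsable: try the next index.
            continue
    return None


def pick_batch_size(item_bytes, cache_bytes, fallback=1024, fraction=0.5):
    """Number of items that fit in a fraction of the cache."""
    if not cache_bytes:
        return fallback
    return max(1, int(cache_bytes * fraction) // item_bytes)


if __name__ == "__main__":
    size = l1d_cache_size()
    print("L1D size:", size, "batch:", pick_batch_size(64, size))
```

Leaving half the cache free is only a guess at how much room other data
needs; a real program would likely still want a benchmark to tune it.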

> Short of having an immediate functional or performance benefit by
> exposing this stuff, I wonder if we could add a kselftest for it
> instead?

I'm not sure how well that will improve adoption. Do people even run
kselftest during board bringup?

--Sean