linux-kernel - Re: [PATCH 1/3] soc cache: L3 cache driver for HiSilicon SoC

Message-ID: <2a42c4d7-c453-45a7-b569-89a3a2ee7246@huawei.com>
Date: Fri, 6 Feb 2026 17:54:55 +0800
From: wangyushan <wangyushan12@...wei.com>
To: Jonathan Cameron <jonathan.cameron@...wei.com>, Linus Walleij
	<linusw@...nel.org>
CC: <alexandre.belloni@...tlin.com>, <arnd@...db.de>, <fustini@...nel.org>,
	<krzk@...nel.org>, <linus.walleij@...aro.org>, <will@...nel.org>,
	<linux-arm-kernel@...ts.infradead.org>, <linux-kernel@...r.kernel.org>,
	<fanghao11@...wei.com>, <linuxarm@...wei.com>, <liuyonglong@...wei.com>,
	<prime.zeng@...ilicon.com>, <wangzhou1@...ilicon.com>,
	<xuwei5@...ilicon.com>, <linux-mm@...r.kernel.org>, SeongJae Park
	<sj@...nel.org>, <reinette.chatre@...el.com>, <james.morse@....com>, "Zeng
 Heng" <zengheng4@...wei.com>, <ben.horgan@....com>, Tony Luck
	<tony.luck@...el.com>, Dave Martin <Dave.Martin@....com>, Babu Moger
	<babu.moger@....com>, Yushan Wang <wangyushan12@...wei.com>
Subject: Re: [PATCH 1/3] soc cache: L3 cache driver for HiSilicon SoC


On 2/5/2026 6:18 PM, Jonathan Cameron wrote:
> On Thu, 5 Feb 2026 10:12:33 +0100
> Linus Walleij <linusw@...nel.org> wrote:
>
>> But does the developer know whether that hard kernel is the most
>> important one, taking into account all other processes running on
>> the system, and what happens if several processes say they have
>> such hard kernels? Who will arbitrate? That is usually the
>> kernel's job.
>
> Take the closest example to this, which is resctrl (MPAM on arm).
> This actually has a feature that smells a bit like this:
> cache pseudo-locking.
>
> https://docs.kernel.org/filesystems/resctrl.html#cache-pseudo-locking
>
> My understanding is that the semantics of that don't align perfectly
> with what we have here.  Yushan can you add more on why we didn't
> try to fit into that scheme?  Other than the obvious bit that more
> general upstream support for the arch definitions of MPAM is a work in
> progress and fitting vendor specific features on top will be tricky
> for a while at least.  The hardware here is also independent of the
> MPAM support.

Intel cache pseudo-locking relies on the IA32_PQR_ASSOC MSR. According
to [1], that register holds the necessary information for processes that
have acquired cache pseudo-locks, but arm64 has no equivalent register.

[1]: https://www.intel.com/content/www/us/en/developer/articles/technical/cache-allocation-technology-usage-models.html

>
> Resctrl puts the control on resource allocation into the hands of
> userspace (in that case via cgroups etc as it's process level controls).
> The cache lockdown is a bit weird because you have to go through a
> dance of creating a temporary setup, demand fetching the lines into
> cache and then relying on various operations not occurring that might
> push them out again.
>
> Resctrl provides many footguns and is (I believe) used by administrators
> who are very careful in how they use it.  Note that there are some guards
> in this new code to only allow locking a portion of the L3. We also rely
> somewhat on the uarch and cache design to ensure it is safe to do this
> type of locking (other than reducing perf of other tasks).
> I'm dancing around uarch details here that I would need to go seek
> agreement to share more on.
>
>>
>>> I haven't yet come up with any plausible scheme by which the MM
>>> subsystem could do this.
>>
>> I find it kind of worrying if userspace knows which lines are most
>> performance-critical but the kernel MM subsystem does not.
>>
>> That strongly indicates that if only userspace knows that, then
>> madvise() is the way to go. The MM might need and use this
>> information for other reasons than just locking down lines in
>> the L3 cache.
>
> I agree that something like madvise() may well be more suitable.
> We do need paths to know how many regions are left etc though so
> it will need a few other bits of interface.
>
> I'm also not sure what appetite there will be for an madvise()
> for something that today we have no idea if anyone else actually
> has hardware for.  If people do, then please shout and we can
> look at how something like this can be generalized.

Currently madvise() "only operates on whole pages", so it may not
fit well with a change in granularity from pages to cache lines.
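
To illustrate the granularity point, here is a minimal userspace sketch.
It assumes Linux, where madvise(2) documents that the start address must
be page-aligned and the length is rounded up to a multiple of the page
size. The helper name advise_subrange() is made up for this example, and
MADV_NORMAL is used only because it is harmless, not because it has
anything to do with cache locking.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map one anonymous page, then call madvise() on a
 * range that starts `offset` bytes into it and is `len` bytes long.
 * Returns 0 on success, otherwise the errno value that was reported.
 */
static int advise_subrange(size_t offset, size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	char *base;
	int ret = 0;

	if (page <= 0 || offset + len > (size_t)page)
		return EINVAL;

	base = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return errno;

	/* A start address that is not page-aligned is rejected with EINVAL. */
	if (madvise(base + offset, len, MADV_NORMAL) != 0)
		ret = errno;

	munmap(base, (size_t)page);
	return ret;
}
```

Calling advise_subrange(64, 64), i.e. asking for the second 64-byte
line of the page, is expected to fail with EINVAL, while
advise_subrange(0, 64) succeeds but is silently widened to the whole
page. Neither outcome expresses "just this cache line".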

The cache size available for locking may be far smaller than the ranges
madvise() can handle. madvise() could always speculatively attempt a
cache lock when appropriate and fall back to its original path if
refused, but that is a hack that needs deeper discussion.
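
As a rough sketch of that speculative "try, then fall back" idea, and
nothing more: every name below (struct l3_lock_budget, try_l3_lock(),
advise_hot_range()) is hypothetical and exists only for this example.
It is not code from this series and not an existing kernel interface.
It just models a small lockable budget that can refuse a request.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of the limited lockable portion of the L3. */
struct l3_lock_budget {
	size_t capacity;	/* bytes that may be locked in total */
	size_t used;		/* bytes already locked */
};

/* Hypothetical: lock `len` bytes, or refuse with -ENOSPC if over budget. */
static int try_l3_lock(struct l3_lock_budget *b, size_t len)
{
	if (len > b->capacity - b->used)
		return -ENOSPC;
	b->used += len;
	return 0;
}

/*
 * Hypothetical advice path: attempt the lock opportunistically and carry
 * on with the ordinary, unlocked behaviour when the request is refused.
 * Returns 1 if the range was locked, 0 if the caller fell back.
 */
static int advise_hot_range(struct l3_lock_budget *b, size_t len)
{
	if (try_l3_lock(b, len) == 0)
		return 1;
	return 0;	/* refused: advice is best-effort, not an error */
}
```

With a 1 MiB budget, a 512 KiB request would be locked and a following
1 MiB request would fall back. The caller cannot tell from the advice
call alone which of the two happened unless the interface reports it,
which is part of why this needs more discussion.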

I think resctrl is more suitable for this, as it serves the same
purpose as MPAM etc.: preserving the QoS of a task. It also achieves
that in the same way, by tweaking hardware capabilities.

Thanks,
Yushan