Message-ID: <CAD++jLkEtpiTxaNB6vfHnbmoV1PPB7W4T_04eQrwk2os_zpfpA@mail.gmail.com>
Date: Thu, 5 Feb 2026 10:12:33 +0100
From: Linus Walleij <linusw@...nel.org>
To: Jonathan Cameron <jonathan.cameron@...wei.com>
Cc: Yushan Wang <wangyushan12@...wei.com>, alexandre.belloni@...tlin.com, arnd@...db.de, 
	fustini@...nel.org, krzk@...nel.org, linus.walleij@...aro.org, 
	will@...nel.org, linux-arm-kernel@...ts.infradead.org, 
	linux-kernel@...r.kernel.org, fanghao11@...wei.com, linuxarm@...wei.com, 
	liuyonglong@...wei.com, prime.zeng@...ilicon.com, wangzhou1@...ilicon.com, 
	xuwei5@...ilicon.com, linux-mm@...r.kernel.org, SeongJae Park <sj@...nel.org>
Subject: Re: [PATCH 1/3] soc cache: L3 cache driver for HiSilicon SoC

Hi Jonathan,

thanks for stepping in, I'm trying to be healthily sceptical here...

What you and others need to do is tell me if I'm being too
critical. But right now it feels like I need some more senior
MM developers to tell me to be a good boy and let this
hack of a patch slip through before I shut up ;)

On Wed, Feb 4, 2026 at 2:40 PM Jonathan Cameron
<jonathan.cameron@...wei.com> wrote:

> > The MM subsystem knows which memory is most cache hot.
> > Especially when you use DAMON DAMOS, which has the sole
> > purpose of executing actions like that. Here is a good YouTube talk:
> > https://www.youtube.com/watch?v=xKJO4kLTHOI
>
> This typically isn't about cache hot.  If it were, the data would
> be in the cache without this. It's about ensuring something that would
> otherwise be unlikely to be there is in the cache.

OK I get it.

> Normally that's a latency-critical region.  In general the kernel
> has no chance of figuring out what those are ahead of time; only
> userspace can know (based on profiling etc.), and that is per workload.
(...)
> The only thing we could do if this was in kernel would be to
> have userspace pass some hints and then let the kernel actually
> kick off the process.
(...)
> and you absolutely need userspace to be able to tell if it
> got what it asked for or not.
(...)
> It's an extreme form of profile-guided optimization (and not
> currently automatic I think?). If we are putting code in this
> locked region, the program has been carefully recompiled / linked
> to group the critical parts so that we can use the minimum number
> of these locked regions. Data is a little simpler.

OK so the argument is "only userspace knows what cache lines
are performance critical, and therefore this info must be passed
from userspace". Do I understand correctly?

What I'm worried about here is that "an extreme form of profile
guided optimization" is a bit handwavy. I would accept it if it
were based on simulation or simply human know-how, such as
a developer putting signal-processing algorithm kernels
there because they know those are going to be the hard kernel
of the process.

But does the developer know whether that hard kernel is the most
important one once all other processes running on the system are
taken into account, and what happens if several processes say they
have such hard kernels? Who will arbitrate? That is usually the
kernel's job.

> I haven't yet come up with any plausible scheme by which the MM
> subsystem could do this.

I find it kind of worrying if userspace knows which lines are most
performance-critical but the kernel MM subsystem does not.

That strongly indicates that if only userspace knows this, then
madvise() is the way to go. The MM might need and use this
information for other reasons than just locking down lines in
the L3 cache.

In my mind:

Userspace madvise -> Linux MM -> arch cache-line lockdown
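
Just to make that flow concrete, here is a rough userspace sketch
of what I have in mind. MADV_L3_PIN is a name I just made up,
nothing like it exists in the UAPI today:

  /* Sketch only: MADV_L3_PIN is an invented advice value */
  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/mman.h>

  #ifndef MADV_L3_PIN
  #define MADV_L3_PIN 25  /* hypothetical, for illustration */
  #endif

  int main(void)
  {
          size_t len = 2 * 1024 * 1024;
          void *buf = NULL;

          /* The latency-critical working set, page-aligned */
          if (posix_memalign(&buf, 4096, len))
                  return 1;

          /*
           * Hint the MM, which arbitrates and may in turn ask
           * the arch to lock the lines down; the kernel is free
           * to refuse and the process must cope with that.
           */
          if (madvise(buf, len, MADV_L3_PIN))
                  perror("madvise(MADV_L3_PIN)");

          /* ... run the hard kernel of the process on buf ... */
          free(buf);
          return 0;
  }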

So the MM needs to take the decision that this indication from
userspace is something that should result in asking the arch
to lock down these cache lines, as well as re-evaluate it if
new processes start sending the same madvise() calls and we
run out of lock-downable cache lines.

L3 lock-downs are a finite resource after all, and they need to be
arbitrated. Just off the top of my head: if several processes ask
for this simultaneously and we run out of lockdownable cache lines,
who wins? First come first served? The process with the highest
nice value or realtime priority? Etc.
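
To make the arbitration question concrete, here is a toy model of
the kind of policy I mean. Everything in it is invented, there are
no real interfaces involved:

  /*
   * Toy arbiter: hand out a fixed L3 lockdown budget to competing
   * requests, best priority first, first come first served within
   * the same priority.
   */
  #include <stdio.h>
  #include <stdlib.h>

  struct l3_req {
          const char *comm;  /* requesting process */
          int prio;          /* lower = more important */
          unsigned int seq;  /* arrival order, FCFS tie-break */
          size_t bytes;      /* requested lockdown size */
  };

  static int cmp(const void *a, const void *b)
  {
          const struct l3_req *ra = a, *rb = b;

          if (ra->prio != rb->prio)
                  return ra->prio - rb->prio;
          return (int)ra->seq - (int)rb->seq;
  }

  int main(void)
  {
          struct l3_req reqs[] = {
                  { "dsp-worker", -10, 0, 512 * 1024 },
                  { "logger",      10, 1, 512 * 1024 },
                  { "net-poll",   -10, 2, 512 * 1024 },
          };
          size_t budget = 1024 * 1024;  /* lockable L3 capacity */
          size_t n = sizeof(reqs) / sizeof(reqs[0]);

          qsort(reqs, n, sizeof(reqs[0]), cmp);

          for (size_t i = 0; i < n; i++) {
                  if (reqs[i].bytes <= budget) {
                          budget -= reqs[i].bytes;
                          printf("%s: granted %zu bytes\n",
                                 reqs[i].comm, reqs[i].bytes);
                  } else {
                          printf("%s: denied, budget exhausted\n",
                                 reqs[i].comm);
                  }
          }
          return 0;
  }

And a real version would of course also have to re-evaluate when
requesters exit or new madvise() calls come in.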

I.e. the kernel MM needs to arbitrate any cache lockdown.

Bypassing the whole MM like this patch does is a hack designed
for one single process that the user "knows" is "importantest"
and will be the only process asking for cache lines to be locked
down.

That isn't a proper abstraction and it does not scale; we can't
do that. Arbitrating a finite resource is exactly the kind of
resource management we expect from the kernel, and the MM might
want to use that information for other things as well.

> I think what we need here, Yushan, is more detail on end-to-end
> use cases for this.  Some examples etc. as clearer motivation.

I agree.

Yours,
Linus Walleij
