Message-ID: <4c5eaf8f-2433-4971-b5d0-4f35acb2820e@huawei.com>
Date: Fri, 6 Feb 2026 18:07:51 +0800
From: wangyushan <wangyushan12@...wei.com>
To: Jonathan Cameron <jonathan.cameron@...wei.com>, Linus Walleij
	<linusw@...nel.org>
CC: <alexandre.belloni@...tlin.com>, <arnd@...db.de>, <fustini@...nel.org>,
	<krzk@...nel.org>, <linus.walleij@...aro.org>, <will@...nel.org>,
	<linux-arm-kernel@...ts.infradead.org>, <linux-kernel@...r.kernel.org>,
	<fanghao11@...wei.com>, <linuxarm@...wei.com>, <liuyonglong@...wei.com>,
	<prime.zeng@...ilicon.com>, <wangzhou1@...ilicon.com>,
	<xuwei5@...ilicon.com>, <linux-mm@...r.kernel.org>, SeongJae Park
	<sj@...nel.org>, Yushan Wang <wangyushan12@...wei.com>
Subject: Re: [PATCH 1/3] soc cache: L3 cache driver for HiSilicon SoC

On 2/4/2026 9:40 PM, Jonathan Cameron wrote:
> On Wed, 4 Feb 2026 01:10:01 +0100
> Linus Walleij <linusw@...nel.org> wrote:
>
>> Shouldn't the MM subsystem be in charge of determining, locking
>> down and freeing up hot regions in L3 cache?
>>
>> This looks more like userspace is going to determine that but
>> how exactly? By running DAMON? Then it's better to keep the
>> whole mechanism in the kernel where it belongs and let the
>> MM subsystem adapt locked L3 cache to the usage patterns.
> I haven't yet come up with any plausible scheme by which the MM
> subsystem could do this.
>
> I think what we need here Yushan, is more detail on end to end
> use cases for this.  Some examples etc as clearer motivation.
>

Hi,

Let me try to explain the use case here.

The idea is similar to the one described in this paper:
https://www.cl.cam.ac.uk/~rnw24/papers/201708-sigcomm-diskcryptnet.pdf

Suppose we have data on an SSD that needs to be transferred over the
network.  We have technologies like DDIO and IO stash to make the data
flow through the L3 cache instead of DDR, so that it is not limited by
DDR bandwidth.

But if something is to be done to the data beyond merely copying it,
and cores need to participate, we'd like the data to climb a bit higher
up the memory hierarchy and stay there until processing is done.  That
is, the right amount of data is fetched into the L3 cache, consumed
just in time, and the L3 is then freed for the next batch.  It is more
of a userspace-defined pipeline that utilizes a capability provided by
the kernel, where cache locks are allocated and freed quickly as the
batches go by; a rough sketch of such a pipeline follows below.
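
To make the flow concrete, here is a minimal userspace sketch.  The
device node, ioctl numbers and request struct are purely illustrative
placeholders, not the interface proposed by this series; the point is
only the lock / fill / consume / unlock rhythm per batch.

#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

struct l3_lock_req {			/* hypothetical request layout */
	uint64_t addr;			/* start of the batch buffer */
	uint64_t size;			/* bytes to keep resident in L3 */
};

/* Hypothetical ioctls, for illustration only. */
#define L3_LOCK_ALLOC	_IOW('L', 0, struct l3_lock_req)
#define L3_LOCK_FREE	_IOW('L', 1, struct l3_lock_req)

static void consume_batch(char *buf, uint64_t len)
{
	/* Whatever the cores need to do with this batch of data. */
	(void)buf;
	(void)len;
}

int main(void)
{
	static char buf[1 << 20];		/* one batch worth of data */
	struct l3_lock_req req = {
		.addr = (uint64_t)(uintptr_t)buf,
		.size = sizeof(buf),
	};
	int fd = open("/dev/l3_cache_lock", O_RDWR);	/* hypothetical node */
	int batch;

	if (fd < 0)
		return 1;

	for (batch = 0; batch < 16; batch++) {
		/* Pin the buffer in L3 so incoming I/O lands and stays there. */
		if (ioctl(fd, L3_LOCK_ALLOC, &req) < 0)
			break;

		/* Fill the buffer (e.g. read() from the SSD) and process it. */
		consume_batch(buf, sizeof(buf));

		/* Drop the lock so the next batch can reuse the same L3 space. */
		ioctl(fd, L3_LOCK_FREE, &req);
	}

	close(fd);
	return 0;
}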

In the above use case, C2C latency is accepted in exchange for avoiding
DDR latency; precisely which L3 cache stores the data does not matter.
(For that part, including a steering tag as a hint to choose the
correct L3 might be a smarter way, like AMD's SDCIAE.)

Memory management is, in many ways, independent of architecture and
vendor.  We might not want to take hardware-specific features into
account when the kernel decides, say, whether or not to swap a page,
but we can still steer hardware resources toward a process, like
resctrl does (a small illustration of that analogy is below).
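
For comparison, this is roughly what the resctrl analogy looks like
today on CAT-capable x86 parts.  The group name and capacity mask are
made up for illustration, and /sys/fs/resctrl is assumed to be mounted
already:

#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
	FILE *f;

	/* Create a resource group for the latency-sensitive process. */
	mkdir("/sys/fs/resctrl/net_pipeline", 0755);

	/* Reserve a slice of L3 ways on cache domain 0 for this group. */
	f = fopen("/sys/fs/resctrl/net_pipeline/schemata", "w");
	if (!f)
		return 1;
	fprintf(f, "L3:0=00ff\n");
	fclose(f);

	/* Move the current task into the group. */
	f = fopen("/sys/fs/resctrl/net_pipeline/tasks", "w");
	if (!f)
		return 1;
	fprintf(f, "%d\n", getpid());
	fclose(f);

	return 0;
}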

Thanks,
Yushan

