Message-Id: <55e3766e-e292-4136-9e8f-2098ffd53b5d@app.fastmail.com>
Date: Fri, 06 Feb 2026 11:44:04 +0100
From: "Arnd Bergmann" <arnd@...db.de>
To: "Yushan Wang" <wangyushan12@...wei.com>,
"Jonathan Cameron" <jonathan.cameron@...wei.com>,
"Linus Walleij" <linusw@...nel.org>
Cc: "Alexandre Belloni" <alexandre.belloni@...tlin.com>,
"Drew Fustini" <fustini@...nel.org>, "Krzysztof Kozlowski" <krzk@...nel.org>,
"Linus Walleij" <linus.walleij@...aro.org>, "Will Deacon" <will@...nel.org>,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
fanghao11@...wei.com, linuxarm@...wei.com, liuyonglong@...wei.com,
prime.zeng@...ilicon.com, "Zhou Wang" <wangzhou1@...ilicon.com>,
"Wei Xu" <xuwei5@...ilicon.com>, linux-mm@...r.kernel.org,
"SeongJae Park" <sj@...nel.org>
Subject: Re: [PATCH 1/3] soc cache: L3 cache driver for HiSilicon SoC
On Fri, Feb 6, 2026, at 11:07, wangyushan wrote:
>
> Let me try to explain the use case here.
>
> The idea is similar to this article:
> https://www.cl.cam.ac.uk/~rnw24/papers/201708-sigcomm-diskcryptnet.pdf
>
> Suppose we have data on an SSD that needs to be transferred over the network.
> We have technologies like DDIO and IO stash to make data flow through
> the L3 cache instead of DDR, to avoid being limited by DDR bandwidth.
>
> But if something is to be done to the data instead of merely copying it,
> and cores need to participate, we'd like the data to climb a bit
> higher up the memory hierarchy and stay there until data
> processing is done. That is, the correct amount of data is fetched into
> the L3 cache and consumed just in time, then the L3 is freed for the next batch.
> It is more of a userspace-defined pipeline that uses a capability
> provided by the kernel, where cache locks are allocated and freed quickly
> with each batch.
>
> In the above use case, C2C latency is chosen to avoid DDR latency; precisely
> which L3 cache stores the data does not matter. (For this part, maybe
> including a steering tag as a hint to choose the correct L3 is a smarter
> way, like AMD SDCIAE.)
>
> Memory management is, in many ways, independent of architecture and
> vendor. We might not want to take hardware-specific features into
> account when the kernel decides, say, whether to swap a page or not,
> but we can control the hardware resource to favor a particular process,
> like resctrl.
Ah, so if the main purpose here is to access the memory from
devices, I wonder if this should be structured as a dma-buf
driver. This would still allow you to mmap() a character
device, but in addition allow passing the file descriptor
to driver interfaces that take a dmabuf instead of a user
memory pointer.
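
To illustrate the userspace side of that idea, here is a rough sketch of
CPU access to a buffer through a dma-buf file descriptor, using only the
generic dma-buf UAPI from <linux/dma-buf.h>. The helper name
cpu_fill_dmabuf() is made up for this example, and how the descriptor is
obtained from the cache driver is deliberately left out, since no such
interface has been defined in this thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/dma-buf.h>

/*
 * Hypothetical helper: fill a dma-buf from the CPU.
 * dmabuf_fd is assumed to come from some exporter (not shown here).
 * Returns 0 on success, -1 on any failure.
 */
int cpu_fill_dmabuf(int dmabuf_fd, size_t len, uint8_t value)
{
	struct dma_buf_sync sync = { 0 };
	uint8_t *p;
	int ret = 0;

	/* Map the buffer; this fails if the fd is invalid or not mappable. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, dmabuf_fd, 0);
	if (p == MAP_FAILED)
		return -1;

	/* Tell the exporter that a CPU read/write access is starting. */
	sync.flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_RW;
	if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync) < 0) {
		munmap(p, len);
		return -1;
	}

	memset(p, value, len);

	/* Tell the exporter that the CPU access has finished. */
	sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_RW;
	if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync) < 0)
		ret = -1;

	munmap(p, len);
	return ret;
}
```

The same descriptor could then be handed to any driver interface that
imports dma-bufs, rather than giving that driver a user memory pointer.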
Arnd