Message-ID: <00c67cf0-2bf3-4eaf-b200-ffe00d91593b@gmail.com>
Date: Mon, 10 Jun 2024 20:20:08 +0100
From: Pavel Begunkov <asml.silence@...il.com>
To: David Ahern <dsahern@...nel.org>, Jason Gunthorpe <jgg@...pe.ca>
Cc: David Wei <dw@...idwei.uk>, Mina Almasry <almasrymina@...gle.com>,
Christoph Hellwig <hch@...radead.org>, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
linux-alpha@...r.kernel.org, linux-mips@...r.kernel.org,
linux-parisc@...r.kernel.org, sparclinux@...r.kernel.org,
linux-trace-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
bpf@...r.kernel.org, linux-kselftest@...r.kernel.org,
linux-media@...r.kernel.org, dri-devel@...ts.freedesktop.org,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Donald Hunter <donald.hunter@...il.com>, Jonathan Corbet <corbet@....net>,
Richard Henderson <richard.henderson@...aro.org>,
Ivan Kokshaysky <ink@...assic.park.msu.ru>, Matt Turner
<mattst88@...il.com>, Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
"James E.J. Bottomley" <James.Bottomley@...senpartnership.com>,
Helge Deller <deller@....de>, Andreas Larsson <andreas@...sler.com>,
Jesper Dangaard Brouer <hawk@...nel.org>,
Ilias Apalodimas <ilias.apalodimas@...aro.org>,
Steven Rostedt <rostedt@...dmis.org>, Masami Hiramatsu
<mhiramat@...nel.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Arnd Bergmann <arnd@...db.de>, Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <martin.lau@...ux.dev>, Eduard Zingerman
<eddyz87@...il.com>, Song Liu <song@...nel.org>,
Yonghong Song <yonghong.song@...ux.dev>,
John Fastabend <john.fastabend@...il.com>, KP Singh <kpsingh@...nel.org>,
Stanislav Fomichev <sdf@...gle.com>, Hao Luo <haoluo@...gle.com>,
Jiri Olsa <jolsa@...nel.org>, Steffen Klassert
<steffen.klassert@...unet.com>, Herbert Xu <herbert@...dor.apana.org.au>,
Willem de Bruijn <willemdebruijn.kernel@...il.com>,
Shuah Khan <shuah@...nel.org>, Sumit Semwal <sumit.semwal@...aro.org>,
Christian König <christian.koenig@....com>,
Yunsheng Lin <linyunsheng@...wei.com>, Shailend Chand <shailend@...gle.com>,
Harshitha Ramamurthy <hramamurthy@...gle.com>,
Shakeel Butt <shakeel.butt@...ux.dev>, Jeroen de Borst
<jeroendb@...gle.com>, Praveen Kaligineedi <pkaligineedi@...gle.com>
Subject: Re: [PATCH net-next v10 02/14] net: page_pool: create hooks for
custom page providers
On 6/10/24 16:16, David Ahern wrote:
> On 6/10/24 6:16 AM, Jason Gunthorpe wrote:
>> On Mon, Jun 10, 2024 at 02:07:01AM +0100, Pavel Begunkov wrote:
>>> On 6/10/24 01:37, David Wei wrote:
>>>> On 2024-06-07 17:52, Jason Gunthorpe wrote:
>>>>> IMHO it seems to compose poorly if you can only use the io_uring
>>>>> lifecycle model with io_uring registered memory, and not with DMABUF
>>>>> memory registered through Mina's mechanism.
>>>>
>>>> By this, do you mean io_uring must be exclusively used to use this
>>>> feature?
>>>>
>>>> And you'd rather see the two decoupled, so userspace can register w/ say
>>>> dmabuf then pass it to io_uring?
>>>
>>> Personally, I have no clue what Jason means. You can just as
>>> well say that it's poorly composable that write(2) to a disk
>>> cannot post a completion into a XDP ring, or a netlink socket,
>>> or io_uring's main completion queue, or name any other API.
>>
>> There is no reason you shouldn't be able to use your fast io_uring
>> completion and lifecycle flow with DMABUF backed memory. Those are not
>> wildly different things and there is good reason they should work
>> together.

Let's not mix up devmem TCP and dmabuf. As I see it, your question
concerned the latter specifically: "... DMABUF memory registered
through Mina's mechanism". io_uring's zcrx can trivially get dmabuf
support in the future; as mentioned, that is mostly a matter of the
setup side. The ABI, the buffer workflow and some other details are a
separate issue, and I don't see how further integration beyond what
we're already sharing would be beneficial. On the contrary, it would
complicate things.

>> Pretending they are totally different just because two different
>> people wrote them is a very siloed view.

io_uring zcrx and devmem? Nobody is saying they are totally
different. If anything, they are _very_ similar approaches with
different APIs, which is why we already use common infra.

>>> The devmem TCP callback can implement it in a way feasible to
>>> the project, but it cannot directly post events to an unrelated
>>> API like io_uring. And devmem attaches buffers to a socket,
>>> for which a ring for returning buffers might even be a nuisance.
>>
>> If you can't compose your io_uring completion mechanism with a DMABUF
>> provided backing store then I think it needs more work.

As per above, that conflates devmem TCP with dmabuf.

> exactly. io_uring, page_pool, dmabuf - all kernel building blocks for
> solutions. This is why I was pushing for Mina's set not to be using the
> name `devmem` - it is but one type of memory and with dmabuf it should
> not matter if it is gpu or host (or something else later on - cxl?).
--
Pavel Begunkov