Message-ID: <CAHS8izOL_L3fjB1YutDm8xvp1hboyO9_ng0pOVESUqDew9N96w@mail.gmail.com>
Date: Mon, 12 May 2025 12:10:05 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: Byungchul Park <byungchul@...com>, netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, kernel_team@...ynix.com, kuba@...nel.org,
ilias.apalodimas@...aro.org, harry.yoo@...cle.com, hawk@...nel.org,
akpm@...ux-foundation.org, ast@...nel.org, daniel@...earbox.net,
davem@...emloft.net, john.fastabend@...il.com, andrew+netdev@...n.ch,
edumazet@...gle.com, pabeni@...hat.com, vishal.moola@...il.com
Subject: Re: [RFC 19/19] mm, netmem: remove the page pool members in struct page
On Fri, May 9, 2025 at 12:48 PM Matthew Wilcox <willy@...radead.org> wrote:
>
> On Fri, May 09, 2025 at 12:04:37PM -0700, Mina Almasry wrote:
> > Right, all I'm saying is that if it's at all possible to keep net_iov
> > something that can be extended with fields unrelated to struct page,
> > lets do that. net_iov already has fields that should not belong in
> > struct page like net_iov_owner and I think more will be added.
>
> Sure, that's fine.
>
Excellent!
> > I'm thinking netmem_desc can be the fields that are shared between
> > struct net_iov and struct page (but both can have more specific to the
> > different memory types). As you say, for now netmem_desc can currently
> > overlap fields in struct page and struct net_iov, and a follow up
> > change can replace it with something that gets kmalloced and (I
> > guess?) there is a pointer in struct page or struct net_iov that
> > refers to the netmem_desc that contains the shared fields.
>
> I'm sure I've pointed you at
> https://kernelnewbies.org/MatthewWilcox/Memdescs before.
>
I've gone through that again. Some of it is a bit over my head
(sorry), but this page does say that page->compound_head will have a
link to memdesc:
https://kernelnewbies.org/MatthewWilcox/Memdescs/Path
That's an approach that sounds fine to me. We can have net_iov follow
that pattern if necessary (have in it a field that points to the
memdesc).
> But I wouldn't expect to have net_iov contain a pointer to netmem_desc,
> rather it would embed a netmem_desc. Unless there's a good reason to
> separate them.
>
net_iov embedding netmem_desc sounds fine as well to me.
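To make the two layouts under discussion concrete, here is a minimal userspace sketch, not kernel code. The field names inside netmem_desc and the struct names net_iov_embedded and net_iov_pointer are illustrative assumptions, as are the helper functions; none of them are taken from the patchset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields shared by struct page and net_iov. */
struct netmem_desc {
	unsigned long pp_magic;
	void *pp;               /* owning page pool, opaque in this sketch */
	unsigned long dma_addr;
	long pp_ref_count;
};

/* Option A: net_iov embeds the shared descriptor (hypothetical layout). */
struct net_iov_embedded {
	struct netmem_desc desc;
	void *owner;            /* net_iov-specific state, unrelated to struct page */
};

/* Option B: net_iov points at a separately allocated descriptor. */
struct net_iov_pointer {
	struct netmem_desc *desc;
	void *owner;
};

/* With embedding at offset 0, a net_iov pointer is also a descriptor pointer. */
static_assert(offsetof(struct net_iov_embedded, desc) == 0,
	      "shared fields must sit at the start of the embedding struct");

/* Embedded case: no extra allocation and no extra pointer chase. */
static inline struct netmem_desc *iov_embedded_desc(struct net_iov_embedded *niov)
{
	return &niov->desc;
}

/* Embedded case: recover the containing net_iov from its descriptor. */
static inline struct net_iov_embedded *desc_to_iov(struct netmem_desc *desc)
{
	return (struct net_iov_embedded *)
		((char *)desc - offsetof(struct net_iov_embedded, desc));
}

/* Pointer case: one more dereference, and desc has its own lifetime. */
static inline struct netmem_desc *iov_pointer_desc(struct net_iov_pointer *niov)
{
	return niov->desc;
}
```

In the embedded layout, net_iov can still grow fields of its own after desc, which is the extensibility asked for earlier in the thread.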
> Actually, I'd hope to do away with net_iov entirely. Networking should
> handle memory-on-PCI-devices the same way everybody else does (as
> hotplugged memory) rather than with its own special structures.
>
Doing away with net_iov entirely is a different conversation. You're
much more of an expert than me, but my experience from the devmem TCP
side is that for the GPU devices we initially built net_iovs for,
dma-buf is the standard way of sharing memory. The dma-buf importer
just gets a scatterlist and has to either work with the scatterlist
directly or create descriptors (like net_iov) to handle chunks of the
scatterlist.
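As a rough illustration of the "create descriptors to handle chunks of the scatterlist" option, the sketch below walks a list of DMA segments and emits one descriptor per fixed-size chunk. It is plain userspace C under stated assumptions: sg_segment, chunk_desc, CHUNK_SIZE and chunk_segments() are hypothetical stand-ins, not the kernel's struct scatterlist, struct net_iov, or any real dma-buf API.

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE 4096u  /* assumed chunk granularity for this sketch */

/* Hypothetical stand-in for one scatterlist entry handed to an importer. */
struct sg_segment {
	uint64_t dma_addr;
	size_t len;
};

/* Hypothetical stand-in for a net_iov-like per-chunk descriptor. */
struct chunk_desc {
	uint64_t dma_addr;
};

/*
 * Fill out[] with one descriptor per full CHUNK_SIZE piece of each segment.
 * A trailing partial chunk in a segment is skipped. Returns the number of
 * descriptors written, which never exceeds max_out.
 */
static size_t chunk_segments(const struct sg_segment *segs, size_t nsegs,
			     struct chunk_desc *out, size_t max_out)
{
	size_t n = 0;

	for (size_t i = 0; i < nsegs; i++) {
		for (size_t off = 0; off + CHUNK_SIZE <= segs[i].len;
		     off += CHUNK_SIZE) {
			if (n == max_out)
				return n;
			out[n++].dma_addr = segs[i].dma_addr + off;
		}
	}
	return n;
}
```

Each descriptor then covers a uniform, page-pool-sized piece of device memory, regardless of how the exporter laid out the segments.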
I think we discussed this before and you told me you have long-term
plans to get rid of scatterlists. Once that is done we may be able to
do away with the dma-buf use case for net_iovs, but the conversation
about migrating scatterlists to something new is well over my head and
probably needs discussion with the dma-buf maintainers.
Note also that the users of net_iov have expanded and io_uring has a
dependency on it as well.
The good news (I think) is that Byungchul's effort does not require
the removal of net_iov. From looking at this patchset I think what
he's trying to do is very compatible with net_iovs with minor
modifications.
--
Thanks,
Mina