Message-ID: <CAHS8izP-6mKM1vEELjRXRj09qwSh_tCDdwA3TWxVuSOYNBGYeA@mail.gmail.com>
Date: Thu, 5 Jun 2025 11:59:24 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: David Howells <dhowells@...hat.com>
Cc: Stanislav Fomichev <stfomichev@...il.com>, willy@...radead.org, hch@...radead.org,
Jakub Kicinski <kuba@...nel.org>, Eric Dumazet <edumazet@...gle.com>, netdev@...r.kernel.org,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: Device mem changes vs pinning/zerocopy changes
On Wed, Jun 4, 2025 at 7:56 AM David Howells <dhowells@...hat.com> wrote:
>
> Stanislav Fomichev <stfomichev@...il.com> wrote:
>
> > > (1) Separate fragment lifetime management from sk_buff. No more wangling
> > > of refcounts in the skbuff code. If you clone an skb, you stick an
> > > extra ref on the lifetime management struct, not the page.
> >
> > For device memory TCP we already have this: net_devmem_dmabuf_binding
> > is the owner of the frags. And when we reference skb frag we reference
> > only this owner, not individual chunks: __skb_frag_ref -> get_netmem ->
> > net_devmem_get_net_iov (ref on the binding).
> >
> > Will it be possible to generalize this to cover MSG_ZEROCOPY and splice
> > cases? From what I can tell, this is somewhat equivalent of your net_txbuf.
>
> Yes and no. The net_devmem stuff that's now upstream still manages refs on a
> per-skb-frag basis.
Actually, Stan may be right here: something similar to the net_devmem
model may be what you want.
The net_devmem stuff actually never grabs references on the frags
themselves, as Stan explained (which is what you want). We have an
object 'net_devmem_dmabuf_binding', which represents a chunk of pinned
devmem passed from userspace. When the net stack asks for a ref on a
frag, we grab a ref on the binding the frag belongs to in this call
path that Stan pointed to:
__skb_frag_ref -> get_netmem -> net_devmem_get_net_iov (ref on the binding).
This sounds eerily similar to what you want to do. You could have a
new struct (net_zcopy_mem) which represents a chunk of zerocopy memory
that you've pinned using GUP or whatever the correct API is. Then
when the net stack wants a ref on a frag, you (somehow) figure out
which net_zcopy_mem it belongs to, and you grab a ref on the struct
rather than the frag.
Then when the refcount of net_zcopy_mem hits 0, you know you can
un-GUP the zcopy memory. I think that model in general may work. But
also it may be a case of everything looking like a nail to someone
with a hammer.
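To make the idea concrete, here is a rough userspace sketch of that model. It is only an illustration, not kernel code: net_zcopy_mem is the name floated above, while struct zc_frag and all the helper functions are hypothetical names made up for this example. C11 atomics stand in for the kernel's refcounting, and a bool stands in for the GUP pin.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical owner of one pinned zerocopy region. */
struct net_zcopy_mem {
	atomic_int refcount;	/* pinner's ref plus one per frag ref */
	bool pinned;		/* stand-in for the GUP pin on the pages */
};

/* Hypothetical frag: it records its owner and never holds a page ref. */
struct zc_frag {
	struct net_zcopy_mem *owner;
	size_t offset;
	size_t len;
};

/* Called once when the region is pinned; the pinner holds the first ref. */
static void zcopy_mem_init(struct net_zcopy_mem *mem)
{
	atomic_init(&mem->refcount, 1);
	mem->pinned = true;
}

/* Drop one ref; the last put is the point where the memory is un-GUP'd. */
static void zcopy_mem_put(struct net_zcopy_mem *mem)
{
	if (atomic_fetch_sub(&mem->refcount, 1) == 1)
		mem->pinned = false;	/* real code would unpin the pages here */
}

/* The stack asks for a ref on a frag: take it on the owner instead. */
static void zc_frag_ref(struct zc_frag *frag)
{
	assert(frag->owner->pinned);
	atomic_fetch_add(&frag->owner->refcount, 1);
}

/* Releasing a frag ref only drops the owner's count. */
static void zc_frag_unref(struct zc_frag *frag)
{
	zcopy_mem_put(frag->owner);
}
```

In this sketch, cloning an skb would amount to one zc_frag_ref() per frag, and the region stays pinned until the pinner and every frag holder have dropped their refs. How a frag finds its owner is the part waved away as "somehow" above; here it is simply a pointer stored in the frag, which is an assumption of the example.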
Better yet, we already have in the code a struct that represents
zerocopy memory, struct ubuf_info_msgzc. Instead of inventing a new
struct, could you reuse this one to do the memory pinning and
refcounting on behalf of the memory underneath?
--
Thanks,
Mina