Message-ID: <CAHS8izOF5dM7WUrzDhGrR_UP7t_Mg7=sgti_TSbqG4x00UBfXA@mail.gmail.com>
Date: Wed, 9 Oct 2024 12:32:37 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: Jens Axboe <axboe@...nel.dk>
Cc: David Wei <dw@...idwei.uk>, io-uring@...r.kernel.org, netdev@...r.kernel.org,
Pavel Begunkov <asml.silence@...il.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jesper Dangaard Brouer <hawk@...nel.org>, David Ahern <dsahern@...nel.org>
Subject: Re: [PATCH v1 00/15] io_uring zero copy rx
On Wed, Oct 9, 2024 at 9:57 AM Jens Axboe <axboe@...nel.dk> wrote:
>
> On 10/9/24 10:55 AM, Mina Almasry wrote:
> > On Mon, Oct 7, 2024 at 3:16 PM David Wei <dw@...idwei.uk> wrote:
> >>
> >> This patchset adds support for zero copy rx into userspace pages using
> >> io_uring, eliminating a kernel to user copy.
> >>
> >> We configure a page pool that a driver uses to fill a hw rx queue to
> >> hand out user pages instead of kernel pages. Any data that ends up
> >> hitting this hw rx queue will thus be dma'd into userspace memory
> >> directly, without needing to be bounced through kernel memory. 'Reading'
> >> data out of a socket instead becomes a _notification_ mechanism, where
> >> the kernel tells userspace where the data is. The overall approach is
> >> similar to the devmem TCP proposal.
> >>
> >> This relies on hw header/data split, flow steering and RSS to ensure
> >> packet headers remain in kernel memory and only desired flows hit a hw
> >> rx queue configured for zero copy. Configuring this is outside of the
> >> scope of this patchset.
> >>
> >> We share netdev core infra with devmem TCP. The main difference is that
> >> io_uring is used for the uAPI and the lifetimes of all objects are bound
> >> to an io_uring instance.
> >
> > I've been thinking about this a bit, and I hope this feedback isn't
> > too late, but I think your work may be useful for users not using
> > io_uring. I.e. zero copy to host memory that is not dependent on page
> > aligned MSS sizing. I.e. AF_XDP zerocopy but using the TCP stack.
>
> Not David, but come on, let's please get this moving forward. It's been
> stuck behind dependencies for seemingly forever, which are finally
> resolved.
Part of the reason this has been stuck behind dependencies for so long
is that the dependency took the time to implement things very
generically (memory providers, net_iovs) and provided you with the
primitives that enable your work. It also dealt with nacks in this area
that you now don't have to deal with.
> I don't think this is a reasonable ask at all for this
> patchset. If you want to work on that after the fact, then that's
> certainly an option.
I think this work is extensible to sockets and the implementation need
not be heavily tied to io_uring. At the very least, leaving things open
so that a socket extension is easier to do in the future would be good, IMO.
I'll look at the series more closely to see if I actually have any
concrete feedback along these lines. I hope you're open to some of it
:-)
--
Thanks,
Mina