Message-ID: <CAHS8izMi-yrCRx=VzhBH100MgxCpmQSNsqOLZ9efV+mFeS_Hnw@mail.gmail.com>
Date: Thu, 10 Oct 2024 17:32:34 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: Pavel Begunkov <asml.silence@...il.com>
Cc: David Wei <dw@...idwei.uk>, io-uring@...r.kernel.org, netdev@...r.kernel.org, 
	Jens Axboe <axboe@...nel.dk>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, 
	"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Jesper Dangaard Brouer <hawk@...nel.org>, David Ahern <dsahern@...nel.org>
Subject: Re: [PATCH v1 11/15] io_uring/zcrx: implement zerocopy receive pp
 memory provider

On Thu, Oct 10, 2024 at 2:22 PM Pavel Begunkov <asml.silence@...il.com> wrote:
>
> > page_pool. To make matters worse, the bypass is only there when the
> > netmems are returned from io_uring, not when the netmems are
> > returned from the driver/TCP stack. I'm guessing that if you reused
> > the page_pool recycling in the io_uring return path, it would remove
> > the need for your provider to implement its own recycling for the
> > io_uring return case.
> >
> > Is letting providers bypass and override the page_pool's recycling in
> > some code paths OK? IMO, no. A maintainer will make the judgement call
>
> Mina, frankly, that's nonsense. If we extend the same logic,
> devmem overrides page allocation rules with callbacks, devmem
> overrides and violates page pool buffer lifetimes by extending
> it to user space, devmem violates and overrides the page pool
> object lifetime by binding buffers to sockets. And all of that
> I'd rather describe as extending and enhancing the page pool to
> fit the devmem use case.
>
> > and speak authoritatively here and I will follow, but I do think it's
> > a (much) worse design.
>
> Sure, I have the completely opposite opinion: it's a much
> better approach than returning through a syscall. But I will
> agree with you that ultimately the maintainers will say whether
> that's acceptable for networking or not.
>

Right, I'm not suggesting that you return the pages through a syscall.
That would add syscall overhead where it's better not to have any,
especially in an io_uring context. Devmem TCP needed a syscall because
I couldn't figure out a non-syscall way, with sockets, for userspace
to tell the kernel that it's done with some netmems. You do not need
to follow that at all; sorry if I made it seem that way.

What I am suggesting is that when io_uring figures out that userspace
is done with a netmem, you feed that netmem back to the pp and use the
pp's recycling, rather than adding your own recycling in the provider.
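
To make that concrete, here is a rough sketch of what I have in mind.
It is only an illustration: the io_zcrx_* name is made up, and I'm
assuming the netmem_ref flavour of napi_pp_put_page() from recent
net-next, which may not match your tree.

#include <linux/skbuff.h>		/* napi_pp_put_page() */
#include <net/netmem.h>			/* netmem_ref */

/* Hypothetical: called when the refill queue says userspace is done
 * with this buffer.
 */
static void io_zcrx_recycle_netmem(netmem_ref netmem)
{
	/*
	 * Hand the buffer back to its owning page_pool so the pp's own
	 * recycling (alloc cache / ptr_ring) is reused, instead of a
	 * provider-private cache.
	 */
	if (!napi_pp_put_page(netmem))
		WARN_ON_ONCE(1);	/* not a pp-managed netmem */
}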

From your commit message:

"we extend the lifetime by recycling buffers only after the user space
acknowledges that it's done processing the data via the refill queue"

It seems to me that you get some signal from userspace via that refill
queue that the data is ready to be reused (whatever the queue is; very
sorry, I'm not that familiar with io_uring). My suggestion is that when
userspace tells you a netmem is ready for reuse (however it does that),
you feed that netmem back to the pp via something like
napi_pp_put_page() or page_pool_put_page_bulk(), if that makes sense to
you. FWIW I'm trying to look through your code to understand what that
refill queue is and where - if anywhere - it may be possible to feed
netmems back to the pp, rather than directly to the provider.
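
If the refill queue is drained in batches, the bulk API might fit even
better. Again just an illustration, not a patch: the names and the
batching are my assumptions, as is the current
page_pool_put_page_bulk(pool, void **data, int count) signature.

#include <net/page_pool/helpers.h>	/* page_pool_put_page_bulk() */

/* Hypothetical: return a whole batch of buffers that userspace has
 * released via the refill queue.
 */
static void io_zcrx_recycle_batch(struct page_pool *pp,
				  struct page **pages, int nr)
{
	/* Recycle the batch through the pp's bulk return path. */
	page_pool_put_page_bulk(pp, (void **)pages, nr);
}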

-- 
Thanks,
Mina
