Message-ID: <20201111005922.h55aiqcs325bvhk7@bsd-mbp.dhcp.thefacebook.com>
Date:   Tue, 10 Nov 2020 16:59:22 -0800
From:   Jonathan Lemon <jonathan.lemon@...il.com>
To:     Victor Stewart <v@...etag.social>
Cc:     netdev@...r.kernel.org
Subject: Re: MSG_ZEROCOPY_FIXED

On Wed, Nov 11, 2020 at 12:20:22AM +0000, Victor Stewart wrote:
> On Wed, Nov 11, 2020 at 12:09 AM Jonathan Lemon
> <jonathan.lemon@...il.com> wrote:
> >
> > On Sun, Nov 08, 2020 at 05:04:41PM +0000, Victor Stewart wrote:
> > > hi all,
> > >
> > > i'm seeking input / comment on the idea of implementing full fledged
> > > zerocopy UDP networking that uses persistent buffers allocated in
> > > userspace... before I go off on a solo tangent with my first patches
> > > lol.
> > >
> > > i'm sure there's been lots of thought/discussion on this before. of
> > > course Willem added MSG_ZEROCOPY on the send path (pin buffers on
> > > demand / per send). and something similar to what I speak of exists
> > > with TCP_ZEROCOPY_RECEIVE.
> > >
> > > i envision something like a new flag like MSG_ZEROCOPY_FIXED that
> > > "does the right thing" in the send vs recv paths.
> >
> > See the netgpu patches that I posted earlier; these will handle
> > protocol independent zerocopy sends/receives.  I do have a working
> > UDP receive implementation which will be posted with an updated
> > patchset.
> 
> amazing i'll check it out. thanks.
> 
> does your udp zerocopy receive use mmap-ed buffers then vm_insert_pfn
> / remap_pfn_range to remap the physical pages of the received payload
> into the memory submitted by recvmsg for reception?

The application mmaps buffers, which are then pinned into the kernel.
The NIC receives directly into the buffers and then notifies the application.
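
As a rough userspace illustration of the buffer flow just described, the sketch below carves one anonymous mapping into fixed-size buffers and lets a stand-in function play the role of the NIC writing a packet and reporting which buffer it used. Everything here is an assumption made for illustration: `fake_nic_receive`, the buffer count, and the `(index, length)` notification shape are hypothetical, and no real pinning or DMA takes place.

```python
import mmap

PAGE = mmap.PAGESIZE
NBUFS = 4  # hypothetical pool size

# One anonymous mapping, split into NBUFS page-sized buffers.
# In the scheme described above, this region would be pinned by the kernel.
region = mmap.mmap(-1, NBUFS * PAGE)
view = memoryview(region)
buffers = [view[i * PAGE:(i + 1) * PAGE] for i in range(NBUFS)]


def fake_nic_receive(index, payload):
    """Stand-in for the NIC writing a packet straight into buffer `index`.

    Returns a toy notification: which buffer was filled and how many bytes.
    """
    buffers[index][:len(payload)] = payload
    return (index, len(payload))


# The application learns where the data landed and reads it in place,
# without a copy into a separate receive buffer.
index, length = fake_nic_receive(2, b"hello")
received = bytes(buffers[index][:length])
```

The point of the sketch is only that the payload already sits in application-owned memory when the notification arrives.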

For completions, the mechanism that I prefer is having one of the
sends tagged with an SO_NOTIFY message.  Then a completion notification
is generated when the buffer corresponding to the NOTIFY is released by
the protocol stack.

The notifications could be posted as an io_uring CQE.  (work TBD)
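
The tagged-send completion idea above can be modelled with a small toy queue. This is only a sketch under stated assumptions: SO_NOTIFY is the name proposed in this message, not an existing socket option, and the `ToySendQueue` class, its methods, and the in-order release behaviour are all hypothetical.

```python
from collections import deque


class ToySendQueue:
    """Toy model of in-flight zerocopy sends; not a kernel API."""

    def __init__(self):
        self.in_flight = deque()   # (buffer_id, notify) in submission order
        self.notifications = []    # buffer ids whose release was reported

    def send(self, buffer_id, notify=False):
        # notify=True plays the role of tagging this send with SO_NOTIFY.
        self.in_flight.append((buffer_id, notify))

    def release_next(self):
        """The stack drops its reference to the oldest in-flight buffer.

        Assumes in-order release, which the real stack need not guarantee.
        """
        buffer_id, notify = self.in_flight.popleft()
        if notify:
            self.notifications.append(buffer_id)
        return buffer_id


q = ToySendQueue()
q.send("buf0")
q.send("buf1")
q.send("buf2", notify=True)   # only this send asks for a completion

for _ in range(3):
    q.release_next()

notified = list(q.notifications)
```

Three buffers are released here, but only the tagged one produces a notification, so the application can choose how often it wants to hear about releases.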

> https://lore.kernel.org/io-uring/acc66238-0d27-cd22-dac4-928777a8efbc@gmail.com/T/#t
> 
> ^^ and check the thread from today on the io_uring mailing list going
> into the mechanics of zerocopy sendmsg i have in mind.
> 
> (TLDR; i think it should be io_uring "only" so that we can collapse it
> into a single completion event, aka when the NIC ACKs the
> transmission. and exploiting the asynchrony of io_uring is the only
> way to do this? so you'd submit your sendmsg operation to io_uring and
> instead of receiving a completion event when the send gets enqueued,
> you'd only get it upon failure or NIC ACK).

I think it's likely better to have two completions:
  "this buffer has been submitted", and 
  "this buffer is released by the protocol".

This simplifies handling of errors, cancellations, and short writes.
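
A toy model of the two-completion scheme may make the error and short-write cases easier to see. It is a sketch only: `ToySocket`, `Completion`, the capacity limit, and the use of `-EAGAIN` are hypothetical choices for illustration and do not describe an existing io_uring or socket interface.

```python
import errno
from dataclasses import dataclass


@dataclass
class Completion:
    kind: str       # "submitted" or "released"
    buffer_id: int
    result: int     # bytes accepted, or a negative errno-style value


class ToySocket:
    """Toy model: each send yields up to two completions."""

    def __init__(self, capacity):
        self.capacity = capacity   # bytes the toy stack can still accept
        self.pinned = set()        # buffers the stack still references
        self.cq = []               # completion queue, in posting order

    def submit_send(self, buffer_id, length):
        if self.capacity == 0:
            # Failure is reported right away; no release event will follow.
            self.cq.append(Completion("submitted", buffer_id, -errno.EAGAIN))
            return
        accepted = min(length, self.capacity)   # may be a short write
        self.capacity -= accepted
        self.pinned.add(buffer_id)
        self.cq.append(Completion("submitted", buffer_id, accepted))

    def stack_release(self, buffer_id):
        # The protocol stack is done with the buffer; it may now be reused.
        self.pinned.remove(buffer_id)
        self.cq.append(Completion("released", buffer_id, 0))


sock = ToySocket(capacity=1000)
sock.submit_send(1, 1500)   # short write: only 1000 bytes accepted
sock.submit_send(2, 500)    # fails: no capacity left
sock.stack_release(1)

events = [(c.kind, c.buffer_id, c.result) for c in sock.cq]
```

In this model the application sees the short write and the failure at submission time, and treats buffer reuse as a separate question answered only by the "released" event.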
-- 
Jonathan
