Message-ID: <408e230c-d098-4ecf-a5b1-1fae9daadb93@gmail.com>
Date: Wed, 19 Feb 2025 12:11:40 +0000
From: Pavel Begunkov <asml.silence@...il.com>
To: Jinjie Ruan <ruanjinjie@...wei.com>, io-uring@...r.kernel.org,
 netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Cc: "David S . Miller" <davem@...emloft.net>, Jakub Kicinski
 <kuba@...nel.org>, Jonathan Lemon <jonathan.lemon@...il.com>,
 Willem de Bruijn <willemb@...gle.com>, Jens Axboe <axboe@...nel.dk>,
 David Ahern <dsahern@...nel.org>, kernel-team@...com
Subject: Re: [PATCH net-next v5 00/27] io_uring zerocopy send

On 2/18/25 01:47, Jinjie Ruan wrote:
> On 2022/7/13 4:52, Pavel Begunkov wrote:
>> NOTE: Not to be picked directly. After getting necessary acks, I'll be
>>        working out merging with Jakub and Jens.
>>
>> The patchset implements io_uring zerocopy send. It works with both registered
>> and normal buffers, mixing is allowed but not recommended. Apart from usual
>> request completions, just as with MSG_ZEROCOPY, io_uring separately notifies
>> the userspace when buffers are freed and can be reused (see API design below),
>> which is delivered into io_uring's Completion Queue. Those "buffer-free"
>> notifications are not necessarily per request, but the userspace has control
>> over it and should explicitly attach a number of requests to a single
>> notification. The series also adds some internal optimisations when used with
>> registered buffers like removing page referencing.
>>
>> From the kernel networking perspective there are two main changes. The first
>> one is passing ubuf_info into the network layer from io_uring (inside of an
>> in kernel struct msghdr). This allows extra optimisations, e.g. ubuf_info
>> caching on the io_uring side, but also helps to avoid cross-referencing
>> and synchronisation problems. The second part is an optional optimisation
>> removing page referencing for requests with registered buffers.
>>
>> Benchmarking UDP with an optimised version of the selftest (see [1]), which
> 
> Hi Pavel, I'm interested in io_uring zerocopy send, but I can't
> reproduce its performance with the zerocopy send selftest, e.g.
> "bash io_uring_zerocopy_tx.sh 6 udp -m 0/1/2/3 -n 64"; the baseline
> (non-zerocopy) mode even comes out fastest.
> 
>                 MB/s
> NONZC         8379
> ZC            5910
> ZC_FIXED      6294
> MIXED         6350

The selftest uses veth, and zerocopy is effectively disabled for most
virtual devices, or to be specific, "for paths that may loop packets
to receive sockets".

https://lore.kernel.org/netdev/20170803202945.70750-6-willemdebruijn.kernel@gmail.com/

So you get the worst of both: the data is still copied, and you also
pay the overhead for notifications. You can use a dummy device as a
sink with no receiver, but you'll get more realistic numbers if you use
a real device (one that supports the features required for zerocopy).
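
For the dummy-device option, a minimal sketch of the setup could look
like the following. The interface name (dummy0), the address
(fd00::1/64) and the DRY_RUN switch are assumptions made up for
illustration, not something the selftest or liburing provides.

```shell
#!/bin/sh
# Sketch: create a dummy netdev to act as a packet sink for send tests.
# DEV and ADDR are hypothetical defaults chosen for illustration only.
DEV="${DEV:-dummy0}"
ADDR="${ADDR:-fd00::1/64}"

# With DRY_RUN=1 the command is printed instead of executed, so the
# steps can be inspected without root privileges.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create the device, bring it up and give it an on-link IPv6 prefix.
setup_sink() {
    run ip link add "$DEV" type dummy
    run ip link set "$DEV" up
    run ip -6 addr add "$ADDR" dev "$DEV"
}

# Remove the device again once the test is done.
teardown_sink() {
    run ip link del "$DEV"
}
```

After sourcing the file, "DRY_RUN=1 setup_sink" prints the three ip
commands; calling setup_sink as root applies them. Traffic sent to
another address in the same prefix (e.g. fd00::2) should then be routed
to the dummy device, which discards it.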

> And the zero-copy example in [1] does not seem to work because the
> kernel was changed by the following series:
> 
> https://lore.kernel.org/all/cover.1662027856.git.asml.silence@gmail.com/

The right version was merged long ago and sits in

liburing/examples/send-zerocopy.c

It's more polished than the selftest version, so I'd suggest using
that one. The arguments are a bit different, but it prints help.

./send-zerocopy -6 udp -D <ip> -t 10 -n 1 -l0 -b1 -d -z1

-- 
Pavel Begunkov

