netdev - Re: [GIT PULL] io_uring support for zerocopy send

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1bbb9374-c503-37c6-45d8-476a8b761d4a@kernel.dk>
Date:   Wed, 3 Aug 2022 10:39:55 -0600
From:   Jens Axboe <axboe@...nel.dk>
To:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Pavel Begunkov <asml.silence@...il.com>
Cc:     io-uring <io-uring@...r.kernel.org>,
        netdev <netdev@...r.kernel.org>
Subject: Re: [GIT PULL] io_uring support for zerocopy send

On 8/2/22 2:45 PM, Linus Torvalds wrote:
> On Sun, Jul 31, 2022 at 8:03 AM Jens Axboe <axboe@...nel.dk> wrote:
>>
>> On top of the core io_uring changes, this pull request adds support for
>> efficient support for zerocopy sends through io_uring. Both ipv4 and
>> ipv6 is supported, as well as both TCP and UDP.
> 
> I've pulled this, but I would *really* have wanted to see real
> performance numbers from real loads.
> 
> Zero-copy networking has decades of history (and very much not just in
> Linux) of absolutely _wonderful_ benchmark numbers, but less-than
> impressive take-up on real loads.
> 
> A lot of the wonderful benchmark numbers are based on loads that
> carefully don't touch the data on either the sender or receiver side,
> and that get perfect behavior from a performance standpoint as a
> result, but don't actually do anything remotely realistic in the
> process.
> 
> Having data that never resides in the CPU caches, or having mappings
> that are never written to and thus never take page faults are classic
> examples of "look, benchmark numbers!".
> 
> Please?

That's a valid concern! One of the key points behind Pavel's work is
that we wanted to make zerocopy _actually_ work with smaller payloads. A
lot of the past work has been focused on (or only useful with) bigger
payloads, which then almost firmly lands it in the realm of "looks good
on streamed benchmarks". If you look at the numbers Pavel posted, it's
definitely firmly in benchmark land, but I do think the goals of
breaking even with non zero-copy for realistic payload sizes is the real
differentiator here.

For the io_uring network developments, Dylan wrote a benchmark that we
use to mimic things like Thrift. Yes it's a benchmark, but it's meant to
model real world things, not just measure ping-pongs or streamed
bandwidth. It's actually helped drive various of the more recent
features, as well as things coming in the next release, and been very
useful as a research vehicle for adding real io_uring support to Thrift.
The latter is why it was created in the first place, not to have Yet
Another benchmark that can just spew meaningless numbers. Zero-copy is
being added there too, and we just talked about adding some more tweaks
to netbench that allows it to model data/cache usage too on both ends.

The Thrift work is what is really driving this, but it isn't quite done
yet. Looking very promising vs epoll now, though, we'll make some more
noise about this once it lands. Moving to a completion based model takes
a bit of time, it's not a quick hack conversion where you just switch to
a different notification base.

-- 
Jens Axboe