Message-ID: <bbf9cb47-d022-4711-8d73-b035275519a7@gmail.com>
Date: Mon, 15 Apr 2024 01:08:55 +0100
From: Pavel Begunkov <asml.silence@...il.com>
To: David Ahern <dsahern@...nel.org>, io-uring@...r.kernel.org,
 netdev@...r.kernel.org
Cc: Jens Axboe <axboe@...nel.dk>, "David S. Miller" <davem@...emloft.net>,
 Jakub Kicinski <kuba@...nel.org>, Eric Dumazet <edumazet@...gle.com>,
 Willem de Bruijn <willemdebruijn.kernel@...il.com>
Subject: Re: [RFC 0/6] implement io_uring notification (ubuf_info) stacking

On 4/13/24 18:17, David Ahern wrote:
> On 4/12/24 6:55 AM, Pavel Begunkov wrote:
>> io_uring allocates a ubuf_info per zerocopy send request. That's convenient
>> for userspace, but as things stand it means the TCP stack has to allocate
>> a new skb for every request instead of appending to a previous one. Unless
>> sends are large enough, that produces lots of small skbs, straining the
>> stack and hurting performance.
> 
> The ubuf_info forces TCP segmentation at less-than-MTU boundaries, which
> kills performance with small message sizes, as TCP is forced to send
> small packets. This is an interesting solution that allows the byte stream
> to flow while maintaining the segmentation boundaries for the callbacks.

Thanks, I'll add your reviews if the patches survive in their
current form!
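
To spell out why a per-request ubuf_info defeats coalescing, here is a
minimal user-space model (my simplification, not the actual kernel code;
the struct and function names below are made up for illustration):

#include <stddef.h>

/* Toy stand-ins for the kernel structures */
struct ubuf_info { int refcnt; };
struct sk_buff  { struct ubuf_info *zcopy_uarg; size_t len; };

/*
 * An skb can be associated with only one notification. Returns 0 if
 * the new bytes may extend the tail skb, -1 if the stack must allocate
 * a fresh skb -- which, before this patchset, happens on every io_uring
 * zerocopy send, since each request brings its own ubuf_info.
 */
static int can_append(struct sk_buff *tail, struct ubuf_info *uarg)
{
	if (tail->zcopy_uarg && tail->zcopy_uarg != uarg)
		return -1;
	tail->zcopy_uarg = uarg;
	return 0;
}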


>> The patchset implements notification (i.e. io_uring's ubuf_info
>> extension) stacking. It links ubuf_info's into a list, and the
>> entire chain is put down together once all references are gone.
>>
>> Testing with liburing/examples/send-zerocopy and another custom-made
>> tool: with 4K bytes per send it improves performance ~6x, bringing it
>> on par with MSG_ZEROCOPY. Without the patchset, much larger sends are
>> needed to utilise the full potential.
>>
>> bytes  | before | after (Kqps)
>> 100    | 283    | 936
>> 1200   | 195    | 1023
>> 4000   | 193    | 1386
>> 8000   | 154    | 1058
>>
>> Pavel Begunkov (6):
>>    net: extend ubuf_info callback to ops structure
>>    net: add callback for setting a ubuf_info to skb
>>    io_uring/notif: refactor io_tx_ubuf_complete()
>>    io_uring/notif: remove ctx var from io_notif_tw_complete
>>    io_uring/notif: simplify io_notif_flush()
>>    io_uring/notif: implement notification stacking
>>
>>   drivers/net/tap.c      |  2 +-
>>   drivers/net/tun.c      |  2 +-
>>   drivers/vhost/net.c    |  8 +++-
>>   include/linux/skbuff.h | 21 ++++++----
>>   io_uring/notif.c       | 91 +++++++++++++++++++++++++++++++++++-------
>>   io_uring/notif.h       | 13 +++---
>>   net/core/skbuff.c      | 37 +++++++++++------
>>   7 files changed, 129 insertions(+), 45 deletions(-)
>>
> 
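
For the record, the stacking idea boils down to roughly the following
(a hand-wavy user-space sketch, not the actual implementation; the real
patches keep per-notification refcounts and completion callbacks):

#include <stdio.h>

struct notif {
	int refs;
	struct notif *next;	/* notification stacked on top of this one */
};

/* Append a new notification to the chain instead of forcing a new skb */
static void notif_stack(struct notif *head, struct notif *nw)
{
	struct notif *n = head;

	while (n->next)
		n = n->next;
	n->next = nw;
}

/*
 * Drop a reference; once the chain is unreferenced, the whole linked
 * chain completes together in one pass.
 */
static void notif_put(struct notif *head)
{
	if (--head->refs)
		return;
	for (struct notif *n = head; n; n = n->next)
		printf("complete notif %p\n", (void *)n);
}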

-- 
Pavel Begunkov
