netdev - Re: make sendmsg/recvmsg process multiple messages at once

Date:   Tue, 02 Feb 2021 11:18:38 +0100
From:   Paolo Abeni <pabeni@...hat.com>
To:     Jakub Kicinski <kuba@...nel.org>,
        Menglong Dong <menglong8.dong@...il.com>,
        Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc:     netdev <netdev@...r.kernel.org>
Subject: Re: make sendmsg/recvmsg process multiple messages at once

On Mon, 2021-02-01 at 20:07 -0800, Jakub Kicinski wrote:
> On Mon, 1 Feb 2021 20:41:45 +0800 Menglong Dong wrote:
> > I am thinking about making sendmsg/recvmsg process multiple messages
> > at once, which is possible to reduce the number of system calls.
> > 
> > Take the receiving of udp as an example, we can copy multiple skbs to
> > msg_iov and make sure that every iovec contains a udp package.
> > 
> > Is this a good idea? This idea seems clumsy compared to the incoming
> > 'io-uring' based zerocopy, but maybe it can help...

Indeed, since the introduction of several security vulnerability
mitigations, syscall overhead is relevant, and amortizing it with bulk
operations gives a very measurable performance gain.

Potentially, bulk operations also reduce RETPOLINE overhead, but AFAICS
all the indirect calls in the relevant code path have already been
mitigated with the indirect call wrappers.

Note that you can already process several packets with a single syscall
using sendmmsg/recvmmsg. Both have issues with error reporting and
timeout and IIRC still don't amortize the overhead introduced e.g. by
CONFIG_HARDENED_USERCOPY.

Additionally, recvmmsg/sendmmsg are not cache-friendly. As noted by
Eric a long time ago:

https://marc.info/?l=linux-netdev&m=148010858826712&w=2

perf tests in the lab with recvmmsg/sendmmsg can look great, but the
gain with real workloads is much smaller. You could try fine-tuning the
bulk size (mmsg nr) for your workload and H/W. Likely a burst size
above 8 is a no go.
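
For reference, a minimal sketch of a batched receive with recvmmsg(),
assuming a datagram socket. The names recv_batch, BATCH and MAX_DGRAM
are made up for illustration; BATCH is 8 only to match the burst size
mentioned above, and is the knob you would tune.

```c
#define _GNU_SOURCE            /* recvmmsg() and struct mmsghdr */
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define BATCH     8            /* burst size: tune per workload and H/W */
#define MAX_DGRAM 2048         /* per-datagram buffer size (assumption) */

/*
 * Receive up to n datagrams (n <= BATCH) with a single syscall.
 * Returns the number of datagrams received, or -1 with errno set.
 * lens[i] is set to the length of the datagram stored in bufs[i].
 */
static int recv_batch(int fd, char bufs[][MAX_DGRAM],
                      unsigned int *lens, unsigned int n)
{
    struct mmsghdr msgs[BATCH];
    struct iovec iovs[BATCH];

    assert(n <= BATCH);
    memset(msgs, 0, sizeof(msgs));
    for (unsigned int i = 0; i < n; i++) {
        /* one iovec per message, so each slot holds one datagram */
        iovs[i].iov_base = bufs[i];
        iovs[i].iov_len = MAX_DGRAM;
        msgs[i].msg_hdr.msg_iov = &iovs[i];
        msgs[i].msg_hdr.msg_iovlen = 1;
    }

    /* MSG_DONTWAIT: return whatever is queued instead of blocking */
    int got = recvmmsg(fd, msgs, n, MSG_DONTWAIT, NULL);
    for (int i = 0; i < got; i++)
        lens[i] = msgs[i].msg_len;
    return got;
}
```

Note that an error hit after the first datagram is not reported by
this call: recvmmsg() returns the count received so far, which is part
of the error reporting problem mentioned above.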

For the TX path there is already a better option - for some specific
workloads - using UDP_SEGMENT.
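
A minimal sketch of turning on UDP_SEGMENT for a socket, assuming a
kernel with UDP GSO support. The helper name enable_udp_gso is made up
for illustration, and the fallback #define is only there for older libc
headers that lack the constant.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/udp.h>
#include <sys/socket.h>

#ifndef UDP_SEGMENT
#define UDP_SEGMENT 103        /* value from linux/udp.h */
#endif

/*
 * Ask the kernel to split every large send on fd into datagrams that
 * carry at most seg_size payload bytes each.
 * Returns 0 on success, -1 with errno set (e.g. on kernels without
 * UDP GSO support).
 */
static int enable_udp_gso(int fd, int seg_size)
{
    assert(seg_size > 0);
    return setsockopt(fd, IPPROTO_UDP, UDP_SEGMENT,
                      &seg_size, sizeof(seg_size));
}
```

After this, a single send() of e.g. 250 bytes with seg_size 100 should
go out as three datagrams of 100, 100 and 50 bytes, with only one
syscall and one trip through the stack.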

In the RX path, for bulk transfer, you could try enabling UDP_GRO.
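
A minimal sketch of the receiver side with UDP_GRO, assuming a kernel
with UDP GRO support. The helper names enable_udp_gro and recv_gro are
made up for illustration. With the option set, one recvmsg() may return
several datagrams coalesced into one buffer, and the segment size comes
back in a UDP_GRO cmsg.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/udp.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

#ifndef UDP_GRO
#define UDP_GRO 104            /* value from linux/udp.h */
#endif

/* Returns 0 on success, -1 with errno set. */
static int enable_udp_gro(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_UDP, UDP_GRO, &on, sizeof(on));
}

/*
 * Receive one (possibly coalesced) buffer. On return *seg_size is the
 * size of each coalesced datagram, or 0 if the kernel attached no
 * UDP_GRO cmsg, i.e. the buffer holds a single plain datagram.
 */
static ssize_t recv_gro(int fd, void *buf, size_t len, int *seg_size)
{
    union {                    /* union keeps the cmsg buffer aligned */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg = {
        .msg_iov = &iov,
        .msg_iovlen = 1,
        .msg_control = ctrl.buf,
        .msg_controllen = sizeof(ctrl.buf),
    };

    assert(seg_size != NULL);
    *seg_size = 0;

    ssize_t n = recvmsg(fd, &msg, 0);
    if (n < 0)
        return n;

    for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c != NULL;
         c = CMSG_NXTHDR(&msg, c)) {
        if (c->cmsg_level == IPPROTO_UDP && c->cmsg_type == UDP_GRO)
            memcpy(seg_size, CMSG_DATA(c), sizeof(*seg_size));
    }
    return n;
}
```

When *seg_size is non-zero, the caller walks the buffer in seg_size
steps; only the last datagram in the buffer may be shorter.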

As far as I can see, the idea you are proposing would be quite
similar to recvmmsg(), with the possible additional benefit of bulk
dequeue from the UDP receive queue. Note that this latter optimization,
since commit 2276f58ac5890, will give very little performance gain.

In the TX path there is no lock at all for the uncorking case, so the
performance gain should come only from the bulk syscall.

You will probably also need to cope with cmsg and msgname, so overall I
don't see much difference from recvmmsg()/sendmmsg(). Did I misread
something?

Thanks!

Paolo