netdev - Re: [net-next v3 0/8] Perf. optimizations for TCP Recv. Zerocopy

Message-ID: <20201204143816.7e59877c@kicinski-fedora-pc1c0hjn.DHCP.thefacebook.com>
Date:   Fri, 4 Dec 2020 14:38:21 -0800
From:   Jakub Kicinski <kuba@...nel.org>
To:     Arjun Roy <arjunroy.kdev@...il.com>
Cc:     davem@...emloft.net, netdev@...r.kernel.org, arjunroy@...gle.com,
        edumazet@...gle.com, soheil@...gle.com
Subject: Re: [net-next v3 0/8] Perf. optimizations for TCP Recv. Zerocopy

On Wed,  2 Dec 2020 14:53:41 -0800 Arjun Roy wrote:
> Summarized:
> 1. It is possible that a read payload is not exactly page aligned -
> that there may exist "straggler" bytes that we cannot map into the
> caller's address space cleanly. For this, we allow the caller to
> provide as argument a "hybrid copy buffer", turning
> getsockopt(TCP_ZEROCOPY_RECEIVE) into a "hybrid" operation that allows
> the caller to avoid a subsequent recvmsg() call to read the
> stragglers.
> 
> 2. Similarly, for "small" read payloads that are either below the size
> of a page, or small enough that remapping pages is not a performance
> win - we allow the user to short-circuit the remapping operations
> entirely and simply copy into the buffer provided.
> 
> Some of the patches in the middle of this set are refactors to support
> this "short-circuiting" optimization.
> 
> 3. We allow the user to provide a hint that performing a page zap
> operation (and the accompanying TLB shootdown) may not be necessary,
> for the provided region that the kernel will attempt to map pages
> into. This allows us to avoid this expensive operation while holding
> the socket lock, which provides a significant performance advantage.
> 
> With all of these changes combined, "medium" sized receive traffic
> (multiple tens to a few hundred KB) sees significant efficiency gains
> when using TCP receive zerocopy instead of regular recvmsg(). For
> example, with RPC-style traffic with 32KB messages, there is a roughly
> 15% efficiency improvement when using zerocopy. Without these changes,
> there is a roughly 60-70% efficiency reduction with such messages when
> employing zerocopy.

Applied, thank you!
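
For illustration, points 1 and 2 of the quoted summary can be sketched as plain userspace arithmetic. This is not code from the patch set: `plan_receive()`, `struct recv_plan`, and the `copy_threshold` parameter are hypothetical names, and the sketch assumes the payload starts on a page boundary, which the real receive path does not guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the kernel or of this patch set. */
struct recv_plan {
    size_t map_bytes;   /* whole pages that could be remapped */
    size_t copy_bytes;  /* bytes that would go to the hybrid copy buffer */
};

/*
 * Split a payload into a page-aligned part (candidate for remapping)
 * and "straggler" bytes (candidate for copying). Payloads smaller than
 * copy_threshold are copied entirely, mirroring the short-circuit idea.
 * Assumption: the payload begins on a page boundary.
 */
static struct recv_plan plan_receive(size_t payload_len, size_t page_size,
                                     size_t copy_threshold)
{
    struct recv_plan plan = {0, 0};

    assert(page_size > 0);

    if (payload_len < copy_threshold) {
        /* Small read: skip remapping and copy everything. */
        plan.copy_bytes = payload_len;
        return plan;
    }

    /* Stragglers are whatever does not fill a whole page. */
    plan.copy_bytes = payload_len % page_size;
    plan.map_bytes = payload_len - plan.copy_bytes;
    return plan;
}
```

With 4096-byte pages and a 4096-byte threshold, a 32KB message would be mapped in full, while a 10000-byte message would split into 8192 mapped bytes and 1808 copied bytes. The threshold here is an arbitrary choice for the example; the summary only says the cutoff is wherever remapping stops being a performance win.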