Date:   Fri, 29 Oct 2021 22:05:40 -0400
From:   Talal Ahmad <mailtalalahmad@...il.com>
To:     davem@...emloft.net, netdev@...r.kernel.org
Cc:     arjunroy@...gle.com, edumazet@...gle.com, soheil@...gle.com,
        willemb@...gle.com, dsahern@...nel.org, yoshfuji@...ux-ipv6.org,
        kuba@...nel.org, cong.wang@...edance.com, haokexin@...il.com,
        jonathan.lemon@...il.com, alobakin@...me, pabeni@...hat.com,
        ilias.apalodimas@...aro.org, memxor@...il.com, elver@...gle.com,
        nogikh@...gle.com, vvs@...tuozzo.com,
        Talal Ahmad <talalahmad@...gle.com>
Subject: [PATCH net-next v2 0/2] Accurate Memory Charging For MSG_ZEROCOPY

From: Talal Ahmad <talalahmad@...gle.com>

This series improves the accuracy of msg_zerocopy memory accounting.
At present, when msg_zerocopy is used memory is charged twice for the
data - once when user space allocates it, and then again within
__zerocopy_sg_from_iter. The memory charging in the kernel is excessive
because data is held in user pages and is never actually copied to skb
fragments. This leads to incorrectly inflated memory statistics for
programs passing MSG_ZEROCOPY.

We reduce this inaccuracy by introducing the notion of "pure" zerocopy
SKBs - where all the frags in the SKB are backed by pinned userspace
pages, and none are backed by copied pages. For such SKBs, tracked via
the new SKBFL_PURE_ZEROCOPY flag, we elide sk_mem_charge/uncharge
calls, leading to more accurate accounting.

However, SKBs can also be coalesced by the stack at present,
potentially leading to "impure" SKBs. We restrict this coalescing so
it can only happen within the sendmsg() system call itself, for the
most recently allocated SKB. While this can lead to a small degree of
double-charging of memory, this case does not arise often in practice
for workloads that set MSG_ZEROCOPY.
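The coalescing restriction can be sketched as a predicate; the helper
name and parameters below are illustrative, not the kernel's actual
interface:

```c
#include <assert.h>
#include <stdbool.h>

#define SKBFL_PURE_ZEROCOPY (1 << 1)

struct sk_buff {
    unsigned int flags;
};

/* Allow merging a new fragment into 'tail' only when the zerocopy
 * purity of the skb is preserved, or when 'tail' is the skb most
 * recently allocated within the current sendmsg() call, where the
 * accounting can still be fixed up before the skb is queued. */
static bool can_coalesce(const struct sk_buff *tail,
                         bool frag_is_zerocopy,
                         bool tail_is_most_recent)
{
    bool tail_pure = (tail->flags & SKBFL_PURE_ZEROCOPY) != 0;

    if (tail_pure == frag_is_zerocopy)
        return true;                 /* purity preserved */
    return tail_is_most_recent;      /* only within sendmsg() itself */
}
```

The small double-charge mentioned above corresponds to the
tail_is_most_recent case: a copied fragment may still land in a pure
skb, but only at that one point in sendmsg().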

Testing verified that kernel memory usage is lowered. Instrumentation
with counters also showed that memory charging and uncharging remain
balanced.

Talal Ahmad (2):
  tcp: rename sk_wmem_free_skb
  net: avoid double accounting for pure zerocopy skbs

 include/linux/skbuff.h | 19 ++++++++++++++++++-
 include/net/sock.h     |  7 -------
 include/net/tcp.h      | 15 +++++++++++++--
 net/core/datagram.c    |  3 ++-
 net/core/skbuff.c      |  3 ++-
 net/ipv4/tcp.c         | 28 +++++++++++++++++++++++-----
 net/ipv4/tcp_output.c  |  9 ++++++---
 7 files changed, 64 insertions(+), 20 deletions(-)

-- 
2.33.1.1089.g2158813163f-goog
