Message-ID: <20230620145338.1300897-1-dhowells@redhat.com>
Date:   Tue, 20 Jun 2023 15:53:19 +0100
From:   David Howells <dhowells@...hat.com>
To:     netdev@...r.kernel.org
Cc:     David Howells <dhowells@...hat.com>,
        Alexander Duyck <alexander.duyck@...il.com>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Willem de Bruijn <willemdebruijn.kernel@...il.com>,
        David Ahern <dsahern@...nel.org>,
        Matthew Wilcox <willy@...radead.org>,
        Jens Axboe <axboe@...nel.dk>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: [PATCH net-next v3 00/18] splice, net: Switch over users of sendpage() and remove it

Here's the final set of patches towards the removal of sendpage.  All the
drivers that use sendpage() get switched over to using sendmsg() with
MSG_SPLICE_PAGES.

skb_splice_from_iter() is given the facility to copy slab data into
fragments - or to coalesce them (in future) with other unspliced buffers in
the target skbuff.  This means that the caller can work the same way, no
matter whether MSG_SPLICE_PAGES is supplied or not and no matter if the
protocol just ignores it.  If MSG_SPLICE_PAGES is not supplied or if it is
just ignored, the data will get copied as normal rather than being spliced.

For the moment, skb_splice_from_iter() is equipped with its own fragment
allocator - one that has percpu pages to allocate from to deal with
parallel callers but that can also drop the percpu lock around calling the
page allocator.

The following changes are made:

 (1) Introduce an SMP-safe shared fragment allocator and make
     skb_splice_from_iter() use it.  The allocator is exported so that
     ocfs2 can use it.

     This no longer alters the existing page_frag_cache allocator.

 (2) Expose information from the allocator in /proc/.  This is useful for
     debugging it, but could be dropped.

 (3) Make the protocol drivers behave according to MSG_MORE, not
     MSG_SENDPAGE_NOTLAST.  The latter is restricted to turning on MSG_MORE
     in the sendpage() wrappers.

 (4) Make siw, ceph/rds, skb_send_sock, dlm, nvme, smc, ocfs2, drbd and
     iscsi use sendmsg(), not sendpage(), and make them specify MSG_MORE
     instead of MSG_SENDPAGE_NOTLAST.

     ocfs2 now allocates fragments for a couple of cases where it would
     otherwise pass in a pointer to shared data that doesn't seem to have
     sufficient locking.

 (5) Make drbd coalesce its entire message into a single sendmsg().

 (6) Kill off sendpage and clean up MSG_SENDPAGE_NOTLAST.
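
As a rough userspace illustration of the MSG_MORE convention that items (3)
and (4) rely on, the sketch below sends a header and a body as two sendmsg()
calls, flagging only the first with MSG_MORE.  It is not code from this
series: send_part() and demo_msg_more() are hypothetical helper names, and an
AF_UNIX socketpair is assumed purely to keep the example self-contained (there
the flag is only a hint; on TCP it asks the stack to hold back a partial
segment until the final piece arrives).  MSG_SPLICE_PAGES itself is a
kernel-internal flag and is not shown.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send one buffer; 'more' says further data follows. */
ssize_t send_part(int fd, const void *buf, size_t len, int more)
{
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };

	return sendmsg(fd, &msg, more ? MSG_MORE : 0);
}

/*
 * Hypothetical demo: send "HDR:" with MSG_MORE, then "payload" without it,
 * and read everything back from the peer.  Returns the number of bytes
 * received, or -1 on error.
 */
ssize_t demo_msg_more(char *out, size_t outlen)
{
	static const char hdr[] = "HDR:";
	static const char body[] = "payload";
	size_t total = 0;
	ssize_t n;
	int sv[2];

	assert(out != NULL);

	/* Assumption: a local stream socketpair stands in for a TCP socket. */
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	/* All but the last piece carry MSG_MORE; the last piece clears it. */
	if (send_part(sv[0], hdr, strlen(hdr), 1) < 0 ||
	    send_part(sv[0], body, strlen(body), 0) < 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}
	close(sv[0]);	/* EOF for the reader, so the loop below terminates */

	while (total < outlen &&
	       (n = read(sv[1], out + total, outlen - total)) > 0)
		total += (size_t)n;
	close(sv[1]);

	return (ssize_t)total;
}
```

Calling demo_msg_more() with a buffer of at least 11 bytes should hand back
the header and body concatenated in order; the point is only that the sender
marks continuation with MSG_MORE rather than with a sendpage-specific flag.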

I've pushed the patches here also:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=sendpage-3-frag

David

Changes
=======
ver #3)
 - Remove duplicate decl of skb_splice_from_iter().
 - In tcp_bpf, reset msg_flags on each iteration to clear MSG_MORE.
 - In tcp_bpf, set MSG_MORE if there's more data in the sk_msg.
 - Split the nvme patch into host and target patches.

ver #2)
 - Wrapped some lines at 80.
 - Fixed parameter to put_cpu_ptr() to have an '&'.
 - Use "unsigned int" rather than "unsigned".
 - Removed duplicate word in comment.
 - Filled in the commit message on the last patch.

Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=51c78a4d532efe9543a4df019ff405f05c6157f6 # part 1
Link: https://lore.kernel.org/r/20230616161301.622169-1-dhowells@redhat.com/ # v1

David Howells (18):
  net: Copy slab data for sendmsg(MSG_SPLICE_PAGES)
  net: Display info about MSG_SPLICE_PAGES memory handling in proc
  tcp_bpf, smc, tls, espintcp: Reduce MSG_SENDPAGE_NOTLAST usage
  siw: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage to transmit
  ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
  net: Use sendmsg(MSG_SPLICE_PAGES) not sendpage in skb_send_sock()
  ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()
  rds: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
  dlm: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
  nvme/host: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
  nvme/target: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
  smc: Drop smc_sendpage() in favour of smc_sendmsg() + MSG_SPLICE_PAGES
  ocfs2: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()
  drbd: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()
  drdb: Send an entire bio in a single sendmsg
  iscsi: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
  sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)
  net: Kill MSG_SENDPAGE_NOTLAST

 Documentation/bpf/map_sockmap.rst             |  10 +-
 Documentation/filesystems/locking.rst         |   2 -
 Documentation/filesystems/vfs.rst             |   1 -
 Documentation/networking/scaling.rst          |   4 +-
 crypto/af_alg.c                               |  28 --
 crypto/algif_aead.c                           |  22 +-
 crypto/algif_rng.c                            |   2 -
 crypto/algif_skcipher.c                       |  14 -
 drivers/block/drbd/drbd_main.c                |  88 +++---
 drivers/infiniband/sw/siw/siw_qp_tx.c         | 231 +++-------------
 .../chelsio/inline_crypto/chtls/chtls.h       |   2 -
 .../chelsio/inline_crypto/chtls/chtls_io.c    |  14 -
 .../chelsio/inline_crypto/chtls/chtls_main.c  |   1 -
 drivers/nvme/host/tcp.c                       |  46 ++--
 drivers/nvme/target/tcp.c                     |  46 ++--
 drivers/scsi/iscsi_tcp.c                      |  26 +-
 drivers/scsi/iscsi_tcp.h                      |   2 +-
 drivers/target/iscsi/iscsi_target_util.c      |  15 +-
 fs/dlm/lowcomms.c                             |  10 +-
 fs/nfsd/vfs.c                                 |   2 +-
 fs/ocfs2/cluster/tcp.c                        | 109 ++++----
 include/crypto/if_alg.h                       |   2 -
 include/linux/net.h                           |   8 -
 include/linux/skbuff.h                        |   2 +
 include/linux/socket.h                        |   4 +-
 include/net/inet_common.h                     |   2 -
 include/net/sock.h                            |   6 -
 include/net/tcp.h                             |   4 -
 net/appletalk/ddp.c                           |   1 -
 net/atm/pvc.c                                 |   1 -
 net/atm/svc.c                                 |   1 -
 net/ax25/af_ax25.c                            |   1 -
 net/caif/caif_socket.c                        |   2 -
 net/can/bcm.c                                 |   1 -
 net/can/isotp.c                               |   1 -
 net/can/j1939/socket.c                        |   1 -
 net/can/raw.c                                 |   1 -
 net/ceph/messenger_v1.c                       |  58 ++--
 net/ceph/messenger_v2.c                       |  91 ++-----
 net/core/skbuff.c                             | 257 ++++++++++++++++--
 net/core/sock.c                               |  35 +--
 net/dccp/ipv4.c                               |   1 -
 net/dccp/ipv6.c                               |   1 -
 net/ieee802154/socket.c                       |   2 -
 net/ipv4/af_inet.c                            |  21 --
 net/ipv4/tcp.c                                |  43 +--
 net/ipv4/tcp_bpf.c                            |  28 +-
 net/ipv4/tcp_ipv4.c                           |   1 -
 net/ipv4/udp.c                                |  15 -
 net/ipv4/udp_impl.h                           |   2 -
 net/ipv4/udplite.c                            |   1 -
 net/ipv6/af_inet6.c                           |   3 -
 net/ipv6/raw.c                                |   1 -
 net/ipv6/tcp_ipv6.c                           |   1 -
 net/kcm/kcmsock.c                             |  20 --
 net/key/af_key.c                              |   1 -
 net/l2tp/l2tp_ip.c                            |   1 -
 net/l2tp/l2tp_ip6.c                           |   1 -
 net/llc/af_llc.c                              |   1 -
 net/mctp/af_mctp.c                            |   1 -
 net/mptcp/protocol.c                          |   2 -
 net/netlink/af_netlink.c                      |   1 -
 net/netrom/af_netrom.c                        |   1 -
 net/packet/af_packet.c                        |   2 -
 net/phonet/socket.c                           |   2 -
 net/qrtr/af_qrtr.c                            |   1 -
 net/rds/af_rds.c                              |   1 -
 net/rds/tcp_send.c                            |  74 ++---
 net/rose/af_rose.c                            |   1 -
 net/rxrpc/af_rxrpc.c                          |   1 -
 net/sctp/protocol.c                           |   1 -
 net/smc/af_smc.c                              |  29 --
 net/smc/smc_stats.c                           |   2 +-
 net/smc/smc_stats.h                           |   1 -
 net/smc/smc_tx.c                              |  20 +-
 net/smc/smc_tx.h                              |   2 -
 net/socket.c                                  |  48 ----
 net/tipc/socket.c                             |   3 -
 net/tls/tls.h                                 |   6 -
 net/tls/tls_device.c                          |  24 +-
 net/tls/tls_main.c                            |   9 +-
 net/tls/tls_sw.c                              |  37 +--
 net/unix/af_unix.c                            |  19 --
 net/vmw_vsock/af_vsock.c                      |   3 -
 net/x25/af_x25.c                              |   1 -
 net/xdp/xsk.c                                 |   1 -
 net/xfrm/espintcp.c                           |  10 +-
 .../perf/trace/beauty/include/linux/socket.h  |   1 -
 tools/perf/trace/beauty/msg_flags.c           |   3 -
 89 files changed, 544 insertions(+), 1061 deletions(-)
