lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Tue, 13 Dec 2022 06:10:38 -0500 From: Benjamin Coddington <bcodding@...hat.com> To: Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, "David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com> Cc: Guillaume Nault <gnault@...hat.com>, Philipp Reisner <philipp.reisner@...bit.com>, Lars Ellenberg <lars.ellenberg@...bit.com>, Christoph Böhmwalder <christoph.boehmwalder@...bit.com>, Jens Axboe <axboe@...nel.dk>, Josef Bacik <josef@...icpanda.com>, Keith Busch <kbusch@...nel.org>, Christoph Hellwig <hch@....de>, Sagi Grimberg <sagi@...mberg.me>, Lee Duncan <lduncan@...e.com>, Chris Leech <cleech@...hat.com>, Mike Christie <michael.christie@...cle.com>, "James E.J. Bottomley" <jejb@...ux.ibm.com>, "Martin K. Petersen" <martin.petersen@...cle.com>, Valentina Manea <valentina.manea.m@...il.com>, Shuah Khan <shuah@...nel.org>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>, David Howells <dhowells@...hat.com>, Marc Dionne <marc.dionne@...istor.com>, Steve French <sfrench@...ba.org>, Christine Caulfield <ccaulfie@...hat.com>, David Teigland <teigland@...hat.com>, Mark Fasheh <mark@...heh.com>, Joel Becker <jlbec@...lplan.org>, Joseph Qi <joseph.qi@...ux.alibaba.com>, Eric Van Hensbergen <ericvh@...il.com>, Latchesar Ionkov <lucho@...kov.net>, Dominique Martinet <asmadeus@...ewreck.org>, Ilya Dryomov <idryomov@...il.com>, Xiubo Li <xiubli@...hat.com>, Chuck Lever <chuck.lever@...cle.com>, Jeff Layton <jlayton@...nel.org>, Trond Myklebust <trond.myklebust@...merspace.com>, Anna Schumaker <anna@...nel.org>, Steffen Klassert <steffen.klassert@...unet.com>, Herbert Xu <herbert@...dor.apana.org.au>, netdev@...r.kernel.org Subject: [PATCH net v3 2/3] Treewide: Stop corrupting socket's task_frag Since moving to memalloc_nofs_save/restore, SUNRPC has stopped setting the GFP_NOIO flag on sk_allocation which the networking system uses to decide when it is safe to use current->task_frag. The results of this are unexpected corruption in task_frag when SUNRPC is involved in memory reclaim. The corruption can be seen in crashes, but the root cause is often difficult to ascertain as a crashing machine's stack trace will have no evidence of being near NFS or SUNRPC code. I believe this problem to be much more pervasive than reports to the community may indicate. Fix this by having kernel users of sockets that may corrupt task_frag due to reclaim set sk_use_task_frag = false. Preemptively correcting this situation for users that still set sk_allocation allows them to convert to memalloc_nofs_save/restore without the same unexpected corruptions that are sure to follow, unlikely to show up in testing, and difficult to bisect. CC: Philipp Reisner <philipp.reisner@...bit.com> CC: Lars Ellenberg <lars.ellenberg@...bit.com> CC: "Christoph Böhmwalder" <christoph.boehmwalder@...bit.com> CC: Jens Axboe <axboe@...nel.dk> CC: Josef Bacik <josef@...icpanda.com> CC: Keith Busch <kbusch@...nel.org> CC: Christoph Hellwig <hch@....de> CC: Sagi Grimberg <sagi@...mberg.me> CC: Lee Duncan <lduncan@...e.com> CC: Chris Leech <cleech@...hat.com> CC: Mike Christie <michael.christie@...cle.com> CC: "James E.J. Bottomley" <jejb@...ux.ibm.com> CC: "Martin K. Petersen" <martin.petersen@...cle.com> CC: Valentina Manea <valentina.manea.m@...il.com> CC: Shuah Khan <shuah@...nel.org> CC: Greg Kroah-Hartman <gregkh@...uxfoundation.org> CC: David Howells <dhowells@...hat.com> CC: Marc Dionne <marc.dionne@...istor.com> CC: Steve French <sfrench@...ba.org> CC: Christine Caulfield <ccaulfie@...hat.com> CC: David Teigland <teigland@...hat.com> CC: Mark Fasheh <mark@...heh.com> CC: Joel Becker <jlbec@...lplan.org> CC: Joseph Qi <joseph.qi@...ux.alibaba.com> CC: Eric Van Hensbergen <ericvh@...il.com> CC: Latchesar Ionkov <lucho@...kov.net> CC: Dominique Martinet <asmadeus@...ewreck.org> CC: "David S. Miller" <davem@...emloft.net> CC: Eric Dumazet <edumazet@...gle.com> CC: Jakub Kicinski <kuba@...nel.org> CC: Paolo Abeni <pabeni@...hat.com> CC: Ilya Dryomov <idryomov@...il.com> CC: Xiubo Li <xiubli@...hat.com> CC: Chuck Lever <chuck.lever@...cle.com> CC: Jeff Layton <jlayton@...nel.org> CC: Trond Myklebust <trond.myklebust@...merspace.com> CC: Anna Schumaker <anna@...nel.org> CC: Steffen Klassert <steffen.klassert@...unet.com> CC: Herbert Xu <herbert@...dor.apana.org.au> CC: netdev@...r.kernel.org Suggested-by: Guillaume Nault <gnault@...hat.com> Signed-off-by: Benjamin Coddington <bcodding@...hat.com> Reviewed-by: Guillaume Nault <gnault@...hat.com> --- drivers/block/drbd/drbd_receiver.c | 3 +++ drivers/block/nbd.c | 1 + drivers/nvme/host/tcp.c | 1 + drivers/scsi/iscsi_tcp.c | 1 + drivers/usb/usbip/usbip_common.c | 1 + fs/afs/rxrpc.c | 1 + fs/cifs/connect.c | 1 + fs/dlm/lowcomms.c | 2 ++ fs/ocfs2/cluster/tcp.c | 1 + net/9p/trans_fd.c | 1 + net/ceph/messenger.c | 1 + net/sunrpc/xprtsock.c | 3 +++ net/xfrm/espintcp.c | 1 + 13 files changed, 18 insertions(+) diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c index ee69d50ba4fd..0d3f910ae347 100644 --- a/drivers/block/drbd/drbd_receiver.c +++ b/drivers/block/drbd/drbd_receiver.c @@ -1030,6 +1030,9 @@ static int conn_connect(struct drbd_connection *connection) sock.socket->sk->sk_allocation = GFP_NOIO; msock.socket->sk->sk_allocation = GFP_NOIO; + sock.socket->sk->sk_use_task_frag = false; + msock.socket->sk->sk_use_task_frag = false; + sock.socket->sk->sk_priority = TC_PRIO_INTERACTIVE_BULK; msock.socket->sk->sk_priority = TC_PRIO_INTERACTIVE; diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c index 5cffd96ef2d7..3a46b776354d 100644 --- a/drivers/block/nbd.c +++ b/drivers/block/nbd.c @@ -512,6 +512,7 @@ static int sock_xmit(struct nbd_device *nbd, int index, int send, noreclaim_flag = memalloc_noreclaim_save(); do { sock->sk->sk_allocation = GFP_NOIO | __GFP_MEMALLOC; + sock->sk->sk_use_task_frag = false; msg.msg_name = NULL; msg.msg_namelen = 0; msg.msg_control = NULL; diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 9b47dcb2a7d9..fe772d6c4c96 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -1537,6 +1537,7 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid) queue->sock->sk->sk_rcvtimeo = 10 * HZ; queue->sock->sk->sk_allocation = GFP_ATOMIC; + queue->sock->sk->sk_use_task_frag = false; nvme_tcp_set_queue_io_cpu(queue); queue->request = NULL; queue->data_remaining = 0; diff --git a/drivers/scsi/iscsi_tcp.c b/drivers/scsi/iscsi_tcp.c index 5fb1f364e815..1d1cf641937c 100644 --- a/drivers/scsi/iscsi_tcp.c +++ b/drivers/scsi/iscsi_tcp.c @@ -738,6 +738,7 @@ iscsi_sw_tcp_conn_bind(struct iscsi_cls_session *cls_session, sk->sk_reuse = SK_CAN_REUSE; sk->sk_sndtimeo = 15 * HZ; /* FIXME: make it configurable */ sk->sk_allocation = GFP_ATOMIC; + sk->sk_use_task_frag = false; sk_set_memalloc(sk); sock_no_linger(sk); diff --git a/drivers/usb/usbip/usbip_common.c b/drivers/usb/usbip/usbip_common.c index 053a2bca4c47..e15ae6ca95ea 100644 --- a/drivers/usb/usbip/usbip_common.c +++ b/drivers/usb/usbip/usbip_common.c @@ -315,6 +315,7 @@ int usbip_recv(struct socket *sock, void *buf, int size) do { sock->sk->sk_allocation = GFP_NOIO; + sock->sk->sk_use_task_frag = false; result = sock_recvmsg(sock, &msg, MSG_WAITALL); if (result <= 0) diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c index eccc3cd0cb70..ac75ad18db83 100644 --- a/fs/afs/rxrpc.c +++ b/fs/afs/rxrpc.c @@ -46,6 +46,7 @@ int afs_open_socket(struct afs_net *net) goto error_1; socket->sk->sk_allocation = GFP_NOFS; + socket->sk->sk_use_task_frag = false; /* bind the callback manager's address to make this a server socket */ memset(&srx, 0, sizeof(srx)); diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index 9db9527c61cf..d84f1660cacb 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -2944,6 +2944,7 @@ generic_ip_connect(struct TCP_Server_Info *server) cifs_dbg(FYI, "Socket created\n"); server->ssocket = socket; socket->sk->sk_allocation = GFP_NOFS; + socket->sk->sk_use_task_frag = false; if (sfamily == AF_INET6) cifs_reclassify_socket6(socket); else diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c index 59f64c596233..120be782edbc 100644 --- a/fs/dlm/lowcomms.c +++ b/fs/dlm/lowcomms.c @@ -699,6 +699,7 @@ static void add_listen_sock(struct socket *sock, struct listen_connection *con) sk->sk_user_data = con; sk->sk_allocation = GFP_NOFS; + sk->sk_use_task_frag = false; /* Install a data_ready callback */ sk->sk_data_ready = lowcomms_listen_data_ready; release_sock(sk); @@ -718,6 +719,7 @@ static void add_sock(struct socket *sock, struct connection *con) sk->sk_write_space = lowcomms_write_space; sk->sk_state_change = lowcomms_state_change; sk->sk_allocation = GFP_NOFS; + sk->sk_use_task_frag = false; sk->sk_error_report = lowcomms_error_report; release_sock(sk); } diff --git a/fs/ocfs2/cluster/tcp.c b/fs/ocfs2/cluster/tcp.c index f660c0dbdb63..3eaafa5e5ec4 100644 --- a/fs/ocfs2/cluster/tcp.c +++ b/fs/ocfs2/cluster/tcp.c @@ -1604,6 +1604,7 @@ static void o2net_start_connect(struct work_struct *work) sc->sc_sock = sock; /* freed by sc_kref_release */ sock->sk->sk_allocation = GFP_ATOMIC; + sock->sk->sk_use_task_frag = false; myaddr.sin_family = AF_INET; myaddr.sin_addr.s_addr = mynode->nd_ipv4_address; diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index 07db2f436d44..d9120f14684b 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -868,6 +868,7 @@ static int p9_socket_open(struct p9_client *client, struct socket *csocket) } csocket->sk->sk_allocation = GFP_NOIO; + csocket->sk->sk_use_task_frag = false; file = sock_alloc_file(csocket, 0, NULL); if (IS_ERR(file)) { pr_err("%s (%d): failed to map fd\n", diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c index dfa237fbd5a3..1d06e114ba3f 100644 --- a/net/ceph/messenger.c +++ b/net/ceph/messenger.c @@ -446,6 +446,7 @@ int ceph_tcp_connect(struct ceph_connection *con) if (ret) return ret; sock->sk->sk_allocation = GFP_NOFS; + sock->sk->sk_use_task_frag = false; #ifdef CONFIG_LOCKDEP lockdep_set_class(&sock->sk->sk_lock, &socket_class); diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index 915b9902f673..41ffc2169743 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -1882,6 +1882,7 @@ static int xs_local_finish_connecting(struct rpc_xprt *xprt, sk->sk_write_space = xs_udp_write_space; sk->sk_state_change = xs_local_state_change; sk->sk_error_report = xs_error_report; + sk->sk_use_task_frag = false; xprt_clear_connected(xprt); @@ -2082,6 +2083,7 @@ static void xs_udp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock) sk->sk_user_data = xprt; sk->sk_data_ready = xs_data_ready; sk->sk_write_space = xs_udp_write_space; + sk->sk_use_task_frag = false; xprt_set_connected(xprt); @@ -2249,6 +2251,7 @@ static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock) sk->sk_state_change = xs_tcp_state_change; sk->sk_write_space = xs_tcp_write_space; sk->sk_error_report = xs_error_report; + sk->sk_use_task_frag = false; /* socket options */ sock_reset_flag(sk, SOCK_LINGER); diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c index 29a540dcb5a7..4ca2c5927ace 100644 --- a/net/xfrm/espintcp.c +++ b/net/xfrm/espintcp.c @@ -489,6 +489,7 @@ static int espintcp_init_sk(struct sock *sk) /* avoid using task_frag */ sk->sk_allocation = GFP_ATOMIC; + sk->sk_use_task_frag = false; return 0; -- 2.31.1
Powered by blists - more mailing lists