Message-ID: <d220402a232e204676d9100d6fe4c2ae08f753ee.camel@redhat.com>
Date: Fri, 09 Dec 2022 13:37:08 +0100
From: Paolo Abeni <pabeni@...hat.com>
To: Benjamin Coddington <bcodding@...hat.com>, netdev@...r.kernel.org
Cc: linux-kernel@...r.kernel.org,
Philipp Reisner <philipp.reisner@...bit.com>,
Lars Ellenberg <lars.ellenberg@...bit.com>,
Christoph Böhmwalder
<christoph.boehmwalder@...bit.com>, Jens Axboe <axboe@...nel.dk>,
Josef Bacik <josef@...icpanda.com>,
Keith Busch <kbusch@...nel.org>,
Christoph Hellwig <hch@....de>,
Sagi Grimberg <sagi@...mberg.me>,
Lee Duncan <lduncan@...e.com>, Chris Leech <cleech@...hat.com>,
Mike Christie <michael.christie@...cle.com>,
"James E.J. Bottomley" <jejb@...ux.ibm.com>,
"Martin K. Petersen" <martin.petersen@...cle.com>,
Valentina Manea <valentina.manea.m@...il.com>,
Shuah Khan <shuah@...nel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
David Howells <dhowells@...hat.com>,
Marc Dionne <marc.dionne@...istor.com>,
Steve French <sfrench@...ba.org>,
Christine Caulfield <ccaulfie@...hat.com>,
David Teigland <teigland@...hat.com>,
Mark Fasheh <mark@...heh.com>,
Joel Becker <jlbec@...lplan.org>,
Joseph Qi <joseph.qi@...ux.alibaba.com>,
Eric Van Hensbergen <ericvh@...il.com>,
Latchesar Ionkov <lucho@...kov.net>,
Dominique Martinet <asmadeus@...ewreck.org>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Ilya Dryomov <idryomov@...il.com>,
Xiubo Li <xiubli@...hat.com>,
Trond Myklebust <trond.myklebust@...merspace.com>,
Anna Schumaker <anna@...nel.org>,
Chuck Lever <chuck.lever@...cle.com>,
Jeff Layton <jlayton@...nel.org>, drbd-dev@...ts.linbit.com,
linux-block@...r.kernel.org, nbd@...er.debian.org,
linux-nvme@...ts.infradead.org, open-iscsi@...glegroups.com,
linux-scsi@...r.kernel.org, linux-usb@...r.kernel.org,
linux-afs@...ts.infradead.org, linux-cifs@...r.kernel.org,
samba-technical@...ts.samba.org, cluster-devel@...hat.com,
ocfs2-devel@....oracle.com, v9fs-developer@...ts.sourceforge.net,
ceph-devel@...r.kernel.org, linux-nfs@...r.kernel.org
Subject: Re: [PATCH v1 2/3] Treewide: Stop corrupting socket's task_frag
On Mon, 2022-11-21 at 08:35 -0500, Benjamin Coddington wrote:
> Since moving to memalloc_nofs_save/restore, SUNRPC has stopped setting the
> GFP_NOIO flag on sk_allocation which the networking system uses to decide
> when it is safe to use current->task_frag. The result is unexpected
> corruption of task_frag when SUNRPC is involved in memory reclaim.
>
> The corruption can be seen in crashes, but the root cause is often
> difficult to ascertain as a crashing machine's stack trace will have no
> evidence of being near NFS or SUNRPC code. I believe this problem to
> be much more pervasive than reports to the community may indicate.
>
> Fix this by having kernel users of sockets that may corrupt task_frag due
> to reclaim set sk_use_task_frag = false. Preemptively correcting this
> situation for users that still set sk_allocation allows them to convert to
> memalloc_nofs_save/restore without the same unexpected corruptions that are
> sure to follow, unlikely to show up in testing, and difficult to bisect.
>
> CC: Philipp Reisner <philipp.reisner@...bit.com>
> CC: Lars Ellenberg <lars.ellenberg@...bit.com>
> CC: "Christoph Böhmwalder" <christoph.boehmwalder@...bit.com>
> CC: Jens Axboe <axboe@...nel.dk>
> CC: Josef Bacik <josef@...icpanda.com>
> CC: Keith Busch <kbusch@...nel.org>
> CC: Christoph Hellwig <hch@....de>
> CC: Sagi Grimberg <sagi@...mberg.me>
> CC: Lee Duncan <lduncan@...e.com>
> CC: Chris Leech <cleech@...hat.com>
> CC: Mike Christie <michael.christie@...cle.com>
> CC: "James E.J. Bottomley" <jejb@...ux.ibm.com>
> CC: "Martin K. Petersen" <martin.petersen@...cle.com>
> CC: Valentina Manea <valentina.manea.m@...il.com>
> CC: Shuah Khan <shuah@...nel.org>
> CC: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> CC: David Howells <dhowells@...hat.com>
> CC: Marc Dionne <marc.dionne@...istor.com>
> CC: Steve French <sfrench@...ba.org>
> CC: Christine Caulfield <ccaulfie@...hat.com>
> CC: David Teigland <teigland@...hat.com>
> CC: Mark Fasheh <mark@...heh.com>
> CC: Joel Becker <jlbec@...lplan.org>
> CC: Joseph Qi <joseph.qi@...ux.alibaba.com>
> CC: Eric Van Hensbergen <ericvh@...il.com>
> CC: Latchesar Ionkov <lucho@...kov.net>
> CC: Dominique Martinet <asmadeus@...ewreck.org>
> CC: "David S. Miller" <davem@...emloft.net>
> CC: Eric Dumazet <edumazet@...gle.com>
> CC: Jakub Kicinski <kuba@...nel.org>
> CC: Paolo Abeni <pabeni@...hat.com>
> CC: Ilya Dryomov <idryomov@...il.com>
> CC: Xiubo Li <xiubli@...hat.com>
> CC: Chuck Lever <chuck.lever@...cle.com>
> CC: Jeff Layton <jlayton@...nel.org>
> CC: Trond Myklebust <trond.myklebust@...merspace.com>
> CC: Anna Schumaker <anna@...nel.org>
> CC: drbd-dev@...ts.linbit.com
> CC: linux-block@...r.kernel.org
> CC: linux-kernel@...r.kernel.org
> CC: nbd@...er.debian.org
> CC: linux-nvme@...ts.infradead.org
> CC: open-iscsi@...glegroups.com
> CC: linux-scsi@...r.kernel.org
> CC: linux-usb@...r.kernel.org
> CC: linux-afs@...ts.infradead.org
> CC: linux-cifs@...r.kernel.org
> CC: samba-technical@...ts.samba.org
> CC: cluster-devel@...hat.com
> CC: ocfs2-devel@....oracle.com
> CC: v9fs-developer@...ts.sourceforge.net
> CC: netdev@...r.kernel.org
> CC: ceph-devel@...r.kernel.org
> CC: linux-nfs@...r.kernel.org
>
> Suggested-by: Guillaume Nault <gnault@...hat.com>
> Signed-off-by: Benjamin Coddington <bcodding@...hat.com>
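For readers skimming the thread, the fix described in the quoted text can be
sketched as a small userspace model. This is not kernel code: the *_model
types, pick_frag(), kernel_user_socket_init() and self_check() are
hypothetical names invented for illustration, and the selection logic is an
assumption based on the description above (the networking code consults
sk_use_task_frag to choose between the per-task fragment and a per-socket
one, called sk_frag here).

```c
/* Userspace model only: simplified stand-ins, not the kernel definitions. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct page_frag_model {        /* stand-in for a page fragment cache */
	size_t offset;
};

struct task_model {             /* stand-in for the current task */
	struct page_frag_model task_frag;
};

struct sock_model {             /* stand-in for a socket */
	bool sk_use_task_frag;  /* flag named in the patch description */
	struct page_frag_model sk_frag; /* assumed per-socket fragment */
};

/* Assumed selection rule: use the task's fragment only when the socket
 * allows it; otherwise fall back to the socket's own fragment. */
static struct page_frag_model *
pick_frag(struct sock_model *sk, struct task_model *cur)
{
	return sk->sk_use_task_frag ? &cur->task_frag : &sk->sk_frag;
}

/* What a kernel socket user that may run during memory reclaim would do
 * under this patch: opt out of the per-task fragment. */
static void kernel_user_socket_init(struct sock_model *sk)
{
	sk->sk_use_task_frag = false;
}

/* Small self-check of the model's behaviour. */
static void self_check(void)
{
	struct task_model cur = { { 0 } };
	struct sock_model sk = { .sk_use_task_frag = true };

	/* Default case: the task fragment is shared with the socket. */
	assert(pick_frag(&sk, &cur) == &cur.task_frag);

	/* After opting out, the task fragment is never touched. */
	kernel_user_socket_init(&sk);
	assert(pick_frag(&sk, &cur) == &sk.sk_frag);
}
```

The point of the model is that the choice is driven by an explicit
per-socket flag rather than inferred from sk_allocation, so converting a
user to memalloc_nofs_save/restore would not silently change which fragment
is used.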
I think this is the most feasible way out of the existing issue, and I
think this patchset should go via the networking tree, targeting
Linux 6.2.
If anyone disagrees with the above, please speak up!
Thanks,
Paolo