[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <3431541.1740763345@warthog.procyon.org.uk>
Date: Fri, 28 Feb 2025 17:22:25 +0000
From: David Howells <dhowells@...hat.com>
To: Steve French <stfrench@...rosoft.com>
cc: dhowells@...hat.com, Paulo Alcantara <pc@...guebit.com>,
Christian Brauner <brauner@...nel.org>,
Jeff Layton <jlayton@...nel.org>, linux-cifs@...r.kernel.org,
netfs@...ts.linux.dev, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: [PATCH] netfs: Fix collection of results during pause when collection offloaded
A netfs read request can run in one of two modes: for synchronous reads
writes, the app thread does the collection of results and for asynchronous
reads, this is offloaded to a worker thread. This is controlled by the
NETFS_RREQ_OFFLOAD_COLLECTION flag.
Now, if a subrequest incurs an error, the NETFS_RREQ_PAUSE flag is set to
stop the issuing loop temporarily from issuing more subrequests until a
retry is successful or the request is abandoned.
When the issuing loop sees NETFS_RREQ_PAUSE, it jumps to
netfs_wait_for_pause() which will wait for the PAUSE flag to be cleared -
and whilst it is waiting, it will call out to the collector as more results
acrue... But this is the wrong thing to do if OFFLOAD_COLLECTION is set as
we can then end up with both the app thread and the work item collecting
results simultaneously.
This manifests itself occasionally when running the generic/323 xfstest
against multichannel cifs as an oops that's a bit random but frequently
involving io_submit() (the test does lots of simultaneous async DIO reads).
Fix this by only doing the collection in netfs_wait_for_pause() if the
NETFS_RREQ_OFFLOAD_COLLECTION is not set.
Fixes: e2d46f2ec332 ("netfs: Change the read result collector to only use one work item")
Reported-by: Steve French <stfrench@...rosoft.com>
Signed-off-by: David Howells <dhowells@...hat.com>
cc: Paulo Alcantara <pc@...guebit.com>
cc: Jeff Layton <jlayton@...nel.org>
cc: linux-cifs@...r.kernel.org
cc: netfs@...ts.linux.dev
cc: linux-fsdevel@...r.kernel.org
---
fs/netfs/read_collect.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c
index 636cc5a98ef5..23c75755ad4e 100644
--- a/fs/netfs/read_collect.c
+++ b/fs/netfs/read_collect.c
@@ -682,14 +682,16 @@ void netfs_wait_for_pause(struct netfs_io_request *rreq)
trace_netfs_rreq(rreq, netfs_rreq_trace_wait_queue);
prepare_to_wait(&rreq->waitq, &myself, TASK_UNINTERRUPTIBLE);
- subreq = list_first_entry_or_null(&stream->subrequests,
- struct netfs_io_subrequest, rreq_link);
- if (subreq &&
- (!test_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags) ||
- test_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags))) {
- __set_current_state(TASK_RUNNING);
- netfs_read_collection(rreq);
- continue;
+ if (!test_bit(NETFS_RREQ_OFFLOAD_COLLECTION, &rreq->flags)) {
+ subreq = list_first_entry_or_null(&stream->subrequests,
+ struct netfs_io_subrequest, rreq_link);
+ if (subreq &&
+ (!test_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags) ||
+ test_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags))) {
+ __set_current_state(TASK_RUNNING);
+ netfs_read_collection(rreq);
+ continue;
+ }
}
if (!test_bit(NETFS_RREQ_IN_PROGRESS, &rreq->flags) ||
Powered by blists - more mailing lists