Message-ID: <CAH2r5msJPp9PAvnjOVBOBjjZ7skWMNgH7j2s34uR3oyLxBOVug@mail.gmail.com>
Date: Mon, 16 Jun 2025 11:42:33 -0500
From: Steve French <smfrench@...il.com>
To: David Howells <dhowells@...hat.com>
Cc: Paulo Alcantara <pc@...guebit.org>, linux-cifs@...r.kernel.org, netfs@...ts.linux.dev,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] netfs: Fix hang due to missing case in final DIO read
result collection
running tests with this now
On Mon, Jun 16, 2025 at 6:36 AM David Howells <dhowells@...hat.com> wrote:
>
> When doing a DIO read, if the subrequests we issue fail and cause the
> request PAUSE flag to be set to put a pause on subrequest generation, we
> may complete collection of the subrequests (possibly discarding them) prior
> to the ALL_QUEUED flag being set.
>
> In such a case, netfs_read_collection() doesn't see ALL_QUEUED being set
> after netfs_collect_read_results() returns and will just return to the app
> (the collector can be seen unpausing the generator in the trace log).
>
> The subrequest generator can then set ALL_QUEUED and the app thread reaches
> netfs_wait_for_request(). This causes netfs_collect_in_app() to be called
> to see if we're done yet, but there's a missing case here.
>
> netfs_collect_in_app() will see that a thread is active and set inactive to
> false, but won't see any subrequests in the read stream, and so won't set
> need_collect to true. The function will then just return 0, indicating
> that the caller should just sleep until further activity (which won't be
> forthcoming) occurs.
>
> Fix this by making netfs_collect_in_app() check to see if an active thread
> is complete - i.e. that ALL_QUEUED is set and the subrequests list is empty
> - and to skip the sleep return path. The collector will then be called
> which will clear the request IN_PROGRESS flag, allowing the app to
> progress.
>
> Fixes: 2b1424cd131c ("netfs: Fix wait/wake to be consistent about the waitqueue used")
> Reported-by: Steve French <sfrench@...ba.org>
> Signed-off-by: David Howells <dhowells@...hat.com>
> cc: Paulo Alcantara <pc@...guebit.org>
> cc: linux-cifs@...r.kernel.org
> cc: netfs@...ts.linux.dev
> cc: linux-fsdevel@...r.kernel.org
> ---
> diff --git a/fs/netfs/misc.c b/fs/netfs/misc.c
> index 43b67a28a8fa..1966dfba285e 100644
> --- a/fs/netfs/misc.c
> +++ b/fs/netfs/misc.c
> @@ -381,7 +381,7 @@ void netfs_wait_for_in_progress_stream(struct netfs_io_request *rreq,
>  static int netfs_collect_in_app(struct netfs_io_request *rreq,
>  				bool (*collector)(struct netfs_io_request *rreq))
>  {
> -	bool need_collect = false, inactive = true;
> +	bool need_collect = false, inactive = true, done = true;
>
>  	for (int i = 0; i < NR_IO_STREAMS; i++) {
>  		struct netfs_io_subrequest *subreq;
> @@ -400,9 +400,11 @@ static int netfs_collect_in_app(struct netfs_io_request *rreq,
>  			need_collect = true;
>  			break;
>  		}
> +		if (subreq || !test_bit(NETFS_RREQ_ALL_QUEUED, &rreq->flags))
> +			done = false;
>  	}
>
> -	if (!need_collect && !inactive)
> +	if (!need_collect && !inactive && !done)
>  		return 0; /* Sleep */
>
>  	__set_current_state(TASK_RUNNING);
>
>
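
For anyone following along, the sleep-or-collect decision described in the
commit message can be modelled in userspace roughly as below. This is only a
sketch, not the kernel code: the model_* types and names are hypothetical
stand-ins for the netfs structures, and the need_collect test is an assumed
approximation of the part of the loop that the diff context does not show.
"done" follows the commit message: ALL_QUEUED is set and no active stream
has any subrequests left.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_STREAMS 2	/* stand-in for NR_IO_STREAMS (assumed value) */

/* Hypothetical, simplified view of one I/O stream. */
struct model_stream {
	bool active;		/* stream is in use by this request */
	bool has_subreq;	/* subrequest list is non-empty */
	bool head_in_progress;	/* first subrequest is still running */
	bool head_made_progress;/* first subrequest has partial results */
};

/* Hypothetical, simplified view of the request. */
struct model_request {
	bool all_queued;	/* models NETFS_RREQ_ALL_QUEUED */
	struct model_stream streams[MODEL_NR_STREAMS];
};

enum model_action { MODEL_SLEEP, MODEL_COLLECT };

/* Decide whether the waiting app thread should sleep or run the collector. */
static enum model_action model_collect_in_app(const struct model_request *rreq)
{
	bool need_collect = false, inactive = true, done = true;

	for (size_t i = 0; i < MODEL_NR_STREAMS; i++) {
		const struct model_stream *s = &rreq->streams[i];

		if (!s->active)
			continue;
		inactive = false;

		/* Assumed condition: head subrequest finished or progressed. */
		if (s->has_subreq &&
		    (!s->head_in_progress || s->head_made_progress)) {
			need_collect = true;
			break;
		}

		/* Not done if work remains or more may still be queued. */
		if (s->has_subreq || !rreq->all_queued)
			done = false;
	}

	if (!need_collect && !inactive && !done)
		return MODEL_SLEEP;

	return MODEL_COLLECT;
}
```

In this model the hang case from the report (an active stream, an empty
subrequest list, ALL_QUEUED set) returns MODEL_COLLECT rather than
MODEL_SLEEP, which is the behaviour the commit message asks for.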
--
Thanks,
Steve