[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250519090707.2848510-3-dhowells@redhat.com>
Date: Mon, 19 May 2025 10:07:02 +0100
From: David Howells <dhowells@...hat.com>
To: Christian Brauner <christian@...uner.io>
Cc: David Howells <dhowells@...hat.com>,
Paulo Alcantara <pc@...guebit.com>,
netfs@...ts.linux.dev,
linux-afs@...ts.infradead.org,
linux-cifs@...r.kernel.org,
linux-nfs@...r.kernel.org,
ceph-devel@...r.kernel.org,
v9fs@...ts.linux.dev,
linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org,
Nicolas Baranger <nicolas.baranger@....fr>
Subject: [PATCH 2/4] netfs: Fix setting of transferred bytes with short DIO reads
From: Paulo Alcantara <pc@...guebit.com>
A netfslib request comprises an ordered stream of subrequests that, when
doing an unbuffered/DIO read, are contiguous. The subrequests may be
performed in parallel, but may not be fully completed.
For instance, if we try and make a 256KiB DIO read from a 3-byte file with
a 64KiB rsize and 256KiB bsize, netfslib will attempt to make a read of
256KiB, broken up into four 64KiB subreads, with the expectation that the
first will be short and the subsequent three be completely devoid - but we
do all four on the basis that the file may have been changed by a third
party.
The read-collection code, however, walks through all the subreqs and
advances the notion of how much data has been read in the stream to the
start of each subreq plus its amount transferred (which are 3, 0, 0, 0 for
the example above) - which gives an amount apparently read of 3*64KiB -
which is incorrect.
Fix the collection code to cut short the calculation of the transferred
amount with the first short subrequest in an unbuffered read; everything
beyond that must be ignored as there's a hole that cannot be filled. This
applies both to shortness due to hitting the EOF and shortness due to an
error.
This is achieved by setting a flag on the request when we collect the first
short subrequest (collection is done in ascending order).
This can be tested by mounting a cifs volume with rsize=65536,bsize=262144
and doing a 256k DIO read of a very small file (e.g. 3 bytes). read()
should return 3, not >3.
This problem came in when netfs_read_collection() set rreq->transferred to
stream->transferred, even for DIO. Prior to that, netfs_rreq_assess_dio()
just went over the list and added up the subreqs till it met a short one -
but now the subreqs are discarded earlier.
Fixes: e2d46f2ec332 ("netfs: Change the read result collector to only use one work item")
Reported-by: Nicolas Baranger <nicolas.baranger@....fr>
Closes: https://lore.kernel.org/all/10bec2430ed4df68bde10ed95295d093@3xo.fr/
Signed-off-by: Paulo Alcantara (Red Hat) <pc@...guebit.com>
Signed-off-by: David Howells <dhowells@...hat.com>
cc: netfs@...ts.linux.dev
cc: linux-fsdevel@...r.kernel.org
---
fs/netfs/read_collect.c | 21 +++++----------------
include/linux/netfs.h | 1 +
2 files changed, 6 insertions(+), 16 deletions(-)
diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c
index 23c75755ad4e..d3cf27b2697c 100644
--- a/fs/netfs/read_collect.c
+++ b/fs/netfs/read_collect.c
@@ -280,9 +280,13 @@ static void netfs_collect_read_results(struct netfs_io_request *rreq)
stream->need_retry = true;
notes |= NEED_RETRY | MADE_PROGRESS;
break;
+ } else if (test_bit(NETFS_RREQ_SHORT_TRANSFER, &rreq->flags)) {
+ notes |= MADE_PROGRESS;
} else {
if (!stream->failed)
- stream->transferred = stream->collected_to - rreq->start;
+ stream->transferred += transferred;
+ if (front->transferred < front->len)
+ set_bit(NETFS_RREQ_SHORT_TRANSFER, &rreq->flags);
notes |= MADE_PROGRESS;
}
@@ -342,23 +346,8 @@ static void netfs_collect_read_results(struct netfs_io_request *rreq)
*/
static void netfs_rreq_assess_dio(struct netfs_io_request *rreq)
{
- struct netfs_io_subrequest *subreq;
- struct netfs_io_stream *stream = &rreq->io_streams[0];
unsigned int i;
- /* Collect unbuffered reads and direct reads, adding up the transfer
- * sizes until we find the first short or failed subrequest.
- */
- list_for_each_entry(subreq, &stream->subrequests, rreq_link) {
- rreq->transferred += subreq->transferred;
-
- if (subreq->transferred < subreq->len ||
- test_bit(NETFS_SREQ_FAILED, &subreq->flags)) {
- rreq->error = subreq->error;
- break;
- }
- }
-
if (rreq->origin == NETFS_DIO_READ) {
for (i = 0; i < rreq->direct_bv_count; i++) {
flush_dcache_page(rreq->direct_bv[i].bv_page);
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index c86a11cfc4a3..497c4f4698f6 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -279,6 +279,7 @@ struct netfs_io_request {
#define NETFS_RREQ_USE_IO_ITER 12 /* Use ->io_iter rather than ->i_pages */
#define NETFS_RREQ_ALL_QUEUED 13 /* All subreqs are now queued */
#define NETFS_RREQ_RETRYING 14 /* Set if we're in the retry path */
+#define NETFS_RREQ_SHORT_TRANSFER 15 /* Set if we have a short transfer */
#define NETFS_RREQ_USE_PGPRIV2 31 /* [DEPRECATED] Use PG_private_2 to mark
* write to cache on read */
const struct netfs_request_ops *netfs_ops;
Powered by blists - more mailing lists