Message-ID: <d98ca4470c447182020b576841115a20@3xo.fr>
Date: Tue, 07 Jan 2025 09:26:28 +0100
From: nicolas.baranger@....fr
To: David Howells <dhowells@...hat.com>
Cc: Steve French <smfrench@...il.com>, Christoph Hellwig
<hch@...radead.org>, Jeff Layton <jlayton@...nel.org>, Christian Brauner
<brauner@...nel.org>, netfs@...ts.linux.dev, linux-cifs@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] netfs: Fix kernel async DIO
Hi David
As your patch was written on top of linux-next, I had to make some small
modifications to make it apply to mainline (6.13-rc6).
The following patch works fine for me on mainline, but I think it would be
better to wait for your confirmation / validation (or a new patch) before
applying it in production.
#-------- PATCH --------#
diff --git a/linux-6.13-rc6/nba/_orig_fs.netfs.direct_write.c b/linux-6.13-rc6/fs/netfs/direct_write.c
index 88f2adf..94a1ee8 100644
--- a/linux-6.13-rc6/nba/_orig_fs.netfs.direct_write.c
+++ b/linux-6.13-rc6/fs/netfs/direct_write.c
@@ -67,7 +67,7 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
 		 * allocate a sufficiently large bvec array and may shorten the
 		 * request.
 		 */
-		if (async || user_backed_iter(iter)) {
+		if (user_backed_iter(iter)) {
 			n = netfs_extract_user_iter(iter, len, &wreq->iter, 0);
 			if (n < 0) {
 				ret = n;
@@ -77,6 +77,11 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
 			wreq->direct_bv_count = n;
 			wreq->direct_bv_unpin = iov_iter_extract_will_pin(iter);
 		} else {
+			/* If this is a kernel-generated async DIO request,
+			 * assume that any resources the iterator points to
+			 * (eg. a bio_vec array) will persist till the end of
+			 * the op.
+			 */
 			wreq->iter = *iter;
 		}
#-------- TESTS --------#
With this patch, Linux 6.13-rc6 builds without errors and '--direct-io=on'
works:
18:38:47 root@...12-lab-10d:~# uname -a
Linux deb12-lab-10d.lab.lan 6.13.0-rc6-amd64 #0 SMP PREEMPT_DYNAMIC Mon Jan 6 18:14:07 CET 2025 x86_64 GNU/Linux
18:39:29 root@...12-lab-10d:~# losetup
NAME          SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE                       DIO LOG-SEC
/dev/loop2046         0      0         0  0 /mnt/FBX24T/FS-LAN/bckcrypt2046   1    4096
18:39:32 root@...12-lab-10d:~# dmsetup ls | grep bckcrypt
bckcrypt (254:7)
18:39:55 root@...12-lab-10d:~# cryptsetup status bckcrypt
/dev/mapper/bckcrypt is active and is in use.
type: LUKS2
cipher: aes-xts-plain64
keysize: 512 bits
key location: keyring
device: /dev/loop2046
loop: /mnt/FBX24T/FS-LAN/bckcrypt2046
sector size: 512
offset: 32768 sectors
size: 8589901824 sectors
mode: read/write
18:40:36 root@...12-lab-10d:~# df -h | egrep 'cifs|bckcrypt'
//10.0.10.100/FBX24T cifs 22T 13T 9,0T 60% /mnt/FBX24T
/dev/mapper/bckcrypt btrfs 4,0T 3,3T 779G 82% /mnt/bckcrypt
09:08:44 root@...12-lab-10d:~# LANG=en_US.UTF-8
09:08:46 root@...12-lab-10d:~# dd if=/dev/zero of=/mnt/bckcrypt/test/test.dd bs=256M count=16 oflag=direct status=progress
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 14 s, 302 MB/s
16+0 records in
16+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 14.2061 s, 302 MB/s
No write errors when using the '--direct-io=on' option of losetup with this
patch, and writing to the back-file is more than 20x faster.
It seems to be OK!
Let me know if something's wrong in this patch or if it can safely be
used in production.
Thanks again to everyone for the help.
Nicolas
On 2025-01-06 13:07, nicolas.baranger@....fr wrote:
> Hi David
>
> Thanks for the work!
> I will build Linux 6.10 and mainline with the provided change and I will
> report back here as soon as I have test results (CET working hours).
>
> Thanks again for the help with this issue
> Nicolas
>
> On 2025-01-06 12:37, David Howells wrote:
>
>> Hi Nicolas,
>>
>> Does the attached fix your problem?
>>
>> David
>> ---
>> netfs: Fix kernel async DIO
>>
>> Netfslib needs to be able to handle kernel-initiated asynchronous DIO that
>> is supplied with a bio_vec[] array. Currently, because of the async flag,
>> this gets passed to netfs_extract_user_iter() which throws a warning and
>> fails because it only handles IOVEC and UBUF iterators. This can be
>> triggered through a combination of cifs and a loopback blockdev with
>> something like:
>>
>> mount //my/cifs/share /foo
>> dd if=/dev/zero of=/foo/m0 bs=4K count=1K
>> losetup --sector-size 4096 --direct-io=on /dev/loop2046 /foo/m0
>> echo hello >/dev/loop2046
>>
>> This causes the following to appear in syslog:
>>
>> WARNING: CPU: 2 PID: 109 at fs/netfs/iterator.c:50 netfs_extract_user_iter+0x170/0x250 [netfs]
>>
>> and the write to fail.
>>
>> Fix this by removing the check in netfs_unbuffered_write_iter_locked() that
>> causes async kernel DIO writes to be handled as userspace writes. Note
>> that this change relies on the kernel caller maintaining the existence of
>> the bio_vec array (or kvec[] or folio_queue) until the op is complete.
>>
>> Fixes: 153a9961b551 ("netfs: Implement unbuffered/DIO write support")
>> Reported-by: Nicolas Baranger <nicolas.baranger@....fr>
>> Closes: https://lore.kernel.org/r/fedd8a40d54b2969097ffa4507979858@3xo.fr/
>> Signed-off-by: David Howells <dhowells@...hat.com>
>> cc: Steve French <smfrench@...il.com>
>> cc: Jeff Layton <jlayton@...nel.org>
>> cc: netfs@...ts.linux.dev
>> cc: linux-cifs@...r.kernel.org
>> cc: linux-fsdevel@...r.kernel.org
>> ---
>> fs/netfs/direct_write.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c
>> index eded8afaa60b..42ce53cc216e 100644
>> --- a/fs/netfs/direct_write.c
>> +++ b/fs/netfs/direct_write.c
>> @@ -67,7 +67,7 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
>> * allocate a sufficiently large bvec array and may shorten the
>> * request.
>> */
>> - if (async || user_backed_iter(iter)) {
>> + if (user_backed_iter(iter)) {
>> n = netfs_extract_user_iter(iter, len, &wreq->buffer.iter, 0);
>> if (n < 0) {
>> ret = n;
>> @@ -77,6 +77,11 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter *
>> wreq->direct_bv_count = n;
>> wreq->direct_bv_unpin = iov_iter_extract_will_pin(iter);
>> } else {
>> + /* If this is a kernel-generated async DIO request,
>> + * assume that any resources the iterator points to
>> + * (eg. a bio_vec array) will persist till the end of
>> + * the op.
>> + */
>> wreq->buffer.iter = *iter;
>> }
>> }