lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251009110623.3115511-1-giveme.gulu@gmail.com>
Date: Thu,  9 Oct 2025 19:06:23 +0800
From: "guangming.zhao" <giveme.gulu@...il.com>
To: miklos@...redi.hu
Cc: linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: [PATCH 5.15] fuse: Fix race condition in writethrough path A race

The race occurs as follows:
1. A write operation locks a page, fills it with new data, marks it
   Uptodate, and then immediately unlocks it within fuse_fill_write_pages().
2. This opens a window before the new data is sent to the userspace daemon.
3. A concurrent read operation for the same page may decide to re-validate
   its cache from the daemon. The fuse_wait_on_page_writeback()
   mechanism does not protect this synchronous writethrough path.
4. The read request can be processed by the multi-threaded daemon *before*
   the write request, causing it to reply with stale data from its backend.
5. The read syscall returns this stale data to userspace, causing data
   verification to fail.

This can be reliably reproduced on a mainline kernel (e.g., 6.1.x)
using iogen and a standard multi-threaded libfuse passthrough filesystem.

Steps to Reproduce:
1. Mount a libfuse passthrough filesystem (must be multi-threaded):
   $ ./passthrough /path/to/mount_point

2. Run the iogen/doio test from LTP (Linux Test Project) with mixed
   read/write operations (example):
   $ /path/to/ltp/iogen -N iogen01 -i 120s -s read,write 500k:/path/to/mount_point/file1 | \
     /path/to/ltp/doio -N iogen01 -a -v -n 2 -k

3. A data comparison error similar to the following will be reported:
   *** DATA COMPARISON ERROR ***
   check_file(/path/to/mount_point/file1, ...) failed
   expected bytes:  X:3091346:gm-arco:doio*X:3091346
   actual bytes:    91346:gm-arco:doio*C:3091346:gm-

The fix is to delay unlocking the page until after the data has been
successfully sent to the daemon. This is achieved by moving the unlock
logic from fuse_fill_write_pages() to the completion path of
fuse_send_write_pages(), ensuring the page lock is held for the entire
critical section and serializing the operations correctly.

[Note for maintainers]
This patch is created and tested against the 5.15 kernel. I have observed
that recent kernels have migrated to using folios, and I am not confident
in porting this fix to the new folio-based code myself.

I am submitting this patch to clearly document the race condition and a
proven fix on an older kernel, in the hope that a developer more
familiar with the folio conversion can adapt it for the mainline tree.

Signed-off-by: guangming.zhao <giveme.gulu@...il.com>
---
[root@...arco example]# uname -a
Linux gm-arco 6.16.8-arch3-1 #1 SMP PREEMPT_DYNAMIC Mon, 22 Sep 2025 22:08:35 +0000 x86_64 GNU/Linux
[root@...arco example]# ./passthrough /tmp/test/
[root@...arco example]# mkdir /tmp/test/yy
[root@...arco example]# /home/gm/code/ltp/testcases/kernel/fs/doio/iogen -N iogen01 -i 120s -s read,write 500b:/tmp/test/yy/kk1 1000b:/tmp/test/yy/kk2 | /home/gm/code/ltp/testcases/kernel/fs/doio/doio -N iogen01 -a -v -n 2 -k

iogen(iogen01) starting up with the following:

Out-pipe:              stdout
Iterations:            120 seconds
Seed:                  3091343
Offset-Mode:           sequential
Overlap Flag:          off
Mintrans:              1           (1 blocks)
Maxtrans:              131072      (256 blocks)
O_RAW/O_SSD Multiple:  (Determined by device)
Syscalls:              read write
Aio completion types:  none
Flags:                 buffered sync

Test Files:

Path                                          Length    iou   raw iou file
                                              (bytes) (bytes) (bytes) type
-----------------------------------------------------------------------------
/tmp/test/yy/kk1                               256000       1     512 regular
/tmp/test/yy/kk2                               512000       1     512 regular

doio(iogen01) (3091346) 17:43:50
---------------------
*** DATA COMPARISON ERROR ***
check_file(/tmp/test/yy/kk2, 116844, 106653, X:3091346:gm-arco:doio*, 23, 0) failed

Comparison fd is 3, with open flags 0
Corrupt regions follow - unprintable chars are represented as '.'
-----------------------------------------------------------------
corrupt bytes starting at file offset 116844
    1st 32 expected bytes:  X:3091346:gm-arco:doio*X:3091346
    1st 32 actual bytes:    91346:gm-arco:doio*C:3091346:gm-
Request number 13873
syscall:  write(4, 02540107176414100, 106653)
          fd 4 is file /tmp/test/yy/kk2 - open flags are 04010001
          write done at file offset 116844 - pattern is X:3091346:gm-arco:doio*

doio(iogen01) (3091344) 17:43:50
---------------------
(parent) pid 3091346 exited because of data compare errors

 fs/fuse/file.c | 36 ++++++++++--------------------------
 1 file changed, 10 insertions(+), 26 deletions(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 5c5ed58d9..a832c3122 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1098,7 +1098,6 @@ static ssize_t fuse_send_write_pages(struct fuse_io_args *ia,
 	struct fuse_file *ff = file->private_data;
 	struct fuse_mount *fm = ff->fm;
 	unsigned int offset, i;
-	bool short_write;
 	int err;
 
 	for (i = 0; i < ap->num_pages; i++)
@@ -1113,26 +1112,21 @@ static ssize_t fuse_send_write_pages(struct fuse_io_args *ia,
 	if (!err && ia->write.out.size > count)
 		err = -EIO;
 
-	short_write = ia->write.out.size < count;
 	offset = ap->descs[0].offset;
 	count = ia->write.out.size;
 	for (i = 0; i < ap->num_pages; i++) {
 		struct page *page = ap->pages[i];
 
-		if (err) {
-			ClearPageUptodate(page);
-		} else {
-			if (count >= PAGE_SIZE - offset)
-				count -= PAGE_SIZE - offset;
-			else {
-				if (short_write)
-					ClearPageUptodate(page);
-				count = 0;
-			}
-			offset = 0;
-		}
-		if (ia->write.page_locked && (i == ap->num_pages - 1))
-			unlock_page(page);
+        if (!err && !offset && count >= PAGE_SIZE)
+            SetPageUptodate(page);
+
+        if (count > PAGE_SIZE - offset)
+            count -= PAGE_SIZE - offset;
+        else
+            count = 0;
+        offset = 0;
+
+        unlock_page(page);
 		put_page(page);
 	}
 
@@ -1195,16 +1189,6 @@ static ssize_t fuse_fill_write_pages(struct fuse_io_args *ia,
 		if (offset == PAGE_SIZE)
 			offset = 0;
 
-		/* If we copied full page, mark it uptodate */
-		if (tmp == PAGE_SIZE)
-			SetPageUptodate(page);
-
-		if (PageUptodate(page)) {
-			unlock_page(page);
-		} else {
-			ia->write.page_locked = true;
-			break;
-		}
 		if (!fc->big_writes)
 			break;
 	} while (iov_iter_count(ii) && count < fc->max_write &&
-- 
2.51.0


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ