lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2807750.1752144428@warthog.procyon.org.uk>
Date: Thu, 10 Jul 2025 11:47:08 +0100
From: David Howells <dhowells@...hat.com>
To: Max Kellermann <max.kellermann@...os.com>
Cc: dhowells@...hat.com, Christian Brauner <christian@...uner.io>,
    Viacheslav Dubeyko <slava@...eyko.com>,
    Alex Markuze <amarkuze@...hat.com>, Steve French <sfrench@...ba.org>,
    Paulo Alcantara <pc@...guebit.com>, netfs@...ts.linux.dev,
    linux-afs@...ts.infradead.org, linux-cifs@...r.kernel.org,
    linux-nfs@...r.kernel.org, ceph-devel@...r.kernel.org,
    v9fs@...ts.linux.dev, linux-fsdevel@...r.kernel.org,
    linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: Re: [PATCH 00/13] netfs, cifs: Fixes to retry-related code

Hi Max,

I managed to reproduce it on my test machine with ceph + fscache.

Does this fix the problem for you?

David
---
netfs: Fix copy-to-cache so that it performs collection with ceph+fscache

The netfs copy-to-cache that is used by Ceph with local caching sets up a
new request to write data just read to the cache.  The request is started
and then left to look after itself whilst the app continues.  The request
gets notified by the backing fs upon completion of the async DIO write, but
then tries to wake up the app because NETFS_RREQ_OFFLOAD_COLLECTION isn't
set - but the app isn't waiting there, and so the request just hangs.

Fix this by setting NETFS_RREQ_OFFLOAD_COLLECTION which causes the
notification from the backing filesystem to put the collection onto a work
queue instead.

Fixes: e2d46f2ec332 ("netfs: Change the read result collector to only use one work item")
Reported-by: Max Kellermann <max.kellermann@...os.com>
Link: https://lore.kernel.org/r/CAKPOu+8z_ijTLHdiCYGU_Uk7yYD=shxyGLwfe-L7AV3DhebS3w@mail.gmail.com/
Signed-off-by: David Howells <dhowells@...hat.com>
cc: Paulo Alcantara <pc@...guebit.org>
cc: Viacheslav Dubeyko <slava@...eyko.com>
cc: Alex Markuze <amarkuze@...hat.com>
cc: Ilya Dryomov <idryomov@...il.com>
cc: netfs@...ts.linux.dev
cc: ceph-devel@...r.kernel.org
cc: linux-fsdevel@...r.kernel.org
cc: stable@...r.kernel.org
---
 fs/netfs/read_pgpriv2.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/netfs/read_pgpriv2.c b/fs/netfs/read_pgpriv2.c
index 5bbe906a551d..080d2a6a51d9 100644
--- a/fs/netfs/read_pgpriv2.c
+++ b/fs/netfs/read_pgpriv2.c
@@ -110,6 +110,7 @@ static struct netfs_io_request *netfs_pgpriv2_begin_copy_to_cache(
 	if (!creq->io_streams[1].avail)
 		goto cancel_put;
 
+	__set_bit(NETFS_RREQ_OFFLOAD_COLLECTION, &creq->flags);
 	trace_netfs_write(creq, netfs_write_trace_copy_to_cache);
 	netfs_stat(&netfs_n_wh_copy_to_cache);
 	rreq->copy_to_cache = creq;


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ