[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20260205185842.1833542-1-clm@meta.com>
Date: Thu, 5 Feb 2026 10:57:15 -0800
From: Chris Mason <clm@...a.com>
To: "Darrick J. Wong" <djwong@...nel.org>
CC: <miklos@...redi.hu>, <joannelkoong@...il.com>, <bernd@...ernd.com>,
<neal@...pa.dev>, <linux-ext4@...r.kernel.org>,
<linux-fsdevel@...r.kernel.org>
Subject: Re: [PATCH 29/31] fuse: disable direct reclaim for any fuse server that uses iomap
"Darrick J. Wong" <djwong@...nel.org> wrote:
> From: Darrick J. Wong <djwong@...nel.org>
>
> Any fuse server that uses iomap can create a substantial amount of dirty
> pages in the pagecache because we don't write dirty stuff until reclaim
> or fsync. Therefore, memory reclaim on any fuse iomap server musn't
> ever recurse back into the same filesystem. We must also never throttle
> the fuse server writes to a bdi because that will just slow down
> metadata operations.
Hi everyone,
I'm trying out my AI review prompts on a few more trees, and I ran it
on the fuse-iomap-cache branch:
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=fuse-iomap-cache
This was the latest email lore could find, apologies if I grabbed the wrong one.
Hopefully the review below is useful, please let me know if there are places it
has gone wildly wrong. I kept the header to make it obvious exactly which
commit I'd run through:
commit 3bdd586d6fa00e067dbc3676642ac75e54ec3031
Author: Darrick J. Wong <djwong@...nel.org>
fuse: disable direct reclaim for any fuse server that uses iomap
Any fuse server that uses iomap can create a substantial amount of dirty
pages in the pagecache because we don't write dirty stuff until reclaim
or fsync. Therefore, memory reclaim on any fuse iomap server must avoid
recursing back into the same filesystem and throttling the fuse server.
> diff --git a/fs/fuse/fuse_iomap.c b/fs/fuse/fuse_iomap.c
> --- a/fs/fuse/fuse_iomap.c
> +++ b/fs/fuse/fuse_iomap.c
> @@ -1019,6 +1019,12 @@ static void fuse_iomap_config_reply(struct fuse_mount *fm,
> */
> fc->sync_fs = true;
>
> + /*
> + * We could be on the hook for a substantial amount of writeback, so
> + * prohibit reclaim from recursing into fuse or the kernel from
> + * throttling any bdis that the fuse server might write to.
> + */
> + current->flags |= PF_MEMALLOC_NOFS | PF_LOCAL_THROTTLE;
Are these process flags being set on the correct process? Looking at the
call chain:
Sync path:
fuse_fill_super() -> fuse_iomap_mount() -> fuse_iomap_config_reply()
Async path:
fuse_request_end() -> args->end() -> fuse_iomap_config_reply()
In the sync path, current points to the mount(2) process which exits
after mounting completes. In the async path, current points to whichever
process writes the reply to /dev/fuse (the fuse daemon thread) or a
kernel worker thread for virtio-fs.
The commit message states the goal is to prohibit reclaim from recursing
into the fuse server, but the fuse server is a separate userspace process.
Setting flags on the mount process (which exits) or on a random fuse
daemon thread (only once during init) doesn't protect the fuse server
threads that actually perform I/O operations.
Should the flags be set on the correct process context? Or if they need
to be set here, should they follow the pattern used in fs/nfsd/vfs.c and
fs/nfs/localio.c which save and restore the flags?
Looking at fs/nfsd/vfs.c:1438-1509:
unsigned long old_flags = current->flags;
current->flags |= PF_LOCAL_THROTTLE;
...
current_restore_flags(pflags, PF_LOCAL_THROTTLE);
and fs/nfs/localio.c:824-828:
unsigned long old_flags = current->flags;
current->flags |= PF_LOCAL_THROTTLE;
...
current_restore_flags(old_flags, PF_LOCAL_THROTTLE);
Without saving and restoring, the flags permanently alter the process
state. For the mount process this is wasted when it exits, and for fuse
daemon threads this causes permanent behavior changes.
Powered by blists - more mailing lists