Message-ID: <CAJnrk1ZYF7MG0mBZ4GRdKfmSiEEx3vXxgiH3oYdMS-neWSA2mw@mail.gmail.com>
Date: Mon, 26 Jan 2026 17:35:05 -0800
From: Joanne Koong <joannelkoong@...il.com>
To: "Darrick J. Wong" <djwong@...nel.org>
Cc: miklos@...redi.hu, bernd@...ernd.com, neal@...pa.dev, 
	linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 17/31] fuse: use an unrestricted backing device with iomap
 pagecache io

On Mon, Jan 26, 2026 at 3:55 PM Darrick J. Wong <djwong@...nel.org> wrote:
>
> On Mon, Jan 26, 2026 at 02:03:35PM -0800, Joanne Koong wrote:
> > On Tue, Oct 28, 2025 at 5:49 PM Darrick J. Wong <djwong@...nel.org> wrote:
> > >
> > > From: Darrick J. Wong <djwong@...nel.org>
> > >
> > > With iomap support turned on for the pagecache, the kernel issues
> > > > writeback directly to block devices and we no longer have to push all
> > > those pages through the fuse device to userspace.  Therefore, we don't
> > > need the tight dirty limits (~1M) that are used for regular fuse.  This
> > > dramatically increases the performance of fuse's pagecache IO.
> > >
> > > Signed-off-by: "Darrick J. Wong" <djwong@...nel.org>
> > > ---
> > >  fs/fuse/file_iomap.c |   21 +++++++++++++++++++++
> > >  1 file changed, 21 insertions(+)
> > >
> > >
> > > diff --git a/fs/fuse/file_iomap.c b/fs/fuse/file_iomap.c
> > > index 0bae356045638b..a9bacaa0991afa 100644
> > > --- a/fs/fuse/file_iomap.c
> > > +++ b/fs/fuse/file_iomap.c
> > > @@ -713,6 +713,27 @@ const struct fuse_backing_ops fuse_iomap_backing_ops = {
> > >  void fuse_iomap_mount(struct fuse_mount *fm)
> > >  {
> > >         struct fuse_conn *fc = fm->fc;
> > > +       struct super_block *sb = fm->sb;
> > > +       struct backing_dev_info *old_bdi = sb->s_bdi;
> > > +       char *suffix = sb->s_bdev ? "-fuseblk" : "-fuse";
> > > +       int res;
> > > +
> > > +       /*
> > > +        * sb->s_bdi points to the initial private bdi.  However, we want to
> > > +        * redirect it to a new private bdi with default dirty and readahead
> > > +        * settings because iomap writeback won't be pushing a ton of dirty
> > > +        * data through the fuse device.  If this fails we fall back to the
> > > +        * initial fuse bdi.
> > > +        */
> > > +       sb->s_bdi = &noop_backing_dev_info;
> > > +       res = super_setup_bdi_name(sb, "%u:%u%s.iomap", MAJOR(fc->dev),
> > > +                                  MINOR(fc->dev), suffix);
> > > +       if (res) {
> > > +               sb->s_bdi = old_bdi;
> > > +       } else {
> > > +               bdi_unregister(old_bdi);
> > > +               bdi_put(old_bdi);
> > > +       }
> >
> > Maybe I'm missing something here, but isn't sb->s_bdi already set to
> > noop_backing_dev_info when fuse_iomap_mount() is called?
> > fuse_fill_super() -> fuse_fill_super_common() -> fuse_bdi_init() does
> > this already before the fuse_iomap_mount() call, afaict.
>
> Right.
>
> > I think what we need to do is just unset BDI_CAP_STRICTLIMIT and
> > adjust the bdi max ratio?
>
> That's sufficient to undo the effects of fuse_bdi_init, yes.  However
> the BDI gets created with the name "$major:$minor{-fuseblk}" and there
> are "management" scripts that try to tweak fuse BDIs for better
> performance.
>
> I don't want some dumb script to mismanage a fuse-iomap filesystem
> because it can't tell the difference, so I create a new bdi with the
> name "$major:$minor.iomap" to make it obvious.  But super_setup_bdi_name
> gets cranky if s_bdi isn't set to noop and we don't want to fail a mount
> here due to ENOMEM so ... I implemented this weird switcheroo code.

I see. It might be useful to copy/paste this into the commit message
just for added context. I don't see a better way of doing it than what
you have in this patch then since we rely on the init reply to know
whether iomap should be used or not...

If the new bdi setup fails, I wonder if the mount should just fail
entirely then. That seems better to me than letting it succeed with
strictlimiting enforced, especially since large folios will be enabled
for fuse iomap. [1] has some numbers for the performance degradations
I saw for writes with strictlimiting on and large folios enabled.
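
To make the two failure policies concrete, here is a minimal userspace
sketch. It is not kernel code: struct mock_bdi, struct mock_sb,
mock_setup_bdi(), mock_iomap_mount(), mock_fuse_sb() and the "0:42-fuse"
names are all hypothetical stand-ins for the bdi swap in the patch. The
fail_mount_on_error flag selects between falling back to the old
strictlimited bdi (what the patch does) and failing the mount (what is
suggested above).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct mock_bdi {
	char name[64];
	bool strictlimit;	/* models BDI_CAP_STRICTLIMIT */
};

struct mock_sb {
	struct mock_bdi *s_bdi;
};

static struct mock_bdi mock_noop_bdi = { "noop", false };

/* Build a superblock that already owns a strictlimited "fuse" bdi. */
static struct mock_sb mock_fuse_sb(void)
{
	struct mock_sb sb;

	sb.s_bdi = calloc(1, sizeof(*sb.s_bdi));
	assert(sb.s_bdi);
	snprintf(sb.s_bdi->name, sizeof(sb.s_bdi->name), "0:42-fuse");
	sb.s_bdi->strictlimit = true;
	return sb;
}

/*
 * Assumed model of super_setup_bdi_name(): attach a fresh bdi with
 * default (non-strict) settings.  This mock asserts that s_bdi is the
 * noop bdi on entry, which is a simplification.
 */
static int mock_setup_bdi(struct mock_sb *sb, const char *name,
			  bool simulate_enomem)
{
	struct mock_bdi *bdi;

	assert(sb->s_bdi == &mock_noop_bdi);
	if (simulate_enomem)
		return -ENOMEM;
	bdi = calloc(1, sizeof(*bdi));
	if (!bdi)
		return -ENOMEM;
	snprintf(bdi->name, sizeof(bdi->name), "%s", name);
	sb->s_bdi = bdi;
	return 0;
}

/* Swap in a new bdi; on error either fall back or fail the mount. */
static int mock_iomap_mount(struct mock_sb *sb, bool simulate_enomem,
			    bool fail_mount_on_error)
{
	struct mock_bdi *old_bdi = sb->s_bdi;
	int res;

	sb->s_bdi = &mock_noop_bdi;
	res = mock_setup_bdi(sb, "0:42-fuse.iomap", simulate_enomem);
	if (res) {
		/* Restore the old bdi so the caller can clean it up. */
		sb->s_bdi = old_bdi;
		return fail_mount_on_error ? res : 0;
	}
	free(old_bdi);	/* stands in for bdi_unregister() + bdi_put() */
	return 0;
}
```

With fail_mount_on_error false, a simulated ENOMEM leaves the mount
running on the old strictlimited bdi; with it true, the error is
returned to the caller instead.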

Speaking of strictlimiting though, from a policy standpoint if we
think strictlimiting is needed in general in fuse (there's a thread
from last year [2] about removing strict limiting), then I think that
would need to apply to iomap as well, at least for unprivileged
servers.

[1] https://lore.kernel.org/linux-fsdevel/CAJnrk1bwat_r4+pmhaWH-ThAi+zoAJFwmJG65ANj1Zv0O0s4_A@mail.gmail.com/
[2] https://lore.kernel.org/linux-fsdevel/20251010150113.GC6174@frogsfrogsfrogs/T/#ma34ff5ae338a83f8b2e946d7e5332ea835fa0ff6

>
> > This is more of a nit, but I think it'd also be nice if we
> > swapped the ordering of this patch with the previous one enabling
> > large folios, so that large folios get enabled only when all the bdi
> > stuff for it is ready.
>
> Will do, thanks for reading these patches!
>
> Also note that I've changed this part of the patchset quite a lot since
> this posting; iomap configuration is now a completely separate fuse
> command that gets triggered after the FUSE_INIT reply is received.

Great, I'll look at your upstream tree then for this part.

Thanks,
Joanne

>
> --D
>
> > Thanks,
> > Joanne
> >
> > >
> > >         /*
> > >          * Enable syncfs for iomap fuse servers so that we can send a final
> > >
> >
