[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJnrk1aNVMCfTjL0vo-Qki68-5t1W+6-bJHg+x67kHEo_-q0Eg@mail.gmail.com>
Date: Fri, 7 Feb 2025 16:22:56 -0800
From: Joanne Koong <joannelkoong@...il.com>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Josef Bacik <josef@...icpanda.com>, Miklos Szeredi <miklos@...redi.hu>,
Christian Heusel <christian@...sel.eu>, Miklos Szeredi <mszeredi@...hat.com>, regressions@...ts.linux.dev,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
Matthew Wilcox <willy@...radead.org>, linux-mm <linux-mm@...ck.org>,
Mantas Mikulėnas <grawity@...il.com>
Subject: Re: [REGRESSION][BISECTED] Crash with Bad page state for FUSE/Flatpak
related applications since v6.13
On Fri, Feb 7, 2025 at 10:39 AM Vlastimil Babka <vbabka@...e.cz> wrote:
>
> On 2/7/25 18:29, Josef Bacik wrote:
> > On Fri, Feb 07, 2025 at 05:49:34PM +0100, Vlastimil Babka wrote:
> >> On 2/7/25 10:34, Miklos Szeredi wrote:
> >> > [Adding Joanne, Willy and linux-mm].
> >> >
> >> >
> >> > On Thu, 6 Feb 2025 at 11:54, Christian Heusel <christian@...sel.eu> wrote:
> >> >>
> >> >> Hello everyone,
> >> >>
> >> >> we have recently received [a report][0] on the Arch Linux Gitlab about
> >> >> multiple users having system crashes when using Flatpak programs and
> >> >> related FUSE errors in their dmesg logs.
> >> >>
> >> >> We have subsequently bisected the issue within the mainline kernel tree
> >> >> to the following commit:
> >> >>
> >> >> 3eab9d7bc2f4 ("fuse: convert readahead to use folios")
> >>
> >> I see that commit removes folio_put() from fuse_readpages_end(). Also it now
> >> uses readahead_folio() in fuse_readahead() which does folio_put(). So that's
> >> suspicious to me. It might be storing pointers to pages to ap->pages without
> >> pinning them with a refcount.
> >>
> >> But I don't understand the code enough to know what's the proper fix. A
> >> probably stupid fix would be to use __readahead_folio() instead and keep the
> >> folio_put() in fuse_readpages_end().
> >
> > Agreed, I'm also confused as to what the right thing is here. It appears the
> > rules are "if the folio is locked, nobody messes with it", so it's not "correct"
> > to hold a reference on the folio while it's being read. I don't love this way
> > of dealing with folios, but that seems to be the way it's always worked.
> >
> > I went and looked at a few of the other file systems and we have NFS which holds
> > it's own reference to the folio while the IO is outstanding, which FUSE is most
> > similar to NFS so this would make sense to do.
> >
> > Btrfs however doesn't do this, BUT we do set_folio_private (or whatever it's
> > called) so that keeps us from being reclaimed since we'll try to lock the folio
> > before we do the reclaim.
> >
> > So perhaps that's the issue here? We need to have a private on the folio + the
> > folio locked to make sure it doesn't get reclaimed while it's out being read?
> >
> > I'm knee deep in other things, if we want a quick fix then I think your
> > suggestion is correct Vlastimil. But I definitely want to know what Willy
> > expects to be the proper order of operations here, and if this is exactly what
> > we're supposed to be doing then something else is going wrong and we should try
> > to reproduce locally and figure out what's happening. Thanks,
>
> Thanks, Josef. I guess we can at least try to confirm we're on the right track.
> Can anyone affected see if this (only compile tested) patch fixes the issue?
> Created on top of 6.13.1.
This fixes the crash for me on 6.14.0-rc1. I ran the repro using
Mantas's instructions for Obfuscate. I was able to trigger the crash
on a clean build and then with this patch, I'm not seeing the crash
anymore.
>
> ----8<----
> From c0fdf9174f6c17c93a709606384efe2877a3a596 Mon Sep 17 00:00:00 2001
> From: Vlastimil Babka <vbabka@...e.cz>
> Date: Fri, 7 Feb 2025 19:35:25 +0100
> Subject: [PATCH] fuse: prevent folio use-after-free in readahead
>
> Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
> ---
> fs/fuse/file.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index 7d92a5479998..a40d65ffb94d 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -955,8 +955,10 @@ static void fuse_readpages_end(struct fuse_mount *fm, struct fuse_args *args,
> fuse_invalidate_atime(inode);
> }
>
> - for (i = 0; i < ap->num_folios; i++)
> + for (i = 0; i < ap->num_folios; i++) {
> folio_end_read(ap->folios[i], !err);
> + folio_put(ap->folios[i]);
> + }
> if (ia->ff)
> fuse_file_put(ia->ff, false);
>
> @@ -1048,7 +1050,7 @@ static void fuse_readahead(struct readahead_control *rac)
> ap = &ia->ap;
>
> while (ap->num_folios < cur_pages) {
> - folio = readahead_folio(rac);
> + folio = __readahead_folio(rac);
> ap->folios[ap->num_folios] = folio;
> ap->descs[ap->num_folios].length = folio_size(folio);
> ap->num_folios++;
> --
> 2.48.1
>
>
Powered by blists - more mailing lists