lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <81298bd1-e630-4940-ae5b-7882576b6bf4@suse.cz>
Date: Mon, 10 Feb 2025 09:27:32 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Joanne Koong <joannelkoong@...il.com>,
 Matthew Wilcox <willy@...radead.org>
Cc: Josef Bacik <josef@...icpanda.com>, Miklos Szeredi <miklos@...redi.hu>,
 Christian Heusel <christian@...sel.eu>, Miklos Szeredi
 <mszeredi@...hat.com>, regressions@...ts.linux.dev,
 linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
 linux-mm <linux-mm@...ck.org>, Mantas Mikulėnas
 <grawity@...il.com>
Subject: Re: [REGRESSION][BISECTED] Crash with Bad page state for FUSE/Flatpak
 related applications since v6.13

On 2/8/25 16:46, Joanne Koong wrote:
> On Sat, Feb 8, 2025 at 2:11 AM Matthew Wilcox <willy@...radead.org> wrote:
>>
>> On Fri, Feb 07, 2025 at 04:22:56PM -0800, Joanne Koong wrote:
>> > > Thanks, Josef. I guess we can at least try to confirm we're on the right track.
>> > > Can anyone affected see if this (only compile tested) patch fixes the issue?
>> > > Created on top of 6.13.1.
>> >
>> > This fixes the crash for me on 6.14.0-rc1. I ran the repro using
>> > Mantas's instructions for Obfuscate. I was able to trigger the crash
>> > on a clean build and then with this patch, I'm not seeing the crash
>> > anymore.
>>
>> Since this patch fixes the bug, we're looking for one call to folio_put()
>> too many.  Is it possibly in fuse_try_move_page()?  In particular, this
>> one:
>>
>>         /* Drop ref for ap->pages[] array */
>>         folio_put(oldfolio);
>>
>> I don't know fuse very well.  Maybe this isn't it.
> 
> Yeah, this looks it to me. We don't grab a folio reference for the
> ap->pages[] array for readahead and it tracks with Mantas's
> fuse_dev_splice_write() dmesg. this patch fixed the crash for me when
> I tested it yesterday:
> 
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index 7d92a5479998..172cab8e2caf 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -955,8 +955,10 @@ static void fuse_readpages_end(struct fuse_mount
> *fm, struct fuse_args *args,
>                 fuse_invalidate_atime(inode);
>         }
> 
> -       for (i = 0; i < ap->num_folios; i++)
> +       for (i = 0; i < ap->num_folios; i++) {
>                 folio_end_read(ap->folios[i], !err);
> +               folio_put(ap->folios[i]);
> +       }
>         if (ia->ff)
>                 fuse_file_put(ia->ff, false);
> 
> @@ -1049,6 +1051,7 @@ static void fuse_readahead(struct readahead_control *rac)
> 
>                 while (ap->num_folios < cur_pages) {
>                         folio = readahead_folio(rac);
> +                       folio_get(folio);

This is almost the same as my patch, but balances the folio_put() in
readahead_folio() with another folio_get(), while mine uses
__readahead_folio() that does not do folio_put() in the first place.

But I think neither patch proves the extraneous folio_put() comes from
fuse_try_move_page().

>                         ap->folios[ap->num_folios] = folio;
>                         ap->descs[ap->num_folios].length = folio_size(folio);
>                         ap->num_folios++;
> 
> 
> I reran it just now with a printk by that ref drop in
> fuse_try_move_page() and I'm indeed seeing that path get hit.

It might get hit, but is it hit in the readahead paths? One way to test
would be to instead of yours above or mine change, to stop doing the
foio_put() in fuse_try_move_page(). But maybe it's called also from other
contexts that do expect it, and will leak memory otherwise.

> Not sure why fstests didn't pick this up though since splice is
> enabled by default in passthrough_hp, i'll look into this next week.
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ