linux-kernel - Re: [PATCH 5.15] fuse: Fix race condition in writethrough path A race

Message-ID: <aO_6g9cG1IVvp--D@bfoster>
Date: Wed, 15 Oct 2025 15:48:19 -0400
From: Brian Foster <bfoster@...hat.com>
To: Joanne Koong <joannelkoong@...il.com>
Cc: Miklos Szeredi <miklos@...redi.hu>, lu gu <giveme.gulu@...il.com>,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	Bernd Schubert <bernd@...ernd.com>
Subject: Re: [PATCH 5.15] fuse: Fix race condition in writethrough path A race

On Wed, Oct 15, 2025 at 10:19:15AM -0700, Joanne Koong wrote:
> On Wed, Oct 15, 2025 at 7:09 AM Miklos Szeredi <miklos@...redi.hu> wrote:
> >
> > On Wed, 15 Oct 2025 at 06:00, lu gu <giveme.gulu@...il.com> wrote:
> > >
> > > >  Attaching a test patch, minimally tested.
> > > Since I only have a test environment for kernel 5.15, I ported this
> > > patch to the FUSE module in 5.15. I ran the previous LTP test cases
> > > more than ten times, and the data inconsistency issue did not reoccur.
> > > > However, a deadlock occurred. Below is the specific stack trace.
> >
> > This does not reproduce for me on 6.17 even after running the test
> > for hours.  Without seeing your backport it is difficult to say
> > anything about the reason for the deadlock.
> >
> > Attaching an updated patch that takes care of i_wb initialization on
> > CONFIG_CGROUP_WRITEBACK=y.
> 
> I think now we'll also need to always set
> mapping_set_writeback_may_deadlock_on_reclaim(), eg
> 
> @@ -3125,8 +3128,7 @@ void fuse_init_file_inode(struct inode *inode, unsigned int flags)
> 
>         inode->i_fop = &fuse_file_operations;
>         inode->i_data.a_ops = &fuse_file_aops;
> -       if (fc->writeback_cache)
> -               mapping_set_writeback_may_deadlock_on_reclaim(&inode->i_data);
> +       mapping_set_writeback_may_deadlock_on_reclaim(&inode->i_data);
> 
> 
> Does this completely get rid of the race? There's a fair chance I'm
> wrong here but doesn't the race still happen if the read invalidation
> happens before the write grabs the folio lock? This is the scenario
> I'm thinking of:
> 
> Thread A (read):
> read, w/ auto inval and an outdated mtime triggers invalidate_inode_pages2()
> generic_file_read_iter() is called, which calls filemap_read() ->
> filemap_get_pages() -> triggers read_folio/readahead
> read_folio/readahead fetches data (stale) from the server, unlocks folios
> 
> Thread B (writethrough write):
> fuse_perform_write() -> fuse_fill_write_pages():
> grabs the folio lock and copies new write data to page cache, sets
> writeback flag and unlocks folio, sends request to server
> 
> Thread A (read):
> the read data that was fetched from the server gets copied to the page
> cache in filemap_read()
> overwrites the write data in the page cache with the stale data
> 
> Am I misanalyzing something in this sequence?
> 

Maybe I misread the description, but I think folios are locked across
read I/O, so I don't follow how we could race with readahead in this
way. Hm?
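
To make the locking argument concrete, here is a rough user-space model of
the ordering, not kernel code. Every name in it (struct model_folio,
model_trylock(), model_read_start() and so on) is hypothetical and only
stands in for the folio lock and uptodate handling. It assumes the read
side takes the folio lock before the request is sent and drops it only
after the data has been filled in, and that the write side needs the same
lock to copy into the page cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical user-space model of one page-cache folio; not kernel code. */
struct model_folio {
	bool locked;
	bool uptodate;
	char data[8];
};

/* Stand-in for a folio trylock: fails while someone else holds the lock. */
static bool model_trylock(struct model_folio *f)
{
	if (f->locked)
		return false;
	f->locked = true;
	return true;
}

static void model_unlock(struct model_folio *f)
{
	f->locked = false;
}

/* Read side: the folio is locked before the request goes to the server. */
static bool model_read_start(struct model_folio *f)
{
	if (!model_trylock(f))
		return false;
	f->uptodate = false;
	return true;
}

/* Read completion: fill in the data, mark uptodate, and only then unlock. */
static void model_read_end(struct model_folio *f, const char *from_server)
{
	strncpy(f->data, from_server, sizeof(f->data) - 1);
	f->data[sizeof(f->data) - 1] = '\0';
	f->uptodate = true;
	model_unlock(f);
}

/* Write side: must own the folio lock to copy new data into the cache. */
static bool model_write(struct model_folio *f, const char *new_data)
{
	if (!model_trylock(f))
		return false;	/* a real writer would sleep on the lock here */
	strncpy(f->data, new_data, sizeof(f->data) - 1);
	f->data[sizeof(f->data) - 1] = '\0';
	f->uptodate = true;
	model_unlock(f);
	return true;
}

/* Returns 1 if the cache ends up holding the new write data. */
static int model_sequence(void)
{
	struct model_folio f = { 0 };
	bool started = model_read_start(&f);

	assert(started);
	/* Write attempted while the read is in flight: it cannot get in. */
	if (model_write(&f, "new"))
		return 0;
	/* Read completes with stale server data and unlocks the folio. */
	model_read_end(&f, "stale");
	/* Write proceeds only now, so it lands on top of the stale data. */
	if (!model_write(&f, "new"))
		return 0;
	return strcmp(f.data, "new") == 0;
}
```

In this model the write can only land before the read locks the folio or
after the read has unlocked it, never between the fetch and the cache fill.
Whether the real invalidate/read/write paths in fuse hold to that ordering
is exactly the open question.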

Brian

> Thanks,
> Joanne
> >
> > Thanks,
> > Miklos
>