Message-ID: <aYbmy8JdgXwsGaPP@autotest-wegao.qe.prg2.suse.org>
Date: Sat, 7 Feb 2026 07:16:27 +0000
From: Wei Gao <wegao@...e.com>
To: Sasha Levin <sashal@...nel.org>
Cc: Joanne Koong <joannelkoong@...il.com>, willy@...radead.org,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	wegao@...e.com
Subject: Re: [RFC PATCH 1/1] iomap: fix race between iomap_set_range_uptodate
 and folio_end_read

On Tue, Dec 23, 2025 at 08:31:57PM -0500, Sasha Levin wrote:
> On Tue, Dec 23, 2025 at 05:12:09PM -0800, Joanne Koong wrote:
> > On Tue, Dec 23, 2025 at 2:30 PM Sasha Levin <sashal@...nel.org> wrote:
> > > 
> > 
> > Hi Sasha,
> > 
> > Thanks for your patch and for the detailed writeup.
> 
> Thanks for looking into this!
> 
> > > When iomap uses large folios, per-block uptodate tracking is managed via
> > > iomap_folio_state (ifs). A race condition can cause the ifs uptodate bits
> > > to become inconsistent with the folio's uptodate flag.
> > > 
> > > The race occurs because folio_end_read() uses XOR semantics to atomically
> > > set the uptodate bit and clear the locked bit:
> > > 
> > >   Thread A (read completion):          Thread B (concurrent write):
> > >   --------------------------------     --------------------------------
> > >   iomap_finish_folio_read()
> > >     spin_lock(state_lock)
> > >     ifs_set_range_uptodate() -> true
> > >     spin_unlock(state_lock)
> > >                                        iomap_set_range_uptodate()
> > >                                          spin_lock(state_lock)
> > >                                          ifs_set_range_uptodate() -> true
> > >                                          spin_unlock(state_lock)
> > >                                          folio_mark_uptodate(folio)
> > >     folio_end_read(folio, true)
> > >       folio_xor_flags()  // XOR CLEARS uptodate!
> > 
> > The part I'm confused about here is how this can happen between a
> > concurrent read and write. My understanding is that the folio is
> > locked when the read occurs and locked when the write occurs and both
> > locks get dropped only when the read or write finishes. Looking at
> > iomap code, I see iomap_set_range_uptodate() getting called in
> > __iomap_write_begin() and __iomap_write_end() for the writes, but in
> > both those places the folio lock is held while this is called. I'm not
> > seeing how the read and write race in the diagram can happen, but
> > maybe I'm missing something here?
> 
> Hmm, you're right... The folio lock should prevent concurrent read/write
> access. Looking at this again, I suspect that FUSE was calling
> folio_clear_uptodate() and folio_mark_uptodate() directly without updating the
> ifs bits. For example, in fuse_send_write_pages() on write error, it calls
> folio_clear_uptodate(folio) which clears the folio flag but leaves ifs still
> showing all blocks uptodate?

Hi Sasha,

On PowerPC with a 64KB page size, the LTP test msync04 fails with SIGBUS on
NTFS-FUSE. The issue stems from a state inconsistency between the
iomap_folio_state (ifs) bitmap and the folio's uptodate flag:

tst_test.c:1985: TINFO: === Testing on ntfs ===
tst_test.c:1290: TINFO: Formatting /dev/loop0 with ntfs opts='' extra opts=''
Failed to set locale, using default 'C'.
The partition start sector was not specified for /dev/loop0 and it could not be obtained automatically.  It has been set to 0.
The number of sectors per track was not specified for /dev/loop0 and it could not be obtained automatically.  It has been set to 0.
The number of heads was not specified for /dev/loop0 and it could not be obtained automatically.  It has been set to 0.
To boot from a device, Windows needs the 'partition start sector', the 'sectors per track' and the 'number of heads' to be set.
Windows will not be able to boot from this device.
tst_test.c:1302: TINFO: Mounting /dev/loop0 to /tmp/LTP_msy3ljVxi/msync04 fstyp=ntfs flags=0
tst_test.c:1302: TINFO: Trying FUSE...
tst_test.c:1953: TBROK: Test killed by SIGBUS!

Root cause analysis: when a page fault triggers fuse_read_folio, the request
is handled by iomap_read_folio_iter. For a 64KB page, 4KB is fetched via
fuse_iomap_read_folio_range_async and the remaining 60KB (61440 bytes) is
zero-filled (iomap_block_needs_zeroing). iomap_set_range_uptodate then marks
the whole folio uptodate. The subsequent folio_xor_flags flips the already-set
uptodate bit back to 0, which finally triggers SIGBUS in filemap_fault.

So your iomap_set_range_uptodate patch fixes the failing case above, since it
prevents the folio's uptodate flag from being set to 1 at that point.
I hope these findings are helpful.

> 
> -- 
> Thanks,
> Sasha
> 
