lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJnrk1Zu4WSr_YnZNa7Zf8s8-wAWTqL_xQyZhFSU6jCZMeryFg@mail.gmail.com>
Date: Mon, 13 Oct 2025 16:32:32 -0700
From: Joanne Koong <joannelkoong@...il.com>
To: Bernd Schubert <bernd@...ernd.com>
Cc: Miklos Szeredi <miklos@...redi.hu>, lu gu <giveme.gulu@...il.com>, 
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Brian Foster <bfoster@...hat.com>
Subject: Re: [PATCH 5.15] fuse: Fix race condition in writethrough path A race

On Mon, Oct 13, 2025 at 1:40 PM Bernd Schubert <bernd@...ernd.com> wrote:
>
> On 10/13/25 22:27, Joanne Koong wrote:
> > On Mon, Oct 13, 2025 at 1:16 PM Bernd Schubert <bernd@...ernd.com> wrote:
> >>
> >> On 10/13/25 15:39, Miklos Szeredi wrote:
> >>> On Fri, 10 Oct 2025 at 10:46, Miklos Szeredi <miklos@...redi.hu> wrote:
> >>>
> >>>> My idea is to introduce FUSE_I_MTIME_UNSTABLE (which would work
> >>>> similarly to FUSE_I_SIZE_UNSTABLE) and when fetching old_mtime, verify
> >>>> that it hasn't been invalidated.  If old_mtime is invalid or if
> >>>> FUSE_I_MTIME_UNSTABLE signals that a write is in progress, the page
> >>>> cache is not invalidated.
> >>>
> >>> [Adding Brian Foster, the author of FUSE_AUTO_INVAL_DATA patches.
> >>> Link to complete thread:
> >>> https://lore.kernel.org/all/20251009110623.3115511-1-giveme.gulu@gmail.com/#r]
> >>>
> >>> In summary: auto_inval_data invalidates data cache even if the
> >>> modification was done in a cache consistent manner (i.e. write
> >>> through). This is not generally a consistency problem, because the
> >>> backing file and the cache should be in sync.  The exception is when
> >>> the writeback to the backing file hasn't yet finished and a getattr()
> >>> call triggers invalidation (mtime change could be from a previous
> >>> write), and the not yet written data is invalidated and replaced with
> >>> stale data.
> >>>
> >>> The proposed fix was to exclude concurrent reads and writes to the same region.
> >>>
> >>> But the real issue here is that mtime changes triggered by this client
> >>> should not cause data to be invalidated.  It's not only racy, but it's
> >>> fundamentally wrong.  Unfortunately this is hard to do this correctly.
> >>> Best I can come up with is that any request that expects mtime to be
> >>> modified returns the mtime after the request has completed.
> >>>
> >>> This would be much easier to implement in the fuse server: perform the
> >>> "file changed remotely" check when serving a FUSE_GETATTR request and
> >>> return a flag indicating whether the data needs to be invalidated or
> >>> not.
> >>
> >> For an intelligent server maybe, but let's say one uses
> >> <libfuse>/example/passthrough*, in combination with some external writes
> >> to the underlying file system outside of fuse. How would passthrough*
> >> know about external changes?
> >>
> >> The part I don't understand yet is why invalidate_inode_pages2() causes
> >> an issue - it has folio_wait_writeback()?
> >>
> >
> > This issue is for the writethrough path which doesn't use writeback.
>
>
> Oh right. So we need some kind of fuse_invalidate_pages(), that would
> wait for for all current fuse_send_write_pages() to complete? Is that
> what you meant with 'fi->writectr bias'?

Yes but unfortunately this would block reads on any part of the file,
not just the part that's being written.

Looking at this more, I think we actually could just grab the folio
locks and if we need to fault anything in, unlock all the folios,
fault in the bytes, and then relock all the folios. I don't think we would
need to check anything between dropping the locks and relocking, since
if we're comparing this against the existing code where we drop the
locks on fully-copied folios, this doesn't introduce any new race
conditions as far as I can see.

Thanks,
Joanne

>
> Thanks,
> Bernd

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ