lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aUN9BFdxHQj8ThMS@ndev>
Date: Thu, 18 Dec 2025 12:03:28 +0800
From: Jinchao Wang <wangjinchao600@...il.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
	Christian Brauner <brauner@...nel.org>,
	Hannes Reinecke <hare@...e.de>,
	Luis Chamberlain <mcgrof@...nel.org>, linux-fsdevel@...r.kernel.org,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	stable@...r.kernel.org,
	syzbot+4d3cc33ef7a77041efa6@...kaller.appspotmail.com,
	syzbot+fdba5cca73fee92c69d6@...kaller.appspotmail.com
Subject: Re: [PATCH] mm/readahead: read min folio constraints under
 invalidate lock

On Tue, Dec 16, 2025 at 03:53:17AM +0000, Matthew Wilcox wrote:
> On Tue, Dec 16, 2025 at 11:12:21AM +0800, Jinchao Wang wrote:
> > On Tue, Dec 16, 2025 at 02:42:06AM +0000, Matthew Wilcox wrote:
> > > On Tue, Dec 16, 2025 at 09:37:51AM +0800, Jinchao Wang wrote:
> > > > On Mon, Dec 15, 2025 at 02:22:23PM +0000, Matthew Wilcox wrote:
> > > > > On Mon, Dec 15, 2025 at 10:19:00PM +0800, Jinchao Wang wrote:
> > > > > > page_cache_ra_order() and page_cache_ra_unbounded() read mapping minimum folio
> > > > > > constraints before taking the invalidate lock, allowing concurrent changes to
> > > > > > violate page cache invariants.
> > > > > > 
> > > > > > Move the lookups under filemap_invalidate_lock_shared() to ensure readahead
> > > > > > allocations respect the mapping constraints.
> > > > > 
> > > > > Why are the mapping folio size constraints being changed?  They're
> > > > > supposed to be set at inode instantiation and then never changed.
> > > > 
> > > > They can change after instantiation for block devices. In the syzbot repro:
> > > >   blkdev_ioctl() -> blkdev_bszset() -> set_blocksize() ->
> > > >   mapping_set_folio_min_order()
> > > 
> > > Oh, this is just syzbot doing stupid things.  We should probably make
> > > blkdev_bszset() fail if somebody else has an fd open.
> > 
> > Thanks, that makes sense.
> > Tightening blkdev_bszset() would avoid the race entirely.
> > This change is meant as a defensive fix to prevent BUGs.
> 
> Yes, but the point is that there's a lot of code which relies on
> the AS_FOLIO bits not changing in the middle.  Syzbot found one of them,
> but there are others.

I've been thinking about this more, and I wanted to share another
perspective if that's okay.

Rather than tracking down every place that might change AS_FOLIO bits
(like blkdev_bszset() and potentially others), what if we make the
page cache layer itself robust against such changes?

The invalidate_lock was introduced for exactly this kind of protection
(commit 730633f0b7f9: "mm: Protect operations adding pages to page
cache with invalidate_lock"). This way, the page cache doesn't need
to rely on assumptions about what upper layers might do.

The readahead functions already hold filemap_invalidate_lock_shared(),
so moving the constraint reads under the lock adds no overhead. It
would protect against AS_FOLIO changes regardless of their source.

I think this separates concerns nicely: upper layers can change
constraints through the invalidate_lock protocol, and page cache
operations are automatically safe. But I'd really value your thoughts
on this approach - you have much more experience with these tradeoffs
than I do.

Thanks again for taking the time to discuss this.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ