linux-kernel - Re: [PATCH v8 01/10] fs: Allow fine-grained control of folio sizes

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20240705132418.gk7oeucdisat3sq5@quentin>
Date: Fri, 5 Jul 2024 13:24:18 +0000
From: "Pankaj Raghav (Samsung)" <kernel@...kajraghav.com>
To: Dave Chinner <david@...morbit.com>
Cc: Matthew Wilcox <willy@...radead.org>,
	Ryan Roberts <ryan.roberts@....com>, chandan.babu@...cle.com,
	djwong@...nel.org, brauner@...nel.org, akpm@...ux-foundation.org,
	linux-kernel@...r.kernel.org, yang@...amperecomputing.com,
	linux-mm@...ck.org, john.g.garry@...cle.com,
	linux-fsdevel@...r.kernel.org, hare@...e.de, p.raghav@...sung.com,
	mcgrof@...nel.org, gost.dev@...sung.com, cl@...amperecomputing.com,
	linux-xfs@...r.kernel.org, hch@....de, Zi Yan <zi.yan@...t.com>
Subject: Re: [PATCH v8 01/10] fs: Allow fine-grained control of folio sizes

> > I suggest you handle it better than this.  If the device is asking for a
> > blocksize > PMD_SIZE, you should fail to mount it.
> 
> That's my point: we already do that.
> 
> The largest block size we support is 64kB and that's way smaller
> than PMD_SIZE on all platforms and we always check for bs > ps 
> support at mount time when the filesystem bs > ps.
> 
> Hence we're never going to set the min value to anything unsupported
> unless someone makes a massive programming mistake. At which point,
> we want a *hard, immediate fail* so the developer notices their
> mistake immediately. All filesystems and block devices need to
> behave this way so the limits should be encoded as asserts in the
> function to trigger such behaviour.

I agree, this kind of bug will be encountered only during developement 
and not during actual production due to the limit we have fs block size
in XFS.

> 
> > If the device is
> > asking for a blocksize > PAGE_SIZE and CONFIG_TRANSPARENT_HUGEPAGE is
> > not set, you should also decline to mount the filesystem.
> 
> What does CONFIG_TRANSPARENT_HUGEPAGE have to do with filesystems
> being able to use large folios?
> 
> If that's an actual dependency of using large folios, then we're at
> the point where the mm side of large folios needs to be divorced
> from CONFIG_TRANSPARENT_HUGEPAGE and always supported.
> Alternatively, CONFIG_TRANSPARENT_HUGEPAGE needs to selected by the
> block layer and also every filesystem that wants to support
> sector/blocks sizes larger than PAGE_SIZE.  IOWs, large folio
> support needs to *always* be enabled on systems that say
> CONFIG_BLOCK=y.

Why CONFIG_BLOCK? I think it is enough if it comes from the FS side
right? And for now, the only FS that needs that sort of bs > ps 
guarantee is XFS with this series. Other filesystems such as bcachefs 
that call mapping_set_large_folios() only enable it as an optimization
and it is not needed for the filesystem to function.

So this is my conclusion from the conversation:
- Add a dependency in Kconfig on THP for XFS until we fix the dependency
  of large folios on THP
- Add a BUILD_BUG_ON(XFS_MAX_BLOCKSIZE > MAX_PAGECACHE_ORDER)
- Add a WARN_ON_ONCE() and clamp the min and max value in
  mapping_set_folio_order_range() ?

Let me know what you all think @willy, @dave and @ryan.

--
Pankaj