lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Zod3ZQizBL7MyWEA@dread.disaster.area>
Date: Fri, 5 Jul 2024 14:32:37 +1000
From: Dave Chinner <david@...morbit.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: Ryan Roberts <ryan.roberts@....com>,
	"Pankaj Raghav (Samsung)" <kernel@...kajraghav.com>,
	chandan.babu@...cle.com, djwong@...nel.org, brauner@...nel.org,
	akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
	yang@...amperecomputing.com, linux-mm@...ck.org,
	john.g.garry@...cle.com, linux-fsdevel@...r.kernel.org,
	hare@...e.de, p.raghav@...sung.com, mcgrof@...nel.org,
	gost.dev@...sung.com, cl@...amperecomputing.com,
	linux-xfs@...r.kernel.org, hch@....de, Zi Yan <zi.yan@...t.com>
Subject: Re: [PATCH v8 01/10] fs: Allow fine-grained control of folio sizes

On Fri, Jul 05, 2024 at 12:56:28AM +0100, Matthew Wilcox wrote:
> On Fri, Jul 05, 2024 at 08:06:51AM +1000, Dave Chinner wrote:
> > > > It seems strange to silently clamp these? Presumably for the bs>ps usecase,
> > > > whatever values are passed in are a hard requirement? So wouldn't want them to
> > > > be silently reduced. (Especially given the recent change to reduce the size of
> > > > MAX_PAGECACHE_ORDER to less then PMD size in some cases).
> > > 
> > > Hm, yes.  We should probably make this return an errno.  Including
> > > returning an errno for !IS_ENABLED() and min > 0.
> > 
> > What are callers supposed to do with an error? In the case of
> > setting up a newly allocated inode in XFS, the error would be
> > returned in the middle of a transaction and so this failure would
> > result in a filesystem shutdown.
> 
> I suggest you handle it better than this.  If the device is asking for a
> blocksize > PMD_SIZE, you should fail to mount it.

That's my point: we already do that.

The largest block size we support is 64kB and that's way smaller
than PMD_SIZE on all platforms and we always check for bs > ps 
support at mount time when the filesystem bs > ps.

Hence we're never going to set the min value to anything unsupported
unless someone makes a massive programming mistake. At which point,
we want a *hard, immediate fail* so the developer notices their
mistake immediately. All filesystems and block devices need to
behave this way so the limits should be encoded as asserts in the
function to trigger such behaviour.

> If the device is
> asking for a blocksize > PAGE_SIZE and CONFIG_TRANSPARENT_HUGEPAGE is
> not set, you should also decline to mount the filesystem.

What does CONFIG_TRANSPARENT_HUGEPAGE have to do with filesystems
being able to use large folios?

If that's an actual dependency of using large folios, then we're at
the point where the mm side of large folios needs to be divorced
from CONFIG_TRANSPARENT_HUGEPAGE and always supported.
Alternatively, CONFIG_TRANSPARENT_HUGEPAGE needs to selected by the
block layer and also every filesystem that wants to support
sector/blocks sizes larger than PAGE_SIZE.  IOWs, large folio
support needs to *always* be enabled on systems that say
CONFIG_BLOCK=y.

I'd much prefer the former occurs, because making the block layer
and filesystems dependent on an mm feature they don't actually use
is kinda weird...

-Dave.

-- 
Dave Chinner
david@...morbit.com

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ