linux-kernel - Re: [PATCH v8 01/10] fs: Allow fine-grained control of folio sizes

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <5875f5ea-4d83-4691-914c-15834338410e@arm.com>
Date: Tue, 9 Jul 2024 09:11:51 +0100
From: Ryan Roberts <ryan.roberts@....com>
To: Dave Chinner <david@...morbit.com>
Cc: "Pankaj Raghav (Samsung)" <kernel@...kajraghav.com>,
 Matthew Wilcox <willy@...radead.org>, chandan.babu@...cle.com,
 djwong@...nel.org, brauner@...nel.org, akpm@...ux-foundation.org,
 linux-kernel@...r.kernel.org, yang@...amperecomputing.com,
 linux-mm@...ck.org, john.g.garry@...cle.com, linux-fsdevel@...r.kernel.org,
 hare@...e.de, p.raghav@...sung.com, mcgrof@...nel.org, gost.dev@...sung.com,
 cl@...amperecomputing.com, linux-xfs@...r.kernel.org, hch@....de,
 Zi Yan <zi.yan@...t.com>
Subject: Re: [PATCH v8 01/10] fs: Allow fine-grained control of folio sizes

On 09/07/2024 00:01, Dave Chinner wrote:
> On Fri, Jul 05, 2024 at 02:31:08PM +0100, Ryan Roberts wrote:
>> On 05/07/2024 14:24, Pankaj Raghav (Samsung) wrote:
>>>>> I suggest you handle it better than this.  If the device is asking for a
>>>>> blocksize > PMD_SIZE, you should fail to mount it.
>>>>
>>>> That's my point: we already do that.
>>>>
>>>> The largest block size we support is 64kB and that's way smaller
>>>> than PMD_SIZE on all platforms and we always check for bs > ps 
>>>> support at mount time when the filesystem bs > ps.
>>>>
>>>> Hence we're never going to set the min value to anything unsupported
>>>> unless someone makes a massive programming mistake. At which point,
>>>> we want a *hard, immediate fail* so the developer notices their
>>>> mistake immediately. All filesystems and block devices need to
>>>> behave this way so the limits should be encoded as asserts in the
>>>> function to trigger such behaviour.
>>>
>>> I agree, this kind of bug will be encountered only during developement 
>>> and not during actual production due to the limit we have fs block size
>>> in XFS.
>>>
>>>>
>>>>> If the device is
>>>>> asking for a blocksize > PAGE_SIZE and CONFIG_TRANSPARENT_HUGEPAGE is
>>>>> not set, you should also decline to mount the filesystem.
>>>>
>>>> What does CONFIG_TRANSPARENT_HUGEPAGE have to do with filesystems
>>>> being able to use large folios?
>>>>
>>>> If that's an actual dependency of using large folios, then we're at
>>>> the point where the mm side of large folios needs to be divorced
>>>> from CONFIG_TRANSPARENT_HUGEPAGE and always supported.
>>>> Alternatively, CONFIG_TRANSPARENT_HUGEPAGE needs to selected by the
>>>> block layer and also every filesystem that wants to support
>>>> sector/blocks sizes larger than PAGE_SIZE.  IOWs, large folio
>>>> support needs to *always* be enabled on systems that say
>>>> CONFIG_BLOCK=y.
>>>
>>> Why CONFIG_BLOCK? I think it is enough if it comes from the FS side
>>> right? And for now, the only FS that needs that sort of bs > ps 
>>> guarantee is XFS with this series. Other filesystems such as bcachefs 
>>> that call mapping_set_large_folios() only enable it as an optimization
>>> and it is not needed for the filesystem to function.
>>>
>>> So this is my conclusion from the conversation:
>>> - Add a dependency in Kconfig on THP for XFS until we fix the dependency
>>>   of large folios on THP
>>
>> THP isn't supported on some arches, so isn't this effectively saying XFS can no
>> longer be used with those arches, even if the bs <= ps?
> 
> I'm good with that - we're already long past the point where we try
> to support XFS on every linux platform. Indeed, we've recent been
> musing about making XFS depend on 64 bit only - 32 bit systems don't
> have the memory capacity to run the full xfs tool chain (e.g.
> xfs_repair) on filesystems over about a TB in size, and they are
> greatly limited in kernel memory and vmap areas, both of which XFS
> makes heavy use of. Basically, friends don't let friends use XFS on
> 32 bit systems, and that's been true for about 20 years now.
> 
> Our problem is the test matrix - if we now have to explicitly test
> XFS both with and without large folios enabled to support these
> platforms, we've just doubled our test matrix. The test matrix is
> already far too large to robustly cover, so anything that requires
> doubling the number of kernel configs we have to test is, IMO, a
> non-starter.
> 
> That's why we really don't support XFS on 32 bit systems anymore and
> why we're talking about making that official with a config option.
> If we're at the point where XFS will now depend on large folios (i.e
> THP), then we need to seriously consider reducing the supported
> arches to just those that support both 64 bit and THP. If niche
> arches want to support THP, or enable large folios without the need
> for THP, then they can do that work and then they get XFS for
> free.
> 
> Just because an arch might run a Linux kernel, it doesn't mean we
> have to support XFS on it....

OK. I was just pointing out the impact of adding this Kconfig dependency. If
that impact is explicitly considered and desired, then great. I'll leave you to it.

Thanks,
Ryan

> 
> -Dave.