Message-ID: <0a689e9f-082b-497d-a32b-afc3feddcdb8@redhat.com>
Date: Wed, 30 Jul 2025 11:30:20 +0200
From: David Hildenbrand <david@...hat.com>
To: Baolin Wang <baolin.wang@...ux.alibaba.com>, akpm@...ux-foundation.org,
 hughd@...gle.com
Cc: willy@...radead.org, lorenzo.stoakes@...cle.com, ziy@...dia.com,
 Liam.Howlett@...cle.com, npache@...hat.com, ryan.roberts@....com,
 dev.jain@....com, baohua@...nel.org, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] mm: shmem: fix the strategy for the tmpfs 'huge='
 options

On 30.07.25 10:14, Baolin Wang wrote:
> After commit acd7ccb284b8 ("mm: shmem: add large folio support for tmpfs"),
> we have extended tmpfs to allow any sized large folios, rather than just
> PMD-sized large folios.
> 
> The strategy discussed previously was:
> 
> "
> Considering that tmpfs already has the 'huge=' option to control the
> PMD-sized large folios allocation, we can extend the 'huge=' option to
> allow any sized large folios.  The semantics of the 'huge=' mount option
> are:
> 
>      huge=never: no any sized large folios
>      huge=always: any sized large folios
>      huge=within_size: like 'always' but respect the i_size
>      huge=advise: like 'always' if requested with madvise()
> 
> Note: for tmpfs mmap() faults, due to the lack of a write size hint, still
> allocate the PMD-sized huge folios if huge=always/within_size/advise is
> set.
> 
> Moreover, the 'deny' and 'force' testing options controlled by
> '/sys/kernel/mm/transparent_hugepage/shmem_enabled', still retain the same
> semantics.  The 'deny' can disable any sized large folios for tmpfs, while
> the 'force' can enable PMD sized large folios for tmpfs.
> "
> 
> This means that when tmpfs is mounted with 'huge=always' or 'huge=within_size',
> tmpfs will allow getting a highest order hint based on the size of write() and
> fallocate() paths. It will then try each allowable large order, rather than
> continually attempting to allocate PMD-sized large folios as before.
> 
> However, this might break some user scenarios for those who want to use
> PMD-sized large folios, such as the i915 driver which did not supply a write
> size hint when allocating shmem [1].
> 
> Moreover, Hugh also complained that this will cause a regression in userspace
> with 'huge=always' or 'huge=within_size'.
> 
> So, let's revisit the strategy for tmpfs large page allocation. A simple fix
> would be to always try PMD-sized large folios first, and if that fails, fall
> back to smaller large folios. However, this approach differs from the strategy
> for large folio allocation used by other file systems. Is this acceptable?

My opinion so far has been that anon and shmem differ from ordinary
filesystems ... primarily because allocation (readahead) and reclaim
(writeback) behave differently.

There were opinions in the past that tmpfs should just behave like any 
other fs, and I think that's what we tried to satisfy here: use the 
write size as an indication.

I assume each approach will benefit some workloads. I also assume that
some workloads on ordinary filesystems could benefit from the same
strategy (start with PMD), while others clearly will not.

So no real opinion, none of it feels ideal ... at least with this
approach we would stick more closely to the old tmpfs behavior.
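
To make the two strategies concrete, here is a rough userspace sketch,
not kernel code. It assumes 4 KiB base pages and an order-9 PMD (as on
x86-64). Every name in it (order_from_write_size, first_order_to_try,
alloc_with_fallback, only_up_to_order_4) is hypothetical and chosen for
illustration only; the real shmem code considers more than the write
length when picking an order.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12   /* assumption: 4 KiB base pages */
#define PMD_ORDER  9    /* assumption: 2 MiB PMD, as on x86-64 */

enum strategy {
	HINT_FROM_WRITE_SIZE,   /* start at the order suggested by the write length */
	PMD_FIRST,              /* always start at PMD order, then fall back */
};

/* Highest order whose folio size does not exceed write_len, capped at PMD_ORDER. */
static int order_from_write_size(size_t write_len)
{
	size_t pages = write_len >> PAGE_SHIFT;
	int order = 0;

	while (order < PMD_ORDER && (pages >> (order + 1)) != 0)
		order++;
	return order;
}

/* The only point where the two strategies differ: the first order attempted. */
static int first_order_to_try(enum strategy s, size_t write_len)
{
	return s == PMD_FIRST ? PMD_ORDER : order_from_write_size(write_len);
}

/*
 * Walk down from the starting order until the (hypothetical) try_alloc
 * callback reports success. Returns the order obtained; 0 means a base page.
 */
static int alloc_with_fallback(enum strategy s, size_t write_len,
			       int (*try_alloc)(int order))
{
	int order;

	assert(try_alloc != NULL);
	for (order = first_order_to_try(s, write_len); order > 0; order--)
		if (try_alloc(order))
			return order;
	return 0;
}

/* Hypothetical stand-in for a fragmented system: nothing above order 4 succeeds. */
static int only_up_to_order_4(int order)
{
	return order <= 4;
}
```

With a single 4 KiB write and the stand-in allocator above, PMD_FIRST
walks down from order 9 and ends up with an order-4 folio, while
HINT_FROM_WRITE_SIZE starts at order 0 and gets a base page. That is
the behavioral difference under discussion: PMD-first can allocate far
more memory than the write asked for, whereas the hint-based approach
never goes above what the write size suggests.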

-- 
Cheers,

David / dhildenb

