lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <92def216-ca9c-402d-8643-226592ca1a85@redhat.com>
Date: Mon, 1 Sep 2025 09:52:16 +0200
From: David Hildenbrand <david@...hat.com>
To: Hugh Dickins <hughd@...gle.com>, Matthew Wilcox <willy@...radead.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>, Will Deacon <will@...nel.org>,
 Shivank Garg <shivankg@....com>, Christoph Hellwig <hch@...radead.org>,
 Keir Fraser <keirf@...gle.com>, Jason Gunthorpe <jgg@...pe.ca>,
 John Hubbard <jhubbard@...dia.com>, Frederick Mayle <fmayle@...gle.com>,
 Peter Xu <peterx@...hat.com>, "Aneesh Kumar K.V" <aneesh.kumar@...nel.org>,
 Johannes Weiner <hannes@...xchg.org>, Vlastimil Babka <vbabka@...e.cz>,
 Alexander Krabler <Alexander.Krabler@...a.com>, Ge Yang
 <yangge1116@....com>, Li Zhe <lizhe.67@...edance.com>,
 Chris Li <chrisl@...nel.org>, Yu Zhao <yuzhao@...gle.com>,
 Axel Rasmussen <axelrasmussen@...gle.com>, Yuanchu Xie <yuanchu@...gle.com>,
 Wei Xu <weixugc@...gle.com>, Konstantin Khlebnikov <koct9i@...il.com>,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 1/7] mm: fix folio_expected_ref_count() when PG_private_2

On 01.09.25 03:17, Hugh Dickins wrote:
> On Mon, 1 Sep 2025, Matthew Wilcox wrote:
>> On Sun, Aug 31, 2025 at 02:01:16AM -0700, Hugh Dickins wrote:
>>> 6.16's folio_expected_ref_count() is forgetting the PG_private_2 flag,
>>> which (like PG_private, but not in addition to PG_private) counts for
>>> 1 more reference: it needs to be using folio_has_private() in place of
>>> folio_test_private().
>>
>> No, it doesn't.  I know it used to, but no filesystem was actually doing
>> that.  So I changed mm to match how filesystems actually worked.
>> I'm not sure if there's still documentation lying around that gets
>> this wrong or if you're remembering how things used to be documented,
>> but it's never how any filesystem has ever worked.
>>
>> We're achingly close to getting rid of PG_private_2.  I think it's just
>> ceph and nfs that still use it.
> 
> I knew you were trying to get rid of it (hurrah! thank you), so when I
> tried porting my lru_add_drainage to 6.12 I was careful to check whether
> folio_expected_ref_count() would need to add it to the accounting there:
> apparently yes; but then I was surprised to find that it's still present
> in 6.17-rc, I'd assumed it gone long ago.
> 
> I didn't try to read the filesystems (which could easily have been
> inconsistent about it) to understand: what convinced me amidst all
> the confusion was this comment and code in mm/filemap.c:
> 
> /**
>   * folio_end_private_2 - Clear PG_private_2 and wake any waiters.
>   * @folio: The folio.
>   *
>   * Clear the PG_private_2 bit on a folio and wake up any sleepers waiting for
>   * it.  The folio reference held for PG_private_2 being set is released.
>   *
>   * This is, for example, used when a netfs folio is being written to a local
>   * disk cache, thereby allowing writes to the cache for the same folio to be
>   * serialised.
>   */
> void folio_end_private_2(struct folio *folio)
> {
> 	VM_BUG_ON_FOLIO(!folio_test_private_2(folio), folio);
> 	clear_bit_unlock(PG_private_2, folio_flags(folio, 0));
> 	folio_wake_bit(folio, PG_private_2);
> 	folio_put(folio);
> }
> EXPORT_SYMBOL(folio_end_private_2);
> 
> That seems to be clear that PG_private_2 is matched by a folio reference,
> but perhaps you can explain it away - worth changing the comment if so.
> 
> I was also anxious to work out whether PG_private with PG_private_2
> would mean +1 or +2: I don't think I found any decisive statement,
> but traditional use of page_has_private() implied +1; and I expect
> there's no filesystem which actually could have both on the same folio.

I think it's "+1", like we used to have.

I was seriously confused when discovering (iow, concerned about false 
positives):

	PG_fscache = PG_private_2,

But in the end PG_fscache is only used in comments and e.g., 
__fscache_clear_page_bits() calls folio_end_private_2(). So both are 
really just aliases.

[Either PG_fscache should be dropped and referred to as PG_private_2, or 
PG_private_2 should be dropped and PG_fscache used instead. It's even 
inconsistently used in that fscache. file.

Or both should be dropped, of course, once we can actually get rid of it 
...]

So PG_private_2 should not be used for any other purpose.

folio_start_private_2() / folio_end_private_2() indeed pair the flag 
with a reference. There are no other callers that would set/clear the 
flag without involving a reference.

The usage of private_2 is declared deprecated all over the place. So the 
question is if we really still care.

The ceph usage is guarded by CONFIG_CEPH_FSCACHE, the NFS one by 
NFS_FSCACHE, nothing really seems to prevent it from getting configured 
in easily.

Now, one problem would be if migration / splitting / ... code where we 
use folio_expected_ref_count() cannot deal with that additional 
reference properly, in which case this patch would indeed cause harm.

If all folio_expected_ref_count() callers can deal with updating that 
reference, all good.

nfs_migrate_folio(), for example, has folio_test_private_2() handling in 
there (just wait until it is gone). ceph handles it during 
ceph_writepages_start(), but uses ordinary filemap_migrate_folio.

Long story short: this patch is problematic if one 
folio_expected_ref_count() users is not aware of how to handle that 
additional reference.

-- 
Cheers

David / dhildenb


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ