linux-kernel - Re: [PATCH 2/7] mm/gup: check ref_count instead of lru before migration

Message-ID: <919ab2ee-6493-4415-a75b-e1a2b08c0d3e@redhat.com>
Date: Mon, 8 Sep 2025 22:17:21 +0200
From: David Hildenbrand <david@...hat.com>
To: Hugh Dickins <hughd@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, Will Deacon <will@...nel.org>,
 Shivank Garg <shivankg@....com>, Matthew Wilcox <willy@...radead.org>,
 Christoph Hellwig <hch@...radead.org>, Keir Fraser <keirf@...gle.com>,
 Jason Gunthorpe <jgg@...pe.ca>, John Hubbard <jhubbard@...dia.com>,
 Frederick Mayle <fmayle@...gle.com>, Peter Xu <peterx@...hat.com>,
 "Aneesh Kumar K.V" <aneesh.kumar@...nel.org>,
 Johannes Weiner <hannes@...xchg.org>, Vlastimil Babka <vbabka@...e.cz>,
 Alexander Krabler <Alexander.Krabler@...a.com>, Ge Yang
 <yangge1116@....com>, Li Zhe <lizhe.67@...edance.com>,
 Chris Li <chrisl@...nel.org>, Yu Zhao <yuzhao@...gle.com>,
 Axel Rasmussen <axelrasmussen@...gle.com>, Yuanchu Xie <yuanchu@...gle.com>,
 Wei Xu <weixugc@...gle.com>, Konstantin Khlebnikov <koct9i@...il.com>,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 2/7] mm/gup: check ref_count instead of lru before
 migration

On 08.09.25 21:57, Hugh Dickins wrote:
> On Mon, 8 Sep 2025, David Hildenbrand wrote:
>> On 08.09.25 12:40, Hugh Dickins wrote:
>>> On Mon, 1 Sep 2025, David Hildenbrand wrote:
>>>> On 31.08.25 11:05, Hugh Dickins wrote:
>>>>> diff --git a/mm/gup.c b/mm/gup.c
>>>>> index adffe663594d..82aec6443c0a 100644
>>>>> --- a/mm/gup.c
>>>>> +++ b/mm/gup.c
>>>>> @@ -2307,7 +2307,8 @@ static unsigned long collect_longterm_unpinnable_folios(
>>>>>  			continue;
>>>>>  		}
>>>>>
>>>>> -		if (!folio_test_lru(folio) && drain_allow) {
>>>>> +		if (drain_allow && folio_ref_count(folio) !=
>>>>> +				   folio_expected_ref_count(folio) + 1) {
>>>>>  			lru_add_drain_all();
>>>>>  			drain_allow = false;
>>>>>  		}
>>>>
>>>> In general, to the fix idea
>>>>
>>>>   Acked-by: David Hildenbrand <david@...hat.com>
>>>
>>> Thanks, but I'd better not assume that in v2, even though code the same.
>>> Will depend on how you feel about added paragraph in v2 commit message.
>>>
>>>>
>>>> But as raised in reply to patch #1, we have to be a bit careful about
>>>> including private_2 in folio_expected_ref_count() at this point.
>>>>
>>>> If we cannot include it in folio_expected_ref_count(), it's all going to be
>>>> a mess until PG_private_2 is removed for good.
>>>>
>>>> So that part still needs to be figured out.
>>>
>>> Here's that added paragraph:
>>>
>>> Note on PG_private_2: ceph and nfs are still using the deprecated
>>> PG_private_2 flag, with the aid of netfs and filemap support functions.
>>> Although it is consistently matched by an increment of folio ref_count,
>>> folio_expected_ref_count() intentionally does not recognize it, and ceph
>>> folio migration currently depends on that for PG_private_2 folios to be
>>> rejected.  New references to the deprecated flag are discouraged, so do
>>> not add it into the collect_longterm_unpinnable_folios() calculation:
>>> but longterm pinning of transiently PG_private_2 ceph and nfs folios
>>> (an uncommon case) may invoke a redundant lru_add_drain_all().
>>
>> Would we also loop forever trying to migrate these folios if they reside on
>> ZONE_MOVABLE? I would assume that is already the case, that migration will
>> always fail due to the raised reference.
> 
> Loop around forever?  That would be unfortunate (but I presume killable).
> But when I looked, it appeared that any failure of migrate_pages() there
> gets reported as -ENOMEM, which would end up as an OOM?  But you know
> mm/gup.c very much better than I do.

Yes, like I expected, we just bail out. __gup_longterm_locked() will not 
retry in that case. It's interesting that any migration failure is 
treated as -ENOMEM, but well, that's certainly material for a completely 
different discussion.
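
For readers following along, here is a rough userspace model of the
check in the quoted hunk. It is only a sketch, not kernel code:
struct model_folio, model_has_unexpected_refs() and
model_count_drains() are hypothetical names, and the two integer fields
merely stand in for folio_ref_count() and folio_expected_ref_count().

```c
/* Userspace model only: NOT kernel code. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_folio {
	int ref_count;      /* stands in for folio_ref_count() */
	int expected_refs;  /* stands in for folio_expected_ref_count() */
};

/* Mirrors the patched condition; the "+ 1" is the reference GUP holds. */
static bool model_has_unexpected_refs(const struct model_folio *f)
{
	return f->ref_count != f->expected_refs + 1;
}

/*
 * Walk the folios and count how often a drain would be requested.
 * drain_allow limits that to at most once per walk, as in the hunk.
 */
static int model_count_drains(const struct model_folio *folios, int n)
{
	bool drain_allow = true;
	int drains = 0;

	assert(folios != NULL || n == 0);

	for (int i = 0; i < n; i++) {
		if (drain_allow && model_has_unexpected_refs(&folios[i])) {
			drains++;            /* stands in for lru_add_drain_all() */
			drain_allow = false;
		}
	}
	return drains;
}
```

In this model, a folio carrying one extra reference that is not part of
the expected count (the PG_private_2 case from the v2 commit message)
has ref_count == expected_refs + 2, so it requests the single drain
even though draining cannot help it. That is the redundant
lru_add_drain_all() the added paragraph mentions.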

Thanks Hugh!

-- 
Cheers

David / dhildenb