Message-ID: <6fe2dd45-d634-facc-bb8a-3052ab463abb@redhat.com>
Date: Thu, 3 Aug 2023 16:21:41 +0200
From: David Hildenbrand <david@...hat.com>
To: Ryan Roberts <ryan.roberts@....com>,
Andrew Morton <akpm@...ux-foundation.org>,
Matthew Wilcox <willy@...radead.org>,
Yin Fengwei <fengwei.yin@...el.com>,
Yu Zhao <yuzhao@...gle.com>, Yang Shi <shy828301@...il.com>,
"Huang, Ying" <ying.huang@...el.com>, Zi Yan <ziy@...dia.com>,
Nathan Chancellor <nathan@...nel.org>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
Gerald Schaefer <gerald.schaefer@...ux.ibm.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v4 3/3] mm: Batch-zap large anonymous folio PTE mappings
On 03.08.23 16:15, Ryan Roberts wrote:
> On 03/08/2023 15:10, David Hildenbrand wrote:
>>>>
>>>> With this patch, you might suddenly have mapcount > refcount for a folio, or
>>>> am I wrong?
>>>
>>> Yes you would. Does that break things?
>>>
>>
>> It is problematic whenever you want to check for additional page references that
>> are not from mappings (i.e., GUP refs/pins or anything else)
>>
>> One example lives in KSM code (!compound only):
>>
>> page_mapcount(page) + 1 + swapped != page_count(page)
>>
>> Another one in compaction code:
>>
>> if (!mapping && (folio_ref_count(folio) - 1) > folio_mapcount(folio))
>>
>> And another one in khugepaged (is_refcount_suitable)
>>
>> ... and in THP split can_split_folio() (although that can deal with false
>> positives and false negatives).
>>
>>
>> We want to avoid detecting "no other references" if there *are* other
>> references. Detecting "there are other references" even though there are
>> none is usually better.
>>
>>
>> Assume you have mapcount > refcount for some time due to concurrent unmapping,
>> AND some unrelated reference. You would suddenly pass these checks (mapcount ==
>> refcount) and might not detect other references.
>
> OK. I'll rework with the 2 loop approach, assuming I can calculate the number of
> free slots in the mmu_gather ahead of time.
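To make the false-negative scenario quoted above concrete, here is a
minimal sketch (made-up numbers and a hypothetical helper in plain C,
not the actual KSM/compaction/khugepaged code) of how a transient
mapcount > refcount turns into a missed reference:

#include <stdbool.h>

/*
 * Hypothetical helper (illustration only, not a kernel API): decide
 * whether a folio has references beyond its page table mappings, in
 * the spirit of the checks quoted above.
 */
static bool has_extra_references(int refcount, int mapcount)
{
	/* every mapping is expected to hold exactly one reference */
	return refcount > mapcount;
}

/*
 * Steady state, 4 PTE mappings + 1 GUP pin:
 *     has_extra_references(5, 4) == true   -> pin detected
 *
 * Transient state while a batched unmap has dropped one reference but
 * not yet the corresponding mapcount:
 *     has_extra_references(4, 4) == false  -> pin missed (false negative)
 */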
Note that I think some of these checks might be racy in corner cases
(and we should most probably make them more reliable by enforcing
memory ordering -- especially a single atomic total_mapcount might help).
But for now, at least the idea that seems to work is:
a) When you map a page, you first increment the refcount, then the
   mapcount.
b) When you unmap a page, you first decrement the mapcount, then the
   refcount.
So refcount >= mapcount should in theory always hold when taking a
snapshot of both values, although the actual sites that check for
these additional references might deserve some fine-tuning.
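As a toy model of that ordering (userspace C11 atomics and a made-up
struct, purely for illustration -- not how the kernel implements it):

#include <stdatomic.h>

/* made-up stand-in for a folio's two counters */
struct folio_counts {
	atomic_int refcount;
	atomic_int mapcount;
};

static void map_one(struct folio_counts *f)
{
	/* a) take the reference first, then account the mapping it backs */
	atomic_fetch_add(&f->refcount, 1);
	atomic_fetch_add(&f->mapcount, 1);
}

static void unmap_one(struct folio_counts *f)
{
	/* b) drop the mapping first, then the reference that backed it */
	atomic_fetch_sub(&f->mapcount, 1);
	atomic_fetch_sub(&f->refcount, 1);
}

With that ordering, every mapcount increment is preceded by a refcount
increment and every refcount decrement is preceded by a mapcount
decrement, so a snapshot of both counters should never observe
mapcount > refcount -- modulo the memory-ordering caveats mentioned
above, which is exactly where the fine-tuning would come in.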
--
Cheers,
David / dhildenb