Message-ID: <424115a2-a924-4c28-8027-32db6ab9278d@arm.com>
Date: Wed, 31 Jan 2024 10:31:51 +0000
From: Ryan Roberts <ryan.roberts@....com>
To: David Hildenbrand <david@...hat.com>, linux-kernel@...r.kernel.org
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
Matthew Wilcox <willy@...radead.org>,
Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
"Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
Nick Piggin <npiggin@...il.com>, Peter Zijlstra <peterz@...radead.org>,
Michael Ellerman <mpe@...erman.id.au>,
Christophe Leroy <christophe.leroy@...roup.eu>,
"Naveen N. Rao" <naveen.n.rao@...ux.ibm.com>,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>, Arnd Bergmann <arnd@...db.de>,
linux-arch@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
linux-s390@...r.kernel.org
Subject: Re: [PATCH v1 9/9] mm/memory: optimize unmap/zap with PTE-mapped THP
On 31/01/2024 10:21, David Hildenbrand wrote:
>
>>> +
>>> +#ifndef clear_full_ptes
>>> +/**
>>> + * clear_full_ptes - Clear PTEs that map consecutive pages of the same folio.
>>
>> I know it's implied from "pages of the same folio" (and even more so for the
>> above variant due to the mention of access/dirty), but I wonder if it's useful
>> to explicitly state that "all PTEs being cleared are present at the time of
>> the call"?
>
> "Clear PTEs" -> "Clear present PTEs" ?
>
> That should make it clearer.
Works for me.
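i.e. something like this? (Just sketching the kerneldoc from memory of the
patch context, so only take the first line as the actual proposal; the
parameter descriptions may not match your patch exactly):

  /**
   * clear_full_ptes - Clear present PTEs that map consecutive pages of the
   *                   same folio.
   * @mm: Address space the pages are mapped into.
   * @addr: Address the first page is mapped at.
   * @ptep: Page table pointer for the first entry.
   * @nr: Number of entries to clear.
   * @full: Whether we are clearing a full mm.
   */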
>
> [...]
>
>>> if (!delay_rmap) {
>>> - folio_remove_rmap_pte(folio, page, vma);
>>> + folio_remove_rmap_ptes(folio, page, nr, vma);
>>> +
>>> + /* Only sanity-check the first page in a batch. */
>>> if (unlikely(page_mapcount(page) < 0))
>>> print_bad_pte(vma, addr, ptent, page);
>>
>> Is there a case for either removing this altogether or moving it into
>> folio_remove_rmap_ptes()? It seems odd to only check some pages.
>>
>
> I really wanted to avoid another nasty loop here.
>
> In my thinking, for 4k folios, or when zapping subpages of large folios, we
> still perform the exact same checks; it's only when batching that we don't. So
> if there is some problem, there are still ways to get it triggered. And these
> problems are hardly ever seen.
>
> folio_remove_rmap_ptes() feels like the better place -- especially because the
> delayed-rmap handling is effectively unchecked. But in there, we cannot
> "print_bad_pte()".
>
> [background: if we had a total mapcount -- in other words, a cheap
> folio_mapcount() -- I'd check here that the total mapcount does not underflow,
> instead of checking per-subpage]
All good points... perhaps extend the comment to describe how this could be
solved in future with a cheap total_mapcount()? Or in the commit log if you
prefer?
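Something like the below, perhaps (the wording is just a suggestion):

	/*
	 * Only sanity-check the first page in a batch; checking every
	 * subpage would require another loop over the batch. If we ever
	 * get a cheap folio_mapcount() (i.e. a total mapcount), we could
	 * check for underflow of that here instead.
	 */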
>
>>
>>> }
>>> - if (unlikely(__tlb_remove_page(tlb, page, delay_rmap))) {
>>> + if (unlikely(__tlb_remove_folio_pages(tlb, page, nr, delay_rmap))) {
>>> *force_flush = true;
>>> *force_break = true;
>>> }
>>> }
>>> -static inline void zap_present_pte(struct mmu_gather *tlb,
>>> +/*
>>> + * Zap or skip one present PTE, trying to batch-process subsequent PTEs that map
>>
>> Zap or skip *at least* one... ?
>
> Ack
>
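So, for the record, the comment would then start something like this (assuming
the rest of the sentence is unchanged from your patch):

  /*
   * Zap or skip at least one present PTE, trying to batch-process subsequent
   * PTEs that map consecutive pages of the same folio.
   */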