Message-ID: <d83309fa-4daa-430f-ae52-4e72162bca9a@redhat.com>
Date: Wed, 31 Jan 2024 11:30:56 +0100
From: David Hildenbrand <david@...hat.com>
To: Yin Fengwei <fengwei.yin@...el.com>, linux-kernel@...r.kernel.org
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
Matthew Wilcox <willy@...radead.org>, Ryan Roberts <ryan.roberts@....com>,
Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
"Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
Nick Piggin <npiggin@...il.com>, Peter Zijlstra <peterz@...radead.org>,
Michael Ellerman <mpe@...erman.id.au>,
Christophe Leroy <christophe.leroy@...roup.eu>,
"Naveen N. Rao" <naveen.n.rao@...ux.ibm.com>,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>, Arnd Bergmann <arnd@...db.de>,
linux-arch@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
linux-s390@...r.kernel.org
Subject: Re: [PATCH v1 9/9] mm/memory: optimize unmap/zap with PTE-mapped THP
On 31.01.24 03:30, Yin Fengwei wrote:
>
>
> On 1/29/24 22:32, David Hildenbrand wrote:
>> +static inline pte_t get_and_clear_full_ptes(struct mm_struct *mm,
>> + unsigned long addr, pte_t *ptep, unsigned int nr, int full)
>> +{
>> + pte_t pte, tmp_pte;
>> +
>> + pte = ptep_get_and_clear_full(mm, addr, ptep, full);
>> + while (--nr) {
>> + ptep++;
>> + addr += PAGE_SIZE;
>> + tmp_pte = ptep_get_and_clear_full(mm, addr, ptep, full);
>> + if (pte_dirty(tmp_pte))
>> + pte = pte_mkdirty(pte);
>> + if (pte_young(tmp_pte))
>> + pte = pte_mkyoung(pte);
> I am wondering whether it's worth moving the pte_mkdirty() and pte_mkyoung()
> out of the loop and just doing them once if needed. In the worst case they
> are called nr - 1 times. Or is that just too micro?
I also thought about just indicating "any_accessed" or "any_dirty" to the
caller via flags, to avoid the PTE modifications completely. That felt a
bit too micro-optimized.
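For illustration only, the flags idea could look roughly like the userspace sketch below. Everything in it is an assumption made for the example: the one-word pte_t model, the MODEL_PTE_* bit positions and the function name clear_ptes_report_flags() are hypothetical and not taken from the patch or from any architecture.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: NOT the kernel's pte_t or any real bit layout. */
typedef struct { unsigned long val; } pte_t;

#define MODEL_PTE_ACCESSED (1UL << 5)	/* hypothetical bit position */
#define MODEL_PTE_DIRTY    (1UL << 6)	/* hypothetical bit position */

/*
 * Clear nr consecutive entries and return the first one unmodified.
 * Instead of folding accessed/dirty into the returned PTE, report them
 * through out-parameters. In this sketch the flags cover all nr entries,
 * including the first.
 */
static pte_t clear_ptes_report_flags(pte_t *ptep, unsigned int nr,
				     bool *any_accessed, bool *any_dirty)
{
	pte_t first, cur;
	unsigned int i;

	assert(nr > 0);
	first = ptep[0];
	*any_accessed = false;
	*any_dirty = false;

	for (i = 0; i < nr; i++) {
		cur = ptep[i];
		ptep[i].val = 0;	/* stand-in for ptep_get_and_clear_full() */
		if (cur.val & MODEL_PTE_ACCESSED)
			*any_accessed = true;
		if (cur.val & MODEL_PTE_DIRTY)
			*any_dirty = true;
	}
	return first;
}
```

The caller would then consume the two booleans directly rather than calling pte_young()/pte_dirty() on the returned value.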
Regarding your proposal: I thought about that as well, but my assumption
was that dirty+young are "cheap" to be set.
On x86, pte_mkyoung() is setting _PAGE_ACCESSED.
pte_mkdirty() is setting _PAGE_DIRTY | _PAGE_SOFT_DIRTY, but it also has
to deal with the saveddirty case, using some bit trickery.
So at least for pte_mkyoung() there would be no real benefit as far as I
can see (it might even be worse). For pte_mkdirty() there might be a small
benefit.
Is it going to be measurable? Likely not.
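To make the comparison concrete, here is a minimal userspace sketch of the two variants under discussion: folding dirty/young into the result inside the loop (as in the patch) versus tracking two booleans and applying pte_mkdirty()/pte_mkyoung() once after the loop. The pte_t model, the MODEL_PTE_* bit positions, model_get_and_clear() and both clear_ptes_*() names are hypothetical stand-ins, not kernel code, and the model ignores the x86 saveddirty handling entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: NOT the kernel's pte_t or any real bit layout. */
typedef struct { unsigned long val; } pte_t;

#define MODEL_PTE_ACCESSED (1UL << 5)	/* hypothetical bit position */
#define MODEL_PTE_DIRTY    (1UL << 6)	/* hypothetical bit position */

static inline bool pte_young(pte_t pte) { return pte.val & MODEL_PTE_ACCESSED; }
static inline bool pte_dirty(pte_t pte) { return pte.val & MODEL_PTE_DIRTY; }

static inline pte_t pte_mkyoung(pte_t pte)
{
	pte.val |= MODEL_PTE_ACCESSED;
	return pte;
}

static inline pte_t pte_mkdirty(pte_t pte)
{
	pte.val |= MODEL_PTE_DIRTY;
	return pte;
}

/* Stand-in for ptep_get_and_clear_full(): return old entry, clear the slot. */
static inline pte_t model_get_and_clear(pte_t *ptep)
{
	pte_t old = *ptep;

	ptep->val = 0;
	return old;
}

/* Shape of the patch: up to nr - 1 mkdirty/mkyoung calls inside the loop. */
static pte_t clear_ptes_in_loop(pte_t *ptep, unsigned int nr)
{
	pte_t pte, tmp_pte;

	assert(nr > 0);		/* while (--nr) would wrap for nr == 0 */
	pte = model_get_and_clear(ptep);
	while (--nr) {
		ptep++;
		tmp_pte = model_get_and_clear(ptep);
		if (pte_dirty(tmp_pte))
			pte = pte_mkdirty(pte);
		if (pte_young(tmp_pte))
			pte = pte_mkyoung(pte);
	}
	return pte;
}

/* Suggested alternative: remember the bits, modify the PTE at most once each. */
static pte_t clear_ptes_hoisted(pte_t *ptep, unsigned int nr)
{
	bool any_dirty = false, any_young = false;
	pte_t pte, tmp_pte;

	assert(nr > 0);
	pte = model_get_and_clear(ptep);
	while (--nr) {
		ptep++;
		tmp_pte = model_get_and_clear(ptep);
		any_dirty |= pte_dirty(tmp_pte);
		any_young |= pte_young(tmp_pte);
	}
	if (any_dirty)
		pte = pte_mkdirty(pte);
	if (any_young)
		pte = pte_mkyoung(pte);
	return pte;
}
```

In this model both variants return the same value for the same input, so any difference would be purely in how often the helpers run, which is the cost question raised above.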
Am I missing something?
Thanks!
--
Cheers,
David / dhildenb