Message-ID: <d7e1e877-80ad-48ce-b11e-2c60e951ec8b@intel.com>
Date: Wed, 31 Jan 2024 18:43:38 +0800
From: "Yin, Fengwei" <fengwei.yin@...el.com>
To: David Hildenbrand <david@...hat.com>, <linux-kernel@...r.kernel.org>
CC: <linux-mm@...ck.org>, Andrew Morton <akpm@...ux-foundation.org>, "Matthew
 Wilcox" <willy@...radead.org>, Ryan Roberts <ryan.roberts@....com>, "Catalin
 Marinas" <catalin.marinas@....com>, Will Deacon <will@...nel.org>, "Aneesh
 Kumar K.V" <aneesh.kumar@...ux.ibm.com>, Nick Piggin <npiggin@...il.com>,
	Peter Zijlstra <peterz@...radead.org>, Michael Ellerman <mpe@...erman.id.au>,
	Christophe Leroy <christophe.leroy@...roup.eu>, "Naveen N. Rao"
	<naveen.n.rao@...ux.ibm.com>, Heiko Carstens <hca@...ux.ibm.com>, "Vasily
 Gorbik" <gor@...ux.ibm.com>, Alexander Gordeev <agordeev@...ux.ibm.com>,
	Christian Borntraeger <borntraeger@...ux.ibm.com>, Sven Schnelle
	<svens@...ux.ibm.com>, Arnd Bergmann <arnd@...db.de>,
	<linux-arch@...r.kernel.org>, <linuxppc-dev@...ts.ozlabs.org>,
	<linux-s390@...r.kernel.org>
Subject: Re: [PATCH v1 9/9] mm/memory: optimize unmap/zap with PTE-mapped THP



On 1/31/2024 6:30 PM, David Hildenbrand wrote:
> On 31.01.24 03:30, Yin Fengwei wrote:
>>
>>
>> On 1/29/24 22:32, David Hildenbrand wrote:
>>> +static inline pte_t get_and_clear_full_ptes(struct mm_struct *mm,
>>> +        unsigned long addr, pte_t *ptep, unsigned int nr, int full)
>>> +{
>>> +    pte_t pte, tmp_pte;
>>> +
>>> +    pte = ptep_get_and_clear_full(mm, addr, ptep, full);
>>> +    while (--nr) {
>>> +        ptep++;
>>> +        addr += PAGE_SIZE;
>>> +        tmp_pte = ptep_get_and_clear_full(mm, addr, ptep, full);
>>> +        if (pte_dirty(tmp_pte))
>>> +            pte = pte_mkdirty(pte);
>>> +        if (pte_young(tmp_pte))
>>> +            pte = pte_mkyoung(pte);
>> I am wondering whether it's worth moving the pte_mkdirty() and
>> pte_mkyoung() calls out of the loop and just doing them once if needed.
>> The worst case is that they are called nr - 1 times. Or is that just
>> too micro?
> 
> I also thought about just indicating "any_accessed" or "any_dirty" using 
> flags to the caller, to avoid the PTE modifications completely. Felt a 
> bit micro-optimized.
> 
> Regarding your proposal: I thought about that as well, but my assumption 
> was that dirty+young are "cheap" to be set.
> 
> On x86, pte_mkyoung() is setting _PAGE_ACCESSED.
> pte_mkdirty() is setting _PAGE_DIRTY | _PAGE_SOFT_DIRTY, but it also has 
> to handle the saveddirty handling, using some bit trickery.
> 
> So at least for pte_mkyoung() there would be no real benefit as far as I 
> can see (might be even worse). For pte_mkdirty() there might be a small 
> benefit.
> 
> Is it going to be measurable? Likely not.
Yeah. We can investigate further if performance profiling calls this
out.
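
For reference, the two variants discussed above can be sketched in user
space roughly as follows. This is only an illustration under stated
assumptions, not the kernel code: mock_pte_t, the MOCK_PAGE_* bits and all
mock_* helpers are hypothetical stand-ins for pte_t, the x86 flag bits and
ptep_get_and_clear_full(), and the soft-dirty/saveddirty handling that the
real pte_mkdirty() performs is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for pte_t and the x86 accessed/dirty bits. */
typedef struct { uint64_t val; } mock_pte_t;

#define MOCK_PAGE_ACCESSED (1ULL << 5)
#define MOCK_PAGE_DIRTY    (1ULL << 6)

static bool mock_pte_dirty(mock_pte_t pte) { return pte.val & MOCK_PAGE_DIRTY; }
static bool mock_pte_young(mock_pte_t pte) { return pte.val & MOCK_PAGE_ACCESSED; }

static mock_pte_t mock_pte_mkdirty(mock_pte_t pte)
{
    pte.val |= MOCK_PAGE_DIRTY;
    return pte;
}

static mock_pte_t mock_pte_mkyoung(mock_pte_t pte)
{
    pte.val |= MOCK_PAGE_ACCESSED;
    return pte;
}

/* Return the old entry and clear it, loosely mimicking a get-and-clear. */
static mock_pte_t mock_get_and_clear(mock_pte_t *ptep)
{
    mock_pte_t old = *ptep;
    ptep->val = 0;
    return old;
}

/* Variant A: as in the patch, set the bits inside the loop. */
static mock_pte_t clear_ptes_in_loop(mock_pte_t *ptep, unsigned int nr)
{
    assert(nr >= 1);
    mock_pte_t pte = mock_get_and_clear(ptep);

    while (--nr) {
        ptep++;
        mock_pte_t tmp = mock_get_and_clear(ptep);
        if (mock_pte_dirty(tmp))
            pte = mock_pte_mkdirty(pte);
        if (mock_pte_young(tmp))
            pte = mock_pte_mkyoung(pte);
    }
    return pte;
}

/* Variant B: only record flags in the loop, set each bit at most once. */
static mock_pte_t clear_ptes_hoisted(mock_pte_t *ptep, unsigned int nr)
{
    assert(nr >= 1);
    mock_pte_t pte = mock_get_and_clear(ptep);
    bool any_dirty = false, any_young = false;

    while (--nr) {
        ptep++;
        mock_pte_t tmp = mock_get_and_clear(ptep);
        any_dirty |= mock_pte_dirty(tmp);
        any_young |= mock_pte_young(tmp);
    }
    if (any_dirty)
        pte = mock_pte_mkdirty(pte);
    if (any_young)
        pte = mock_pte_mkyoung(pte);
    return pte;
}
```

Both variants return the same value for the same input. Variant B swaps up
to nr - 1 bit-setting calls for two boolean accumulations per iteration, so
whether it wins at all depends on how cheap the real helpers are on a given
architecture.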


Regards
Yin, Fengwei

> 
> Am I missing something?
> 
> Thanks!
> 
