lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <cf9adefc-8508-49a4-a7f0-784e345c5d80@redhat.com>
Date: Wed, 31 Jan 2024 11:21:45 +0100
From: David Hildenbrand <david@...hat.com>
To: Ryan Roberts <ryan.roberts@....com>, linux-kernel@...r.kernel.org
Cc: linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
 Matthew Wilcox <willy@...radead.org>,
 Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
 "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
 Nick Piggin <npiggin@...il.com>, Peter Zijlstra <peterz@...radead.org>,
 Michael Ellerman <mpe@...erman.id.au>,
 Christophe Leroy <christophe.leroy@...roup.eu>,
 "Naveen N. Rao" <naveen.n.rao@...ux.ibm.com>,
 Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
 Alexander Gordeev <agordeev@...ux.ibm.com>,
 Christian Borntraeger <borntraeger@...ux.ibm.com>,
 Sven Schnelle <svens@...ux.ibm.com>, Arnd Bergmann <arnd@...db.de>,
 linux-arch@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
 linux-s390@...r.kernel.org
Subject: Re: [PATCH v1 9/9] mm/memory: optimize unmap/zap with PTE-mapped THP


>> +
>> +#ifndef clear_full_ptes
>> +/**
>> + * clear_full_ptes - Clear PTEs that map consecutive pages of the same folio.
> 
> I know its implied from "pages of the same folio" (and even more so for the
> above variant due to mention of access/dirty), but I wonder if its useful to
> explicitly state that "all ptes being cleared are present at the time of the call"?

"Clear PTEs" -> "Clear present PTEs" ?

That should make it clearer.

[...]

>>   	if (!delay_rmap) {
>> -		folio_remove_rmap_pte(folio, page, vma);
>> +		folio_remove_rmap_ptes(folio, page, nr, vma);
>> +
>> +		/* Only sanity-check the first page in a batch. */
>>   		if (unlikely(page_mapcount(page) < 0))
>>   			print_bad_pte(vma, addr, ptent, page);
> 
> Is there a case for either removing this all together or moving it into
> folio_remove_rmap_ptes()? It seems odd to only check some pages.
> 

I really wanted to avoid another nasty loop here.

In my thinking, for 4k folios, or when zapping subpages of large folios, 
we still perform the exact same checks. Only when batching we don't. So 
if there is some problem, there are ways to get it triggered. And these 
problems are barely ever seen.

folio_remove_rmap_ptes() feels like the better place -- especially 
because the delayed-rmap handling is effectively unchecked. But in 
there, we cannot "print_bad_pte()".

[background: if we had a total mapcount -- iow cheap folio_mapcount(), 
I'd check here that the total mapcount does not underflow, instead of 
checking per-subpage]

> 
>>   	}
>> -	if (unlikely(__tlb_remove_page(tlb, page, delay_rmap))) {
>> +	if (unlikely(__tlb_remove_folio_pages(tlb, page, nr, delay_rmap))) {
>>   		*force_flush = true;
>>   		*force_break = true;
>>   	}
>>   }
>>   
>> -static inline void zap_present_pte(struct mmu_gather *tlb,
>> +/*
>> + * Zap or skip one present PTE, trying to batch-process subsequent PTEs that map
> 
> Zap or skip *at least* one... ?

Ack

-- 
Cheers,

David / dhildenb


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ