Date: Mon, 1 Feb 2021 05:58:13 +0000
From: Nadav Amit <namit@...are.com>
To: Andrew Cooper <andrew.cooper3@...rix.com>
CC: Andy Lutomirski <luto@...nel.org>, Linux-MM <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>, Will Deacon <will@...nel.org>,
	Yu Zhao <yuzhao@...gle.com>, Nick Piggin <npiggin@...il.com>,
	X86 ML <x86@...nel.org>, Andy Lutomirski <luto@...capital.net>
Subject: Re: [RFC 03/20] mm/mprotect: do not flush on permission promotion

> On Jan 31, 2021, at 4:10 AM, Andrew Cooper <andrew.cooper3@...rix.com> wrote:
>
> On 31/01/2021 01:07, Andy Lutomirski wrote:
>> Adding Andrew Cooper, who has a distressingly extensive understanding
>> of the x86 PTE magic.
>
> Pretty sure it is all learning things the hard way...
>
>> On Sat, Jan 30, 2021 at 4:16 PM Nadav Amit <nadav.amit@...il.com> wrote:
>>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>>> index 632d5a677d3f..b7473d2c9a1f 100644
>>> --- a/mm/mprotect.c
>>> +++ b/mm/mprotect.c
>>> @@ -139,7 +139,8 @@ static unsigned long change_pte_range(struct mmu_gather *tlb,
>>>                                 ptent = pte_mkwrite(ptent);
>>>                         }
>>>                         ptep_modify_prot_commit(vma, addr, pte, oldpte, ptent);
>>> -                       tlb_flush_pte_range(tlb, addr, PAGE_SIZE);
>>> +                       if (pte_may_need_flush(oldpte, ptent))
>>> +                               tlb_flush_pte_range(tlb, addr, PAGE_SIZE);
>
> You're choosing to avoid the flush, based on A/D bits read ahead of the
> actual modification of the PTE.
>
> In this example, another thread can write into the range (sets A and D),
> and get a suitable TLB entry which goes unflushed while the rest of the
> kernel thinks the memory is write-protected and clean.
>
> The only safe way to do this is to use XCHG/etc to modify the PTE, and
> base flush calculations on the results.  Atomic operations are ordered
> with A/D updates from pagewalks on other CPUs, even on AMD where A
> updates are explicitly not ordered with regular memory reads, for
> performance reasons.

Thanks Andrew for the feedback, but I think the patch does it exactly in
this safe manner that you describe (at least on native x86, but I see a
similar path elsewhere as well):

	oldpte = ptep_modify_prot_start()
	-> __ptep_modify_prot_start()
	-> ptep_get_and_clear
	-> native_ptep_get_and_clear()
	-> xchg()

Note that the xchg() will clear the PTE (i.e., making it non-present), and
no further updates of A/D are possible until ptep_modify_prot_commit() is
called.

On non-SMP setups this is not atomic (no xchg), but since we hold the lock,
we should be safe.

I guess you are right and a pte_may_need_flush() deserves a comment to
clarify that oldpte must be obtained by an atomic operation to ensure no A/D
bits are lost (as you say).

Yet, I do not see a correctness problem. Am I missing something?
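
The ordering argument above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: every `model_*` name is hypothetical, the PTE is modelled as a plain C11 atomic 64-bit word, the hardware page walker is modelled as a compare-and-swap that only succeeds on a present entry, and `model_may_need_flush()` is a simplified stand-in policy, not the `pte_may_need_flush()` from the patch. Only the bit positions (P=0, R/W=1, A=5, D=6) are the architectural x86 ones.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 PTE bit positions. */
#define PTE_PRESENT  (UINT64_C(1) << 0)
#define PTE_RW       (UINT64_C(1) << 1)
#define PTE_ACCESSED (UINT64_C(1) << 5)
#define PTE_DIRTY    (UINT64_C(1) << 6)

/* Hypothetical model of one page-table slot. */
typedef _Atomic uint64_t pte_slot_t;

/* Model of a page walk on another CPU: sets A (and D if writable),
 * but only while the entry is present. Returns false otherwise. */
static bool model_hw_set_ad(pte_slot_t *slot)
{
    uint64_t cur = atomic_load(slot);

    while (cur & PTE_PRESENT) {
        uint64_t bits = PTE_ACCESSED | ((cur & PTE_RW) ? PTE_DIRTY : 0);

        /* On failure, cur is refreshed and the loop re-checks presence. */
        if (atomic_compare_exchange_weak(slot, &cur, cur | bits))
            return true;
    }
    return false;
}

/* Model of the xchg() step: atomically snapshot and clear the entry.
 * The returned value contains every A/D update made before the clear. */
static uint64_t model_modify_prot_start(pte_slot_t *slot)
{
    return atomic_exchange(slot, 0);
}

/* Model of the commit step: publish the new value into the cleared slot. */
static void model_modify_prot_commit(pte_slot_t *slot, uint64_t newpte)
{
    assert(atomic_load(slot) == 0); /* slot must still be non-present */
    atomic_store(slot, newpte);
}

/* Simplified, hypothetical policy: flush only if the old entry was
 * present and the new one drops P, R/W or D. Adding R/W alone
 * (a permission promotion) drops nothing, so no flush is requested. */
static bool model_may_need_flush(uint64_t oldpte, uint64_t newpte)
{
    uint64_t dropped = oldpte & ~newpte;

    if (!(oldpte & PTE_PRESENT))
        return false;
    return (dropped & (PTE_PRESENT | PTE_RW | PTE_DIRTY)) != 0;
}
```

In this model the flush decision is computed from the value returned by `model_modify_prot_start()`, never from an earlier plain read, so a racing A/D update either lands before the exchange (and is visible in the snapshot) or finds the slot non-present and fails.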