linux-kernel - Re: [PATCH v2 2/5] mm: avoid unnecessary flush on change_huge

Message-ID: <4f604380-a52b-660c-af82-541dbd7652e4@intel.com>
Date:   Tue, 26 Oct 2021 12:40:03 -0700
From:   Dave Hansen <dave.hansen@...el.com>
To:     Nadav Amit <nadav.amit@...il.com>
Cc:     Linux-MM <linux-mm@...ck.org>, LKML <linux-kernel@...r.kernel.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Andrew Cooper <andrew.cooper3@...rix.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Peter Xu <peterx@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Will Deacon <will@...nel.org>, Yu Zhao <yuzhao@...gle.com>,
        Nick Piggin <npiggin@...il.com>,
        "x86@...nel.org" <x86@...nel.org>
Subject: Re: [PATCH v2 2/5] mm: avoid unnecessary flush on change_huge_pmd()

On 10/26/21 12:06 PM, Nadav Amit wrote:
> 
> To make it very clear - consider the following scenario, in which
> a volatile pointer p is mapped using a certain PTE, which is RW
> (i.e., *p is writable):
> 
>   CPU0				CPU1
>   ----				----
>   x = *p
>   [ PTE cached in TLB; 
>     PTE is not dirty ]
> 				clear_pte(PTE)
>   *p = x
>   [ needs to set dirty ]
> 
> Note that there is no TLB flush in this scenario. The question
> is whether the write access to *p would succeed, setting the
> dirty bit on the clear, non-present entry.
> 
> I was under the impression that the hardware AD-assist would
> recheck the PTE atomically as it sets the dirty bit. But, as I
> said, I am not sure anymore whether this is defined architecturally
> (or at least would work in practice on all CPUs modulo the 
> Knights Landing thingy).

Practically, at "x=*p", the thing that gets cached in the TLB will have
Dirty=0.  At the "*p=x", the CPU will decide it needs to do a write,
find the Dirty=0 entry and entirely discard it.  In other words, it
*acts* roughly like this:

	x = *p				
	INVLPG(p)
	*p = x;

Where the INVLPG() and the "*p=x" are atomic.  So, there's no
_practical_ problem with your scenario.  This specific behavior isn't
architectural as far as I know, though.
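
The model above can be sketched as a toy user-space C simulation.  This
is only an illustration of the practical behavior described here, not
kernel code and not an architectural guarantee.  `struct toy_mmu` and the
`toy_*` helpers are hypothetical names; only the PTE bit positions come
from the x86 page-table format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* x86 PTE flag bits. */
#define PTE_PRESENT  0x001u
#define PTE_RW       0x002u
#define PTE_ACCESSED 0x020u
#define PTE_DIRTY    0x040u

/* Toy model (hypothetical): one in-memory PTE and one cached TLB copy. */
struct toy_mmu {
	uint64_t pte;       /* the page-table entry in memory */
	uint64_t tlb_entry; /* cached copy of the PTE */
	bool     tlb_valid; /* is the cached copy usable? */
};

/* Page walk: re-read the PTE, set A (and D on writes), fill the TLB.
 * Returns false (a fault) if the PTE is non-present or not writable. */
bool toy_walk(struct toy_mmu *m, bool is_write)
{
	assert(m != NULL);
	if (!(m->pte & PTE_PRESENT))
		return false;
	if (is_write && !(m->pte & PTE_RW))
		return false;
	m->pte |= PTE_ACCESSED;
	if (is_write)
		m->pte |= PTE_DIRTY;
	m->tlb_entry = m->pte;
	m->tlb_valid = true;
	return true;
}

/* "x = *p": a cached entry is enough for a read. */
bool toy_read(struct toy_mmu *m)
{
	if (m->tlb_valid)
		return true;
	return toy_walk(m, false);
}

/* "*p = x": a cached Dirty=0 entry is discarded and the walk is redone,
 * which acts like INVLPG(p) followed by the write. */
bool toy_write(struct toy_mmu *m)
{
	if (m->tlb_valid && (m->tlb_entry & PTE_RW) &&
	    (m->tlb_entry & PTE_DIRTY))
		return true;
	m->tlb_valid = false;
	return toy_walk(m, true);
}

/* clear_pte(PTE) on the other CPU; deliberately no TLB flush. */
void toy_clear_pte(struct toy_mmu *m)
{
	m->pte = 0;
}
```

Replaying the scenario: starting from a present, writable PTE,
toy_read() caches a Dirty=0 entry.  After toy_clear_pte(), toy_write()
discards the cached entry, re-walks, and returns false (a fault) without
setting any bit in the cleared PTE.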

Although it's pretty much just academic, as for the architecture, are
you getting hung up on the difference between the description of "Accessed":

	Whenever the processor uses a paging-structure entry as part of
	linear-address translation, it sets the accessed flag in that
	entry

and "Dirty":

	Whenever there is a write to a linear address, the processor
	sets the dirty flag (if it is not already set) in the paging-
	structure entry...

Accessed says "as part of linear-address translation", which means that
the address must have a translation.  But, the "Dirty" section doesn't
say that.  It talks about "a write to a linear address" but not whether
there is a linear address *translation* involved.

If that's it, we could probably add a bit like:

	In addition to setting the accessed flag, whenever there is a
	write...

before the dirty rules in the SDM.
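
Read that way, the two rules nest: no flag is set without a translation,
and a write sets both flags.  A rough C sketch of that reading follows.
update_ad_flags() is a hypothetical helper for illustration only, not an
operation the SDM defines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* x86 PTE flag bits. */
#define PTE_PRESENT  0x001u
#define PTE_RW       0x002u
#define PTE_ACCESSED 0x020u
#define PTE_DIRTY    0x040u

/* Hypothetical model of the suggested wording: flags are only updated
 * when the entry is actually used for a translation, and a write sets
 * Dirty *in addition to* Accessed.  Returns false if there is no usable
 * translation, in which case the entry is left untouched. */
bool update_ad_flags(uint64_t *pte, bool is_write)
{
	assert(pte != NULL);
	if (!(*pte & PTE_PRESENT))
		return false;	/* no translation: no A, no D */
	if (is_write && !(*pte & PTE_RW))
		return false;	/* write not permitted: no A, no D */
	*pte |= PTE_ACCESSED;
	if (is_write)
		*pte |= PTE_DIRTY;
	return true;
}
```

Under this reading, a write to a cleared (non-present) entry can never
leave a stray dirty bit behind, because the dirty update is part of the
same translation that would set the accessed flag.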

Or am I being dense and continuing to miss your point? :)