Message-ID: <57864A6F.6070202@sr71.net>
Date: Wed, 13 Jul 2016 07:04:31 -0700
From: Dave Hansen <dave@...1.net>
To: Vlastimil Babka <vbabka@...e.cz>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
linux-kernel@...r.kernel.org
Cc: x86@...nel.org, linux-mm@...ck.org, torvalds@...ux-foundation.org,
akpm@...ux-foundation.org, bp@...en8.de, ak@...ux.intel.com,
mhocko@...e.com
Subject: Re: [PATCH 0/4] [RFC][v4] Workaround for Xeon Phi PTE A/D bits
erratum
On 07/13/2016 04:37 AM, Vlastimil Babka wrote:
> On 07/02/2016 12:28 AM, Benjamin Herrenschmidt wrote:
>> With the errata, don't you have a situation where a processor in
>> the second category will write and set D despite P having been
>> cleared (due to the race) and thus causing us to miss the transfer
>> of that D to the struct page and essentially completely miss that
>> the physical page is dirty?
>
> Seems to me like this is indeed possible, but...
No, this isn't possible with the erratum.
I had some off-list follow-up with Ben, and included this description in
the later posting of the patch:
> These bits are truly "stray". In the case of the Dirty bit, the
> thread associated with the stray set was *not* allowed to write to
> the page. This means that we do not have to launder the bit(s); we
> can simply ignore them.
>> (Leading to memory corruption).
>
> ... what memory corruption, exactly?
In this (non-existent) scenario, we would lose writes to mmap()'d files
because we did not see the dirty bit during the "get" part of
ptep_get_and_clear().
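
To make the "get" part concrete, here is a minimal userspace sketch of the
idea, not the kernel's implementation. All model_* names are hypothetical,
the bit positions are assumptions of the model, and a C11 atomic exchange
stands in for the real primitive. The point is only that the Dirty bit is
sampled in the same atomic step that empties the entry, and is then carried
over to the page:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout for this model only. */
#define MODEL_PTE_PRESENT ((uint64_t)1 << 0)
#define MODEL_PTE_DIRTY   ((uint64_t)1 << 6)

/* Hypothetical stand-in for the dirty state kept in struct page. */
struct model_page {
    bool dirty;
};

/* Atomically fetch the old entry and leave the slot empty. */
static uint64_t model_ptep_get_and_clear(_Atomic uint64_t *ptep)
{
    return atomic_exchange(ptep, 0);
}

/* Tear down one mapping, transferring Dirty from the PTE to the page. */
static void model_zap_pte(_Atomic uint64_t *ptep, struct model_page *page)
{
    uint64_t old = model_ptep_get_and_clear(ptep);

    /* Only a Dirty bit seen on a present entry counts as a real write. */
    if ((old & MODEL_PTE_PRESENT) && (old & MODEL_PTE_DIRTY))
        page->dirty = true;
}
```

If a Dirty bit that reflected a real write could show up after the exchange,
the page would never be marked dirty, and that is the lost-write case
described above.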
> If a process is writing to its
> memory from one thread and unmapping it from other thread at the same
> time, there are no guarantees anyway?
It's not just unmapping, it's also swap, NUMA migration, etc... We
clear the PTE, flush, then re-populate it.
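
As a rough illustration of that clear/flush/re-populate pattern, here is a
userspace sketch. The sketch_* names are made up for this example, the flush
is just a counter, and the bit positions are assumptions; it is not how any
of the real paths are written. It also shows why stray bits can be ignored:
whatever lands in the entry while it is non-present is simply overwritten.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout for this sketch only. */
#define SKETCH_PTE_PRESENT ((uint64_t)1 << 0)
#define SKETCH_PTE_DIRTY   ((uint64_t)1 << 6)

/* Hypothetical stand-in for a TLB flush: just counts invocations. */
static int sketch_flush_count;

static void sketch_flush_tlb(void)
{
    sketch_flush_count++;
}

/*
 * Change a PTE in three steps and report whether the old mapping
 * was dirty. Bits found on a non-present entry are treated as stray.
 */
static bool sketch_change_pte(_Atomic uint64_t *ptep, uint64_t new_pte)
{
    /* 1. Clear: fetch the old value and empty the entry atomically. */
    uint64_t old = atomic_exchange(ptep, 0);
    bool was_dirty = (old & SKETCH_PTE_PRESENT) &&
                     (old & SKETCH_PTE_DIRTY);

    /* 2. Flush: no CPU may keep using the old translation. */
    sketch_flush_tlb();

    /* 3. Re-populate: overwrites anything set while non-present. */
    atomic_store(ptep, new_pte);

    return was_dirty;
}
```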
> Would anything sensible rely on
> the guarantee that if the write in such racy scenario didn't end up as a
> segfault (i.e. unmapping was faster), then it must hit the disk? Or are
> there any other scenarios where zap_pte_range() is called? Hmm, but how
> does this affect the page migration scenario, can we lose the D bit there?
Yeah, it's not just zap_pte_range(), it's everywhere that we change a
present PTE.
> And maybe a related thing that just occurred to me, what if a page is made
> non-writable during fork() to catch COW? Any race in that one, or just
> the P bit? But maybe the argument would be the same as above...
Yeah, the argument is the same.