Message-ID: <Yv08XTRv3I5zY4M5@xz-m1.local>
Date: Wed, 17 Aug 2022 15:07:09 -0400
From: Peter Xu <peterx@...hat.com>
To: Alistair Popple <apopple@...dia.com>
Cc: huang ying <huang.ying.caritas@...il.com>, linux-mm@...ck.org,
akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
"Sierra Guiza, Alejandro (Alex)" <alex.sierra@....com>,
Felix Kuehling <Felix.Kuehling@....com>,
Jason Gunthorpe <jgg@...dia.com>,
John Hubbard <jhubbard@...dia.com>,
David Hildenbrand <david@...hat.com>,
Ralph Campbell <rcampbell@...dia.com>,
Matthew Wilcox <willy@...radead.org>,
Karol Herbst <kherbst@...hat.com>,
Lyude Paul <lyude@...hat.com>, Ben Skeggs <bskeggs@...hat.com>,
Logan Gunthorpe <logang@...tatee.com>, paulus@...abs.org,
linuxppc-dev@...ts.ozlabs.org, Huang Ying <ying.huang@...el.com>,
stable@...r.kernel.org
Subject: Re: [PATCH v2 1/2] mm/migrate_device.c: Copy pte dirty bit to page
On Wed, Aug 17, 2022 at 03:41:16PM +1000, Alistair Popple wrote:
> My primary concern with batching is ensuring a CPU write after clearing
> a clean PTE but before flushing the TLB does the "right thing" (ie. faults
> if the PTE is not present).
Fair enough. I have exactly the same concern. But I think Nadav replied
very recently on this in the previous thread; quoting from him [1]:
  I keep not remembering this erratum correctly. IIRC, the erratum says
  that the access/dirty might be set, but it does not mean that a write is
  possible after the PTE is cleared (i.e., the dirty/access might be set on
  the non-present PTE, but the access itself would fail). So it is not an
  issue in this case - losing A/D would not impact correctness since the
  access should fail.
I'm not sure whether this is exactly what he means, but I really think the
hardware should behave like that; otherwise I can't see how it can go right.
Let's assume the page can still be written after the pte is cleared. Then
afaict ptep_clear_flush() is not safe either, because fundamentally it is
two operations happening in sequence: (1) ptep_get_and_clear(), and (2)
a conditional flush_tlb_page() when needed.
If a page can be written through a cached TLB entry while the pte is not
present, what if some process writes to memory between steps (1) and (2)?
AFAIU that's the same question as using raw ptep_get_and_clear() with a
batched tlb flush.
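To make the comparison concrete, here is a rough userspace sketch of the two
sequences. It is only a toy model under my own assumptions: every toy_* name
is hypothetical and none of this is kernel code. toy_get_and_clear() stands
in for step (1) and toy_flush() for step (2). The model does not decide
whether the hardware lets a write through a stale TLB entry; it only counts
how many pages sit in the "pte cleared, TLB not yet flushed" state at once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: all toy_* names are hypothetical, not kernel APIs. */
struct toy_pte { bool present; bool dirty; };
struct toy_tlb { bool valid; };   /* cached translation for one page */

/* Map every page and pretend the CPU has cached each translation. */
static void toy_map_all(struct toy_pte *ptes, struct toy_tlb *tlbs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        ptes[i].present = true;
        ptes[i].dirty = false;
        tlbs[i].valid = true;
    }
}

/* Step (1): clear the pte and return its old value. */
static struct toy_pte toy_get_and_clear(struct toy_pte *pte)
{
    struct toy_pte old = *pte;
    pte->present = false;
    pte->dirty = false;
    return old;
}

/* Step (2): drop the cached translation. */
static void toy_flush(struct toy_tlb *tlb)
{
    tlb->valid = false;
}

/* Count pages whose pte is cleared while a translation is still cached. */
static size_t toy_count_open(const struct toy_pte *ptes,
                             const struct toy_tlb *tlbs, size_t n)
{
    size_t open = 0;
    for (size_t i = 0; i < n; i++)
        if (!ptes[i].present && tlbs[i].valid)
            open++;
    return open;
}

/* Per-pte variant: clear, then flush, one page at a time.
 * Returns the peak number of simultaneously open windows. */
static size_t toy_clear_flush_each(struct toy_pte *ptes,
                                   struct toy_tlb *tlbs, size_t n)
{
    size_t peak = 0;
    for (size_t i = 0; i < n; i++) {
        struct toy_pte old = toy_get_and_clear(&ptes[i]);
        size_t open = toy_count_open(ptes, tlbs, n); /* between (1) and (2) */
        if (open > peak)
            peak = open;
        if (old.present)            /* flush only when it was mapped */
            toy_flush(&tlbs[i]);
    }
    assert(toy_count_open(ptes, tlbs, n) == 0);
    return peak;
}

/* Batched variant: clear every pte first, then flush everything at the end. */
static size_t toy_clear_then_batch_flush(struct toy_pte *ptes,
                                         struct toy_tlb *tlbs, size_t n)
{
    size_t peak = 0;
    for (size_t i = 0; i < n; i++) {
        toy_get_and_clear(&ptes[i]);
        size_t open = toy_count_open(ptes, tlbs, n);
        if (open > peak)
            peak = open;
    }
    for (size_t i = 0; i < n; i++)
        toy_flush(&tlbs[i]);
    assert(toy_count_open(ptes, tlbs, n) == 0);
    return peak;
}
```

In this model the peak is never zero for either variant: with n mapped pages
the per-pte loop has one window open at a time, while the batched one has up
to n. So the two differ in how much is exposed at once, not in whether a
window exists.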
IOW, I don't see how a tlb batched solution can be worse than using per-pte
ptep_clear_flush(). It may enlarge the race window but fundamentally
(iiuc) they're the same thing here as long as there's no atomic way to both
"clear pte and flush tlb".
[1] https://lore.kernel.org/lkml/E37036E0-566E-40C7-AD15-720CDB003227@gmail.com/
--
Peter Xu