[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wiwt4LC-VmqvYrphraF0=yQV=CQimDCb0XhtXwk8oKCCA@mail.gmail.com>
Date: Sun, 30 Oct 2022 18:47:23 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Nadav Amit <nadav.amit@...il.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Jann Horn <jannh@...gle.com>,
John Hubbard <jhubbard@...dia.com>, X86 ML <x86@...nel.org>,
Matthew Wilcox <willy@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
kernel list <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>,
Andrea Arcangeli <aarcange@...hat.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
jroedel@...e.de, ubizjak@...il.com,
Alistair Popple <apopple@...dia.com>
Subject: Re: [PATCH 01/13] mm: Update ptep_get_lockless()s comment
On Sun, Oct 30, 2022 at 3:47 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> Anyway, this simplification patch basically means that the *next* step
> could be to just move that ipage_zap_pte_rmap()' after the TLB flush,
> and now it's trivial and no longer scary.
So here's that next patch: it's patch 4/4 in this series.
Patches 1-3 are the patches that I already sent out as one (smaller)
combined patch. I'm including them here as the whole series in case
somebody else wants to follow along with how I did the simplified
version of page_remove_rmap().
So if you already looked at the previous patch and don't have any
questions about that one, you could just apply PATCH 4/4 on top of
that one.
Or you can do the whole series with commit messages and (hopefully)
clearer individual steps to the simplified version of
page_remove_rmap().
I haven't actually tested 4/4 yet, so this is yet another of my "I
think this should work" patches.
The reason I haven't actually tested it is partly because I never
recreated the original problem Navav reported, and partly because the
meat of patch 4/4 is just the same "encode an extra flag bit in the
low bit of the page pointer" that I _did_ test, just doing the "remove
rmap" instead of "set dirty".
In other words, I *think* this should make Nadav's test-case happy,
and avoid the warning he saw.
If somebody wants a git branch, I guess I can push that too, but it
would be a non-stable branch only for testing.
Also, it's worth noting that zap_pte_range() does this sanity test:
if (unlikely(page_mapcount(page) < 0))
print_bad_pte(vma, addr, ptent, page);
and that is likely worthless now (because it hasn't actually
decremented the mapcount yet). I didn't remove it, because I wasn't
sure which option was best:
(a) just remove it entirely
(b) change the "< 0" to "<= 0"
(c) move it to clean_and_free_pages_and_swap_cache() that actually
does the page_zap_pte_rmap() now.
so I'm in no way claiming this series is any kind of final word, but I
do mostly like how the code ended up looking.
Now, this whole "remove rmap after flush" would probably make sense
for some of the largepage cases too, and there's room for another bit
in there (or more, if you align 'struct page') and that whole code
could use page size hints etc too. But I suspect that the main case we
really care is for individual small pages, just because that's the
case where I think it would be much worse to do any fancy TLB
tracking. The largepage cases presumably aren't as critical, since
there by definition is fewer of those.
Comments?
Linus
View attachment "0001-mm-introduce-simplified-versions-of-page_remove_rmap.patch" of type "text/x-patch" (5434 bytes)
View attachment "0002-mm-inline-simpler-case-of-page_remove_file_rmap.patch" of type "text/x-patch" (1914 bytes)
View attachment "0003-mm-re-unify-the-simplified-page_zap_-_rmap-function.patch" of type "text/x-patch" (4013 bytes)
View attachment "0004-mm-delay-rmap-removal-until-after-TLB-flush.patch" of type "text/x-patch" (8374 bytes)
Powered by blists - more mailing lists