[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wjvoTRSb87R-D50yOXqX4mshjiiAyurAKCsdW0_J+sf7A@mail.gmail.com>
Date: Fri, 17 Dec 2021 15:20:24 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: David Hildenbrand <david@...hat.com>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Hugh Dickins <hughd@...gle.com>,
David Rientjes <rientjes@...gle.com>,
Shakeel Butt <shakeelb@...gle.com>,
John Hubbard <jhubbard@...dia.com>,
Jason Gunthorpe <jgg@...dia.com>,
Mike Kravetz <mike.kravetz@...cle.com>,
Mike Rapoport <rppt@...ux.ibm.com>,
Yang Shi <shy828301@...il.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
Matthew Wilcox <willy@...radead.org>,
Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>,
Michal Hocko <mhocko@...nel.org>,
Nadav Amit <namit@...are.com>, Rik van Riel <riel@...riel.com>,
Roman Gushchin <guro@...com>,
Andrea Arcangeli <aarcange@...hat.com>,
Peter Xu <peterx@...hat.com>,
Donald Dutile <ddutile@...hat.com>,
Christoph Hellwig <hch@....de>,
Oleg Nesterov <oleg@...hat.com>, Jan Kara <jack@...e.cz>,
Linux-MM <linux-mm@...ck.org>,
"open list:KERNEL SELFTEST FRAMEWORK"
<linux-kselftest@...r.kernel.org>,
"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>
Subject: Re: [PATCH v1 06/11] mm: support GUP-triggered unsharing via
FAULT_FLAG_UNSHARE (!hugetlb)
On Fri, Dec 17, 2021 at 2:43 PM David Hildenbrand <david@...hat.com> wrote:
>
> The pages stay PageAnon(). swap-backed pages simply set a bit IIRC.
> mapcount still applies.
Our code-base is too large for me to remember all the details, but if
we still end up having PageAnon for swapbacked pages, then mapcount
can increase from another process faulting in an pte with that swap
entry.
And mmap_sem doesn't protect against that. Again, page_lock() does.
And taking the page lock was a big performance issue.
One of the reasons that new COW handling is so nice is that you can do
things like
if (!trylock_page(page))
goto copy;
exactly because in the a/b world order, the copy case is always safe.
In your model, as far as I can tell, you leave the page read-only and
a subsequent COW fault _can_ happen, which means that now the
subsequent COW needs to b every very careful, because if it ever
copies a page that was GUP'ed, you just broke the rules.
So COWing too much is a bug (because it breaks the page from the GUP),
but COWing too little is an even worse problem (because it measn that
now the GUP user can see data it shouldn't have seen).
Our old code literally COWed too little. It's why all those changes
happened in the first place.
This is why I'm pushing that whole story line of
(1) COW is based purely on refcounting, because that's the only thing
that obviously can never COW too little.
(2) GUP pre-COWs (the thing I called the "(a)" rule earlier) and then
makes sure to not mark pinned pages COW again (that "(b)" rule).
and here "don't use page_mapcount()" really is about that (1).
You do seem to have kept (1) in that your COW rules don't seem to
change (but maybe I missed it), but because your GUP-vs-COW semantics
are very different indeed, I'm not at all convinced about (2).
Linus
Powered by blists - more mailing lists