[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wgKft6E_EeLA1GnEXcQBA9vu8m2B-M-U7PuiNa0+9gpHA@mail.gmail.com>
Date: Fri, 17 Dec 2021 11:22:47 -0800
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: David Hildenbrand <david@...hat.com>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Hugh Dickins <hughd@...gle.com>,
David Rientjes <rientjes@...gle.com>,
Shakeel Butt <shakeelb@...gle.com>,
John Hubbard <jhubbard@...dia.com>,
Jason Gunthorpe <jgg@...dia.com>,
Mike Kravetz <mike.kravetz@...cle.com>,
Mike Rapoport <rppt@...ux.ibm.com>,
Yang Shi <shy828301@...il.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
Matthew Wilcox <willy@...radead.org>,
Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>,
Michal Hocko <mhocko@...nel.org>,
Nadav Amit <namit@...are.com>, Rik van Riel <riel@...riel.com>,
Roman Gushchin <guro@...com>,
Andrea Arcangeli <aarcange@...hat.com>,
Peter Xu <peterx@...hat.com>,
Donald Dutile <ddutile@...hat.com>,
Christoph Hellwig <hch@....de>,
Oleg Nesterov <oleg@...hat.com>, Jan Kara <jack@...e.cz>,
Linux-MM <linux-mm@...ck.org>,
"open list:KERNEL SELFTEST FRAMEWORK"
<linux-kselftest@...r.kernel.org>,
"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>
Subject: Re: [PATCH v1 06/11] mm: support GUP-triggered unsharing via
FAULT_FLAG_UNSHARE (!hugetlb)
On Fri, Dec 17, 2021 at 11:04 AM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> If we are doing a COW, we need an *exclusive* access to the page. That
> is not mapcount, that is the page ref.
>
> mapcount is insane, and I think this is making this worse again.
Maybe I'm misreading this, but afaik
- get a "readonly" copy of a local private page using FAULT_FLAG_UNSHARE.
This just increments the page count, because mapcount == 1.
- fork()
- unmap in the original
- child now has "mapcount == 1" on a page again, but refcount is
elevated, and child HAS TO COW before writing.
Notice? "mapcount" is complete BS. The number of times a page is
mapped is irrelevant for COW. All that matters is that we get an
exclusive access to the page before we can write to it.
Anybody who takes mapcount into account at COW time is broken, and it
worries me how this is all mixing up with the COW logic.
Now, maybe this "unshare" case is sufficiently different from COW that
it's ok to look at mapcount for FAULT_FLAG_UNSHARE, as long as it
doesn't happen for a real COW.
But honestly, for "unshare", I still don't see that the mapcount
matters. What does "mapcount == 1" mean? Why is it meaningful?
Because if COW does things right, and always breaks a COW based on
refcount, then what's the problem with taking a read-only ref to the
page whether it is mapped multiple times or mapped just once? Anybody
who already had write access to the page can write to it regardless,
and any new writers go through COW and get a new page.
I must be missing something realyl fundamental here, but to me it
really reads like "mapcount can fundamentally never be relevant for
COW, and if it's not relevant for COW, how can it be relevant for a
read-only copy?"
Linus
Powered by blists - more mailing lists