Message-ID: <20211218030509.GA1432915@nvidia.com>
Date: Fri, 17 Dec 2021 23:05:09 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Nadav Amit <namit@...are.com>,
David Hildenbrand <david@...hat.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Hugh Dickins <hughd@...gle.com>,
David Rientjes <rientjes@...gle.com>,
Shakeel Butt <shakeelb@...gle.com>,
John Hubbard <jhubbard@...dia.com>,
Mike Kravetz <mike.kravetz@...cle.com>,
Mike Rapoport <rppt@...ux.ibm.com>,
Yang Shi <shy828301@...il.com>,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
Matthew Wilcox <willy@...radead.org>,
Vlastimil Babka <vbabka@...e.cz>, Jann Horn <jannh@...gle.com>,
Michal Hocko <mhocko@...nel.org>,
Rik van Riel <riel@...riel.com>,
Roman Gushchin <guro@...com>,
Andrea Arcangeli <aarcange@...hat.com>,
Peter Xu <peterx@...hat.com>,
Donald Dutile <ddutile@...hat.com>,
Christoph Hellwig <hch@....de>,
Oleg Nesterov <oleg@...hat.com>, Jan Kara <jack@...e.cz>,
Linux-MM <linux-mm@...ck.org>,
"open list:KERNEL SELFTEST FRAMEWORK"
<linux-kselftest@...r.kernel.org>,
"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>
Subject: Re: [PATCH v1 06/11] mm: support GUP-triggered unsharing via
FAULT_FLAG_UNSHARE (!hugetlb)
On Fri, Dec 17, 2021 at 05:53:45PM -0800, Linus Torvalds wrote:
> But honestly, at least for the second case, if somebody does a GUP,
> and then starts playing mprotect games on the same virtual memory area
> that they did a GUP on, and are surprised when they get another COW
> fault that breaks their own connection with a page they did a GUP on
> earlier, that's their own fault.
I've been told there are real workloads that do this.
Something like qemu will use GUP with VFIO to insert PCI devices into
the guest and GUP with RDMA to do fast network copy of VM memory
during VM migration.
qemu also uses the WP games to implement dirty tracking of VM memory
during migration (and more? I'm not sure). It expects that during all
of this nothing will COW the pages, as the two kinds of DMA must
always go to the pages mapped to KVM.
The big trouble here is that this all worked before, so it is a
userspace-visible regression.
Can this be made to work at all? I wonder: if qemu uses MAP_SHARED, e.g.
via a memfd or something, does the COW then go away naturally?
Jason