[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200924183418.GJ79898@xz-x1>
Date: Thu, 24 Sep 2020 14:34:18 -0400
From: Peter Xu <peterx@...hat.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: John Hubbard <jhubbard@...dia.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Jan Kara <jack@...e.cz>, Michal Hocko <mhocko@...e.com>,
Kirill Tkhai <ktkhai@...tuozzo.com>,
Kirill Shutemov <kirill@...temov.name>,
Hugh Dickins <hughd@...gle.com>,
Christoph Hellwig <hch@....de>,
Andrea Arcangeli <aarcange@...hat.com>,
Oleg Nesterov <oleg@...hat.com>,
Leon Romanovsky <leonro@...dia.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Jann Horn <jannh@...gle.com>
Subject: Re: [PATCH 1/5] mm: Introduce mm_struct.has_pinned
On Thu, Sep 24, 2020 at 03:15:01PM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 24, 2020 at 01:55:31PM -0400, Peter Xu wrote:
> > On Thu, Sep 24, 2020 at 01:51:52PM -0300, Jason Gunthorpe wrote:
> > > > Regarding the solution here, I think we can also cover read-only fast-gup too
> > > > in the future - IIUC what we need to do is to make it pte_protnone() instead of
> > > > pte_wrprotect(), then in the fault handler we should identify this special
> > > > pte_protnone() against numa balancing (change_prot_numa()). I think it should
> > > > work fine too, iiuc, because I don't think we should migrate a page at all if
> > > > it's pinned for any reason...
> >
> > [1]
> >
> > >
> > > With your COW breaking patch the read only fast-gup should break the
> > > COW because of the write protect, just like for the write side. Not
> > > seeing why we need to do something more?
> >
> > Consider this sequence of a parent process managed to fork() a child:
> >
> > buf = malloc();
Sorry! I think I missed something like:
mprotect(buf, !WRITE);
Here.
> > // RDONLY gup
> > pin_user_pages(buf, !WRITE);
> > // pte of buf duplicated on both sides
> > fork();
> > mprotect(buf, WRITE);
> > *buf = 1;
> > // buf page replaced as cow triggered
> >
> > Currently when fork() we'll happily share a pinned read-only page with the
> > child by copying the pte directly.
>
> Why? This series prevents that, the page will be maybe_dma_pinned, so
> fork() will copy it.
With the extra mprotect(!WRITE), I think we'll see a !pte_write() entry. Then
it'll not go into maybe_dma_pinned() at all since cow==false.
>
> > As a summary: imho the important thing is we should not allow any kind of
> > sharing of any dma page, even it's pinned for read.
>
> Any sharing that results in COW. MAP_SHARED is fine, for instance
Oh right, MAP_SHARED is definitely special.
Thanks,
--
Peter Xu
Powered by blists - more mailing lists