Message-ID: <20200915151746.GB2949@xz-x1>
Date: Tue, 15 Sep 2020 11:17:46 -0400
From: Peter Xu <peterx@...hat.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Leon Romanovsky <leonro@...dia.com>,
Linux-MM <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"Maya B . Gokhale" <gokhale2@...l.gov>,
Yang Shi <yang.shi@...ux.alibaba.com>,
Marty Mcfadden <mcfadden8@...l.gov>,
Kirill Shutemov <kirill@...temov.name>,
Oleg Nesterov <oleg@...hat.com>, Jann Horn <jannh@...gle.com>,
Jan Kara <jack@...e.cz>, Kirill Tkhai <ktkhai@...tuozzo.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Christoph Hellwig <hch@....de>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 1/4] mm: Trial do_wp_page() simplification
On Tue, Sep 15, 2020 at 10:50:40AM -0400, Peter Xu wrote:
> Hi, all,
>
> I prepared another version of the FOLL_PIN enforced cow patch attached, just in
> case it would still be anything close to useful (though now I highly doubt it
> considering below...). I took care of !USERFAULTFD as suggested by Leon, and
> also the fast gup path.
Now with the patch attached (for real..).
>
> However...
>
> On Mon, Sep 14, 2020 at 08:28:51PM -0300, Jason Gunthorpe wrote:
> > Yes, this stuff does pin_user_pages_fast() and MADV_DONTFORK
> > together. It sets FOLL_FORCE and FOLL_WRITE to get an exclusive copy
> > of the page and MADV_DONTFORK was needed to ensure that a future fork
> > doesn't establish a COW that would break the DMA by moving the
> > physical page over to the fork. DMA should stay with the process that
> > called pin_user_pages_fast() (Is MADV_DONTFORK still needed with
> > recent years work to GUP/etc? It is a pretty terrible ancient thing)
>
> ... Now I'm more confused about what has happened.
>
> If we're using FORCE|WRITE, iiuc it should guarantee that the page will trigger
> COW during gup even if it is shared, so no problem on the gup side. Then I'm
> quite confused about why the write bit is not set when COW is triggered.
>
> E.g., in wp_page_copy(), if I'm not wrong, the write bit is only controlled by
> (besides the fix patch, though I believe the rdma test should have nothing to
> do with uffd-wp after all so it should be the same anyways):
>
> entry = maybe_mkwrite(pte_mkdirty(entry), vma);
>
> It means, as long as the rdma region has VM_WRITE set (and I can think of no
> reason why it shouldn't...), then it should have the write bit in the COWed
> page entry. If so, the page should be stable and I don't understand why
> another COW could even trigger, or how the code path in the "trial cow" patch
> is triggered.
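To spell out the reasoning above, here is a minimal userspace sketch of how the
write bit of the COWed entry follows from the VMA flags. It is a model only, not
kernel code: every model_* name and MODEL_* value is made up for illustration,
and the only thing taken from the kernel is the shape of the
maybe_mkwrite(pte_mkdirty(entry), vma) call quoted above, assuming that
maybe_mkwrite() sets the write bit only when the VMA has VM_WRITE.

```c
/* Userspace model only; names and flag values are hypothetical. */
#include <stdbool.h>
#include <stdint.h>

#define MODEL_VM_WRITE  0x2u  /* stand-in for VM_WRITE in vma->vm_flags */
#define MODEL_PTE_WRITE 0x1u  /* stand-in for the PTE write bit */
#define MODEL_PTE_DIRTY 0x2u  /* stand-in for the PTE dirty bit */

struct model_vma { uint32_t vm_flags; };
typedef struct { uint32_t bits; } model_pte_t;

static model_pte_t model_pte_mkdirty(model_pte_t pte)
{
    pte.bits |= MODEL_PTE_DIRTY;
    return pte;
}

static model_pte_t model_pte_mkwrite(model_pte_t pte)
{
    pte.bits |= MODEL_PTE_WRITE;
    return pte;
}

/* Assumed behaviour: the write bit is granted only if the VMA is writable. */
static model_pte_t model_maybe_mkwrite(model_pte_t pte,
                                       const struct model_vma *vma)
{
    if (vma->vm_flags & MODEL_VM_WRITE)
        pte = model_pte_mkwrite(pte);
    return pte;
}

/* Mirrors the quoted line: entry = maybe_mkwrite(pte_mkdirty(entry), vma); */
static bool model_cow_entry_writable(const struct model_vma *vma)
{
    model_pte_t entry = { 0 };

    entry = model_maybe_mkwrite(model_pte_mkdirty(entry), vma);
    return (entry.bits & MODEL_PTE_WRITE) != 0;
}
```

Under this model, a VMA with the writable flag yields a writable COWed entry,
while a VMA without it yields a dirty but read-only entry. Only the second case
would leave room for another write fault on the same page.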
>
> Or is the VMA without VM_WRITE for some reason? Sorry, I probably know
> nothing about RDMA; more information on that side might help too. E.g., is the
> hardware going to walk the software process page table too when doing RDMA (or
> is the IOMMU page table used, or none)?
>
> Thanks,
>
> --
> Peter Xu
--
Peter Xu
View attachment "0001-mm-gup-Allow-enfornced-COW-for-FOLL_PIN.patch" of type "text/plain" (11548 bytes)