Message-ID: <YJ54sgmMoV+bVU7Q@t490s>
Date: Fri, 14 May 2021 09:18:42 -0400
From: Peter Xu <peterx@...hat.com>
To: Hugh Dickins <hughd@...gle.com>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Nadav Amit <nadav.amit@...il.com>,
Miaohe Lin <linmiaohe@...wei.com>,
Mike Rapoport <rppt@...ux.vnet.ibm.com>,
Andrea Arcangeli <aarcange@...hat.com>,
Jerome Glisse <jglisse@...hat.com>,
Mike Kravetz <mike.kravetz@...cle.com>,
Jason Gunthorpe <jgg@...pe.ca>,
Matthew Wilcox <willy@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Axel Rasmussen <axelrasmussen@...gle.com>,
"Kirill A . Shutemov" <kirill@...temov.name>
Subject: Re: [PATCH v2 00/24] userfaultfd-wp: Support shmem and hugetlbfs
Hugh,
On Fri, May 14, 2021 at 12:07:38AM -0700, Hugh Dickins wrote:
> On Wed, 12 May 2021, Peter Xu wrote:
> > On Tue, Apr 27, 2021 at 12:12:53PM -0400, Peter Xu wrote:
> > > This is v2 of uffd-wp shmem & hugetlbfs support, which completes uffd-wp as a
> > > kernel full feature, as it only supports anonymous before this series.
> >
> > Ping..
> >
> > Thinking about a repost, as this series shouldn't be able to apply after we've
> > got more relevant patches into -mm. E.g., the full minor fault, and also some
> > small stuff like pagemap, as we need one more patch to support shmem/hugetlbfs
> > too.
> >
> > Hugh, haven't received any further comment from you on shmem side (or on the
> > general idea). It would be great to still have some of your input.
> >
> > Let me know if you prefer to read a fresh new version otherwise.
>
> I am very sorry to let you down, Peter, repeatedly; but it is now very
> clear that I shall *never* have time to review your patchset - I am too
> slow, have too much else to attend to, and take too long each time to
> sink myself deep enough into userfaultfd.
Never mind! It's just that I'm kind of obliged to ask for your opinion, since you
contributed part of the idea and you are also the shmem maintainer. :) So
that's what I did before starting to bother Andrew (I know Andrew is 100%
busy.. that's also why I try my best not to ping Andrew for reviews on any of
my work; Andrew can chime in anytime anyway, since he is in the loop).
>
> I realize that you're being considerate, and expecting no more than
> a few comments from me, rather than asking for formal review; but it's
> still too much for me to get into.
I'm actually even prepared to receive a full-series NACK anytime. :) To me
it's more important to have the right direction first; I didn't receive any
objection during the RFC, so I moved on, assuming no one thinks it's wrong.
However, it's indeed true that you never let me down (as far as I can see from
the other discussions): you do very in-depth reviews and hunt down every
potential risk you notice, even in a rare error path - that's just too
attractive a reviewer for all the patch writers!
>
> The only reason I was involved at all, was when you were wondering how
> to handle the pagetable entries for shmem. I suggested one encoding,
> Andrea suggested slightly differently: Andrea's was more elegant (no
> "swap type" required), and it looked like you went with his - good.
>
> I wonder whether you noticed
> https://lore.kernel.org/linux-mm/20210407084238.20443-2-apopple@nvidia.com/
> which might interfere. I've had no more time to look at that than yours,
> so no opinion on it (and I don't know what happened to it after that).
Thanks for the pointer. Looks like there'll be some slight rebase work, but the
ideas are totally orthogonal; then we'll see who will do the rebase (yeah,
probably me :).
Hmm, meanwhile, if that were an initial version I might suggest renaming
pfn_swap_entry_to_page() to start with pte_swp_*(), as it operates on a swp pte,
not a pfn. However, it's probably too late for a v8 series, so I'll give up. It
also mentions something like "special swap pte"; I hope that won't get confused
with what this series is proposing. We'll see when it becomes a problem; so
far it still seems okay.
>
> Please keep uppermost in mind when modifying mm/shmem.c for userfaultfd,
> the difference between shared and private; and be on guard against the
> ways in which CONFIG_USERFAULTFD=y might open a door to abuse.
Will do. Then I'll move this series on.
Re shared/private, let me mention one thing, just in case it helps for peace
of mind: the most dangerous place for uffd-wp+shmem should be the
UFFDIO_WRITEPROTECT page-resolving ioctl, when we want to re-grant the write bit
to ptes if needed (for minor mode, the danger point is UFFDIO_CONTINUE
instead). However, it should be even safer than UFFDIO_CONTINUE, as
UFFDIO_WRITEPROTECT never grants the write bit for real but leaves that all to
the page fault handler (in change_pte_range()):
	} else if (uffd_wp_resolve) {
		/*
		 * Leave the write bit to be handled
		 * by PF interrupt handler, then
		 * things like COW could be properly
		 * handled.
		 */
		ptent = pte_clear_uffd_wp(ptent);
	}
The newprot will never have the write bit either, afaik; see mwriteprotect_range():
	newprot = vm_get_page_prot(dst_vma->vm_flags);
The last risk is the dirty_accountable trick in change_pte_range(), but as you
analyzed in the other thread, userfaultfd never uses MM_CP_DIRTY_ACCT, so it
should be safe too.
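To make the resolve path above a bit more concrete, here is a toy userspace model of it (not kernel code). The bit positions and all the toy_* names are made up for illustration, and the protect branch is only my assumption of what the protecting side looks like; only the resolve branch mirrors the change_pte_range() snippet quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for a toy pte; real layouts are arch-specific. */
#define TOY_PTE_WRITE	(UINT64_C(1) << 1)
#define TOY_PTE_UFFD_WP	(UINT64_C(1) << 2)

typedef uint64_t toy_pte_t;

static inline bool toy_pte_write(toy_pte_t pte)
{
	return (pte & TOY_PTE_WRITE) != 0;
}

static inline bool toy_pte_uffd_wp(toy_pte_t pte)
{
	return (pte & TOY_PTE_UFFD_WP) != 0;
}

/*
 * Toy model of the two uffd-wp cases.  The protect branch is an
 * assumption for illustration; the resolve branch follows the quoted
 * snippet: it only drops the uffd-wp marker and never sets the write bit.
 */
static inline toy_pte_t toy_change_pte(toy_pte_t pte, bool uffd_wp,
				       bool uffd_wp_resolve)
{
	if (uffd_wp) {
		/* Protect: remove write permission and mark as uffd-wp. */
		pte &= ~TOY_PTE_WRITE;
		pte |= TOY_PTE_UFFD_WP;
	} else if (uffd_wp_resolve) {
		/* Resolve: clear the marker, leave the write bit alone. */
		pte &= ~TOY_PTE_UFFD_WP;
	}
	return pte;
}
```

In this model, protecting a writable pte and then resolving it leaves a pte with neither the write bit nor the uffd-wp bit set, so the next write has to go through the fault path, which is where things like COW would be decided.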
Thanks,
--
Peter Xu