Message-ID: <CADrL8HXqE3s4ckxh0OU5onkhystj=1jMTS+S7GFeiO+kwBo0QQ@mail.gmail.com>
Date: Wed, 21 Dec 2022 15:21:18 -0500
From: James Houghton <jthoughton@...gle.com>
To: Peter Xu <peterx@...hat.com>
Cc: Mike Kravetz <mike.kravetz@...cle.com>,
Muchun Song <songmuchun@...edance.com>,
David Hildenbrand <david@...hat.com>,
David Rientjes <rientjes@...gle.com>,
Axel Rasmussen <axelrasmussen@...gle.com>,
Mina Almasry <almasrymina@...gle.com>,
"Zach O'Keefe" <zokeefe@...gle.com>,
Manish Mishra <manish.mishra@...anix.com>,
Naoya Horiguchi <naoya.horiguchi@....com>,
"Dr . David Alan Gilbert" <dgilbert@...hat.com>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
Vlastimil Babka <vbabka@...e.cz>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
Miaohe Lin <linmiaohe@...wei.com>,
Yang Shi <shy828301@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v2 33/47] userfaultfd: add UFFD_FEATURE_MINOR_HUGETLBFS_HGM

On Wed, Dec 21, 2022 at 2:23 PM Peter Xu <peterx@...hat.com> wrote:
>
> James,
>
> On Wed, Nov 16, 2022 at 03:30:00PM -0800, James Houghton wrote:
> > On Wed, Nov 16, 2022 at 2:28 PM Peter Xu <peterx@...hat.com> wrote:
> > >
> > > On Fri, Oct 21, 2022 at 04:36:49PM +0000, James Houghton wrote:
> > > > Userspace must provide this new feature when it calls UFFDIO_API to
> > > > enable HGM. Userspace can check if the feature exists in
> > > > uffdio_api.features, and if it does not exist, the kernel does not
> > > > support and therefore did not enable HGM.
> > > >
> > > > Signed-off-by: James Houghton <jthoughton@...gle.com>
> > >
> > > It's still slightly a pity that this can only be enabled by an uffd context
> > > plus a minor fault, so generic hugetlb users cannot directly leverage this.
> >
> > The idea here is that, for applications that can conceivably benefit
> > from HGM, we have a mechanism for enabling it for that application. So
> > this patch creates that mechanism for userfaultfd/UFFDIO_CONTINUE. I
> > prefer this approach over something more general like MADV_ENABLE_HGM
> > or something.
>
> Sorry to get back to this very late - I know this has been discussed since
> the very early stage of the feature, but is there any reasoning behind?
>
> When I start to think seriously on applying this to process snapshot with
> uffd-wp I found that the minor mode trick won't easily play - normally
> that's a case where all the pages were there mapped huge, but when the app
> wants UFFDIO_WRITEPROTECT it may want to remap the huge pages into smaller
> pages, probably some size that the user can specify. It'll be non-trivial
> to enable HGM during that phase using MINOR mode because in that case the
> pages are all mapped.
>
> For the long term, I am just still worried the current interface is still
> not as flexible.

Thanks for bringing this up, Peter. I think the main reason was:
having separate UFFD_FEATUREs clearly indicates to userspace what is
and is not supported.
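
To make the "separate features" argument concrete, here is a minimal
sketch of the userspace side: after UFFDIO_API, the application looks at
the returned features mask and learns, per operation, whether HGM can be
used. This is only a model. The bit values are placeholders rather than
the real UAPI values, the sketch_* names are made up for illustration,
and the WP bit stands for a feature that does not exist in this series.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Placeholder bit values for illustration only; NOT the real UAPI values. */
#define SKETCH_MINOR_HUGETLBFS_HGM (UINT64_C(1) << 0)
#define SKETCH_WP_HUGETLBFS_HGM    (UINT64_C(1) << 1) /* hypothetical future feature */

/* True if every bit in 'wanted' is set in the mask returned by the kernel. */
static bool sketch_feature_enabled(uint64_t returned_features, uint64_t wanted)
{
	return (returned_features & wanted) == wanted;
}

/* Report, per operation, whether high-granularity mapping may be used. */
static void sketch_report(uint64_t returned_features)
{
	printf("HGM for UFFDIO_CONTINUE:     %s\n",
	       sketch_feature_enabled(returned_features,
				      SKETCH_MINOR_HUGETLBFS_HGM) ? "yes" : "no");
	printf("HGM for UFFDIO_WRITEPROTECT: %s\n",
	       sketch_feature_enabled(returned_features,
				      SKETCH_WP_HUGETLBFS_HGM) ? "yes" : "no");
}
```

Calling sketch_report(SKETCH_MINOR_HUGETLBFS_HGM) reports "yes" for
UFFDIO_CONTINUE and "no" for UFFDIO_WRITEPROTECT, which is the
per-operation answer that one feature bit per operation gives userspace.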

For UFFDIO_WRITEPROTECT, a user could remap huge pages into smaller
pages by issuing a high-granularity UFFDIO_WRITEPROTECT. That isn't
allowed as of this patch series, but it could be allowed in the
future. To add support in the same way as this series, we would add
another feature, say UFFD_FEATURE_WP_HUGETLBFS_HGM. I agree that
having to add another feature isn't great; is this what you're
concerned about?

Considering MADV_ENABLE_HGM...
1. If a user provides this, then the contract becomes: "the kernel may
allow UFFDIO_CONTINUE and UFFDIO_WRITEPROTECT for HugeTLB at high
granularity, provided the support exists", but it becomes unclear to
userspace what's supported and what isn't.
2. We would then need to keep track of whether a user explicitly
enabled it, or whether it got enabled automatically in response to
memory poison, for example. Not a big problem, just a complication.
(Otherwise, if HGM got enabled for poison, suddenly userspace would be
allowed to do things it wasn't allowed to do before.)
3. This API makes sense for enabling HGM for something outside of
userfaultfd, like MADV_DONTNEED.

Maybe (1) is solvable if we provide a bit field that describes what's
supported, or maybe (1) isn't even a problem.
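
As a rough sketch of what such a bit field could look like, combined
with the bookkeeping from (2): the state records which operations the
kernel supports at high granularity and, separately, whether HGM was
turned on by the user or by the kernel itself (e.g. for poison). All
names here (hgm_op, hgm_state, hgm_user_op_allowed) are hypothetical and
exist nowhere in the kernel or in this series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-operation capability bits; none of these exist in the kernel. */
enum hgm_op {
	HGM_OP_CONTINUE     = 1u << 0, /* UFFDIO_CONTINUE at high granularity */
	HGM_OP_WRITEPROTECT = 1u << 1, /* UFFDIO_WRITEPROTECT at high granularity */
	HGM_OP_DONTNEED     = 1u << 2, /* MADV_DONTNEED at high granularity */
};

/* Hypothetical per-mapping state. */
struct hgm_state {
	uint32_t supported_ops;  /* bit field that would be reported to userspace */
	bool enabled_by_user;    /* set by an explicit opt-in */
	bool enabled_by_poison;  /* set when the kernel enabled HGM on its own */
};

/*
 * Userspace may use an operation only if it opted in explicitly and the
 * operation is reported as supported. HGM enabled for poison alone grants
 * userspace nothing new.
 */
static bool hgm_user_op_allowed(const struct hgm_state *s, enum hgm_op op)
{
	return s->enabled_by_user && (s->supported_ops & (uint32_t)op) != 0;
}
```

With this shape, a mapping whose HGM was enabled only because of poison
still rejects a high-granularity UFFDIO_CONTINUE from userspace until
the user opts in, and userspace can read supported_ops to see which
operations are available.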

Another possibility is to have a single feature like
UFFD_FEATURE_HUGETLB_HGM, which would allow HGM for all relevant
userfaultfd ioctls, but then we have the same problem: it's unclear
what's supported and what isn't.

I'm happy to change the API to whatever you think makes the most sense.

Thanks!
- James
>
> --
> Peter Xu
>