Message-ID: <CADrL8HXqE3s4ckxh0OU5onkhystj=1jMTS+S7GFeiO+kwBo0QQ@mail.gmail.com>
Date:   Wed, 21 Dec 2022 15:21:18 -0500
From:   James Houghton <jthoughton@...gle.com>
To:     Peter Xu <peterx@...hat.com>
Cc:     Mike Kravetz <mike.kravetz@...cle.com>,
        Muchun Song <songmuchun@...edance.com>,
        David Hildenbrand <david@...hat.com>,
        David Rientjes <rientjes@...gle.com>,
        Axel Rasmussen <axelrasmussen@...gle.com>,
        Mina Almasry <almasrymina@...gle.com>,
        "Zach O'Keefe" <zokeefe@...gle.com>,
        Manish Mishra <manish.mishra@...anix.com>,
        Naoya Horiguchi <naoya.horiguchi@....com>,
        "Dr . David Alan Gilbert" <dgilbert@...hat.com>,
        "Matthew Wilcox (Oracle)" <willy@...radead.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Baolin Wang <baolin.wang@...ux.alibaba.com>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Yang Shi <shy828301@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v2 33/47] userfaultfd: add UFFD_FEATURE_MINOR_HUGETLBFS_HGM

On Wed, Dec 21, 2022 at 2:23 PM Peter Xu <peterx@...hat.com> wrote:
>
> James,
>
> On Wed, Nov 16, 2022 at 03:30:00PM -0800, James Houghton wrote:
> > On Wed, Nov 16, 2022 at 2:28 PM Peter Xu <peterx@...hat.com> wrote:
> > >
> > > On Fri, Oct 21, 2022 at 04:36:49PM +0000, James Houghton wrote:
> > > > Userspace must provide this new feature when it calls UFFDIO_API to
> > > > enable HGM. Userspace can check if the feature exists in
> > > > uffdio_api.features, and if it does not exist, the kernel does not
> > > > support and therefore did not enable HGM.
> > > >
> > > > Signed-off-by: James Houghton <jthoughton@...gle.com>
> > >
> > > It's still slightly a pity that this can only be enabled by an uffd context
> > > plus a minor fault, so generic hugetlb users cannot directly leverage this.
> >
> > The idea here is that, for applications that can conceivably benefit
> > from HGM, we have a mechanism for enabling it for that application. So
> > this patch creates that mechanism for userfaultfd/UFFDIO_CONTINUE. I
> > prefer this approach over something more general like MADV_ENABLE_HGM
> > or something.
>
> Sorry to get back to this very late - I know this has been discussed since
> the very early stage of the feature, but is there any reasoning behind it?
>
> When I started to think seriously about applying this to process snapshots
> with uffd-wp, I found that the minor mode trick won't easily work. Normally
> that's a case where all the pages are already mapped huge, but when the app
> wants UFFDIO_WRITEPROTECT it may want to remap the huge pages into smaller
> pages, probably of some size that the user can specify.  It'll be non-trivial
> to enable HGM during that phase using MINOR mode because in that case the
> pages are all mapped.
>
> For the long term, I am still worried that the current interface is not
> flexible enough.

Thanks for bringing this up, Peter. I think the main reason was:
having separate UFFD_FEATUREs clearly indicates to userspace what is
and is not supported.

For UFFDIO_WRITEPROTECT, a user could remap huge pages into smaller
pages by issuing a high-granularity UFFDIO_WRITEPROTECT. That isn't
allowed as of this patch series, but it could be allowed in the
future. To add support in the same way as this series, we would add
another feature, say UFFD_FEATURE_WP_HUGETLBFS_HGM. I agree that
having to add another feature isn't great; is this what you're
concerned about?
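
To make the "separate features" point concrete, here is a rough
userspace-side sketch of the check, assuming the mask has already been
read back from uffdio_api.features. The SKETCH_* names and bit positions
are invented for illustration (a WP HGM feature does not exist in this
series), and hgm_features_granted() is a hypothetical helper, not part of
any API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical bit positions, chosen only for this sketch. Real values
 * would come from <linux/userfaultfd.h> if the features were merged.
 */
#define SKETCH_FEATURE_MINOR_HUGETLBFS_HGM (UINT64_C(1) << 60)
#define SKETCH_FEATURE_WP_HUGETLBFS_HGM    (UINT64_C(1) << 61)

/*
 * Return true if every bit in `wanted` is set in `reported`, where
 * `reported` stands for the value read back from uffdio_api.features.
 */
bool hgm_features_granted(uint64_t reported, uint64_t wanted)
{
	assert(wanted != 0); /* asking for nothing is a caller bug here */
	return (reported & wanted) == wanted;
}
```

With one bit per operation, a caller that only sees the MINOR bit knows
that high-granularity UFFDIO_CONTINUE is available and high-granularity
UFFDIO_WRITEPROTECT is not.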

Considering MADV_ENABLE_HGM...
1. If a user provides this, then the contract becomes: "the kernel may
allow UFFDIO_CONTINUE and UFFDIO_WRITEPROTECT for HugeTLB at
high-granularities, provided the support exists", but userspace has no
clear way to know what's supported and what isn't.
2. We would then need to keep track of whether a user explicitly enabled
it, or whether it got enabled automatically in response to memory poison,
for example. Not a big problem, just a complication. (Otherwise, if HGM
got enabled for poison, suddenly userspace would be allowed to do
things it wasn't allowed to do before.)
3. This API makes sense for enabling HGM for something outside of
userfaultfd, like MADV_DONTNEED.
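
For (2), the bookkeeping could look roughly like the sketch below. This
is only an illustration of the distinction, not how the series stores
state: struct hgm_state and both helpers are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-mapping state; not the structure used in the series. */
struct hgm_state {
	bool enabled_by_user;    /* explicit opt-in, e.g. through an madvise */
	bool enabled_for_poison; /* set by the kernel while handling poison */
};

/* High-granularity mappings may exist for either reason. */
bool hgm_active(const struct hgm_state *s)
{
	assert(s != NULL);
	return s->enabled_by_user || s->enabled_for_poison;
}

/*
 * Userspace-driven high-granularity operations require the explicit
 * opt-in, so poison handling alone does not widen what userspace can do.
 */
bool hgm_user_ops_allowed(const struct hgm_state *s)
{
	assert(s != NULL);
	return s->enabled_by_user;
}
```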

Maybe (1) is solvable if we provide a bit field that describes what's
supported, or maybe (1) isn't even a problem.
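
Such a bit field might look something like this. The HGM_CAP_* names and
hgm_supports() are made up for the sketch; how userspace would actually
obtain the mask is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits, one per operation that could support HGM. */
enum hgm_cap {
	HGM_CAP_CONTINUE     = 1u << 0, /* UFFDIO_CONTINUE */
	HGM_CAP_WRITEPROTECT = 1u << 1, /* UFFDIO_WRITEPROTECT */
	HGM_CAP_DONTNEED     = 1u << 2, /* MADV_DONTNEED */
};

/* Return true if the capability mask `caps` advertises support for `op`. */
bool hgm_supports(uint32_t caps, enum hgm_cap op)
{
	assert(op != 0); /* callers must ask about a specific operation */
	return (caps & (uint32_t)op) != 0;
}
```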

Another possibility is to have a feature like
UFFD_FEATURE_HUGETLB_HGM, which will enable the possibility of HGM for
all relevant userfaultfd ioctls, but we have the same problem where
it's unclear what's supported and what isn't.

I'm happy to change the API to whatever you think makes the most sense.

Thanks!
- James

>
> --
> Peter Xu
>
