linux-kernel - Re: [PATCH] mm, hugetlb: fix resv_huge_pages underflow on UFFDIO


Message-ID: <CAHS8izO_YAsYxxrCpSMNe2V5cV-zfsW=Xu4-suEHVPetkGSuBA@mail.gmail.com>
Date:   Wed, 12 May 2021 14:52:54 -0700
From:   Mina Almasry <almasrymina@...gle.com>
To:     Mike Kravetz <mike.kravetz@...cle.com>
Cc:     Peter Xu <peterx@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linux-MM <linux-mm@...ck.org>,
        open list <linux-kernel@...r.kernel.org>,
        Axel Rasmussen <axelrasmussen@...gle.com>
Subject: Re: [PATCH] mm, hugetlb: fix resv_huge_pages underflow on UFFDIO_COPY

On Wed, May 12, 2021 at 2:31 PM Mike Kravetz <mike.kravetz@...cle.com> wrote:
>
> On 5/12/21 1:14 PM, Peter Xu wrote:
> > On Wed, May 12, 2021 at 12:42:32PM -0700, Mina Almasry wrote:
> >>>>> @@ -4868,30 +4869,39 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
> >>>>> +       WARN_ON(*pagep);
> >>>>
> >>>> I don't think this warning works, because we do set *pagep, in the
> >>>> copy_huge_page_from_user failure case. In that case, the following
> >>>> happens:
> >>>>
> >>>> 1. We set *pagep, and return immediately.
> >>>> 2. Our caller notices this particular error, drops mmap_lock, and then
> >>>> calls us again with *pagep set.
> >>>>
> >>>> In this path, we're supposed to just re-use this existing *pagep
> >>>> instead of allocating a second new page.
> >>>>
> >>>> I think this also means we need to keep the "else" case where *pagep
> >>>> is set below.
> >>>>
> >>>
> >>> +1 to Peter's comment.
> >>>
>
> Apologies to Axel (and Peter) as that comment was from Axel.
>
> >>
> >> Gah, sorry about that. I'll fix in v2.
> >
> > I have a question regarding v1: how do you guarantee huge_add_to_page_cache()
> > won't fail again even if checked before page alloc?  Say, what if the page
> > cache got inserted after hugetlbfs_pagecache_present() (which is newly added in
> > your v1) but before huge_add_to_page_cache()?
>
> In the caller (__mcopy_atomic_hugetlb) we obtain the hugetlb fault mutex
> before calling this routine.  This should prevent changes to the cache
> while in the routine.
>
> However, things get complicated in the case where copy_huge_page_from_user
> fails.  In this case, we will return to the caller which will drop mmap_lock
> and the hugetlb fault mutex before doing the copy.  After dropping the
> mutex, someone could populate the cache.  This would result in the same
> situation where two reserves are 'temporarily' consumed for the same
> mapping offset.  By the time we get to the second call to
> hugetlb_mcopy_atomic_pte where the previously allocated page is passed
> in, it is too late.
>
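
The window Mike describes can be sketched as a toy userspace model. This is
not kernel code: every toy_* name is hypothetical, the "mutex" is a plain
flag, and the only thing counted is how many reserves are taken for a single
mapping offset. It assumes the v1-style cache check happens before the page
is allocated.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one mapping offset. Hypothetical names, not kernel structures. */
struct toy_offset {
	int  reserves_consumed; /* reserves taken for this offset so far */
	bool in_cache;          /* is a page cache entry present? */
	bool mutex_held;        /* stands in for the hugetlb fault mutex */
};

enum toy_result { TOY_OK, TOY_RETRY, TOY_EEXIST };

/* Stand-in for hugetlb_mcopy_atomic_pte; *have_page models *pagep being set. */
static enum toy_result toy_mcopy_pte(struct toy_offset *o, bool *have_page,
				     bool copy_faults)
{
	assert(o->mutex_held);
	if (!*have_page) {
		if (o->in_cache)        /* v1-style check before allocating */
			return TOY_EEXIST;
		o->reserves_consumed++; /* allocation takes a reserve */
		*have_page = true;
		if (copy_faults)        /* copy from user failed: caller retries */
			return TOY_RETRY;
	}
	if (o->in_cache)                /* too late: our reserve is already taken */
		return TOY_EEXIST;
	o->in_cache = true;
	return TOY_OK;
}

/* Another fault on the same offset, run while the mutex was dropped. */
static void toy_other_fault(struct toy_offset *o)
{
	assert(o->mutex_held);
	if (!o->in_cache) {
		o->reserves_consumed++;
		o->in_cache = true;
	}
}

/* Returns how many reserves ended up consumed for the single offset. */
static int toy_race(bool copy_faults)
{
	struct toy_offset o = { 0, false, false };
	bool have_page = false;
	enum toy_result r;

	o.mutex_held = true;
	r = toy_mcopy_pte(&o, &have_page, copy_faults);
	o.mutex_held = false;

	if (r == TOY_RETRY) {
		/* Locks are dropped for the copy; someone else gets in. */
		o.mutex_held = true;
		toy_other_fault(&o);
		o.mutex_held = false;

		/* Second call, with the previously allocated page passed in. */
		o.mutex_held = true;
		r = toy_mcopy_pte(&o, &have_page, false);
		o.mutex_held = false;
	}
	(void)r;
	return o.reserves_consumed;
}
```

In this model toy_race(false) consumes one reserve, while toy_race(true)
consumes two for the same offset, which is the 'temporary' double
consumption described above.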

Thanks. I tried a different ordering locally: allocate a page, add it to
the page cache, and only *then* copy its contents (dropping the lock if
the copy fails). The test passes with that change too, but I'm not sure
whether I'm causing a fire somewhere else by leaving a page with
uninitialized contents in the cache. The only other code that checks the
cache seems to be the hugetlb_fault/hugetlb_cow code, so I'm reading it
to understand whether this change breaks it.
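
As a rough illustration of the trade-off, here is a toy userspace model of
the insert-then-copy ordering. It is only a sketch under stated assumptions:
the toy_* names are hypothetical, locking is implied by comments, and whether
the real hugetlb_fault/hugetlb_cow paths can actually observe such a page is
exactly the open question.

```c
#include <stdbool.h>

/* Toy page state. Hypothetical names, not kernel structures. */
struct toy_page {
	bool in_cache;    /* visible to other lookups at this offset */
	bool initialized; /* contents copied from userspace yet? */
};

struct toy_outcome {
	int  reserves_consumed; /* reserves taken for this offset */
	bool saw_uninitialized; /* did a concurrent lookup see raw contents? */
};

/* Allocate, insert into the cache, then copy (retrying unlocked on failure). */
static struct toy_outcome toy_insert_then_copy(bool copy_faults)
{
	struct toy_page page = { false, false };
	struct toy_outcome out = { 0, false };

	/* "Mutex held": allocate and insert before copying. */
	out.reserves_consumed++;
	page.in_cache = true;

	if (!copy_faults) {
		page.initialized = true; /* copy succeeded under the lock */
		return out;
	}

	/*
	 * "Mutex dropped" for the retry. A concurrent lookup at the same
	 * offset finds the existing page, so it takes no second reserve,
	 * but the contents are not yet valid.
	 */
	if (page.in_cache && !page.initialized)
		out.saw_uninitialized = true;

	page.initialized = true; /* the unlocked copy completes */
	return out;
}
```

In this model only one reserve is ever consumed, at the cost of a window in
which the cached page has uninitialized contents.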

> --
> Mike Kravetz