linux-kernel - Re: [PATCH v2 05/10] mm/hugetlb: Make userfaultfd_huge_must

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Y5ElX/Flm7lrHmCx@x1n>
Date:   Wed, 7 Dec 2022 18:44:31 -0500
From:   Peter Xu <peterx@...hat.com>
To:     John Hubbard <jhubbard@...dia.com>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Muchun Song <songmuchun@...edance.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        James Houghton <jthoughton@...gle.com>,
        Jann Horn <jannh@...gle.com>, Rik van Riel <riel@...riel.com>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        David Hildenbrand <david@...hat.com>,
        Nadav Amit <nadav.amit@...il.com>
Subject: Re: [PATCH v2 05/10] mm/hugetlb: Make userfaultfd_huge_must_wait()
 safe to pmd unshare

On Wed, Dec 07, 2022 at 03:19:55PM -0800, John Hubbard wrote:
> On 12/7/22 12:30, Peter Xu wrote:
> > We can take the hugetlb walker lock, here taking vma lock directly.
> > 
> > Reviewed-by: David Hildenbrand <david@...hat.com>
> > Reviewed-by: Mike Kravetz <mike.kravetz@...cle.com>
> > Signed-off-by: Peter Xu <peterx@...hat.com>
> > ---
> >   fs/userfaultfd.c | 18 ++++++++++++++----
> >   1 file changed, 14 insertions(+), 4 deletions(-)
> > 
> > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> > index 07c81ab3fd4d..a602f008dde5 100644
> > --- a/fs/userfaultfd.c
> > +++ b/fs/userfaultfd.c
> > @@ -376,7 +376,8 @@ static inline unsigned int userfaultfd_get_blocking_state(unsigned int flags)
> >    */
> >   vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
> >   {
> > -	struct mm_struct *mm = vmf->vma->vm_mm;
> > +	struct vm_area_struct *vma = vmf->vma;
> > +	struct mm_struct *mm = vma->vm_mm;
> >   	struct userfaultfd_ctx *ctx;
> >   	struct userfaultfd_wait_queue uwq;
> >   	vm_fault_t ret = VM_FAULT_SIGBUS;
> > @@ -403,7 +404,7 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
> >   	 */
> >   	mmap_assert_locked(mm);
> > -	ctx = vmf->vma->vm_userfaultfd_ctx.ctx;
> > +	ctx = vma->vm_userfaultfd_ctx.ctx;
> >   	if (!ctx)
> >   		goto out;
> > @@ -493,6 +494,13 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
> >   	blocking_state = userfaultfd_get_blocking_state(vmf->flags);
> > +	/*
> > +	 * This stablizes pgtable for hugetlb on e.g. pmd unsharing.  Need
> > +	 * to be before setting current state.
> > +	 */
> 
> Looking at this code, I am not able to come up with a reason for why the
> vma lock/unlock placement is exactly where it is. It looks quite arbitrary.
> 
> Why not, for example, take and drop the vma lock within
> userfaultfd_huge_must_wait()? That makes more sense to me, but I'm not familiar
> with userfaultfd so of course I'm missing something.
> 
> But the comment above certainly doesn't supply that something.

The part that matters in the comment is "need to be before setting current
state".

	blocking_state = userfaultfd_get_blocking_state(vmf->flags);
	if (is_vm_hugetlb_page(vma))
		hugetlb_vma_lock_read(vma);
	set_current_state(blocking_state);

down_read() can sleep and also modify the task state, we cannot take the
lock after that point because otherwise the task state will be messed up.

-- 
Peter Xu