lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 4 Jan 2023 15:03:47 -0500
From:   Peter Xu <peterx@...hat.com>
To:     James Houghton <jthoughton@...gle.com>
Cc:     Mike Kravetz <mike.kravetz@...cle.com>,
        Muchun Song <songmuchun@...edance.com>,
        Axel Rasmussen <axelrasmussen@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] hugetlb: unshare some PMDs when splitting VMAs

On Wed, Jan 04, 2023 at 07:10:11PM +0000, James Houghton wrote:
> > > I'll see if I can confirm that this is indeed possible and send a
> > > repro if it is.
> >
> > I think your analysis above is correct.  The key being the failure to unshare
> > in the non-PUD_SIZE vma after the split.
> 
> I do indeed hit the WARN_ON_ONCE (repro attached), and the MADV wasn't
> even needed (the UFFDIO_REGISTER does the VMA split before "unsharing
> all PMDs"). With the fix, we avoid the WARN_ON_ONCE, but the behavior
> is still incorrect: I expect the address range to be write-protected,
> but it isn't.
> 
> The reason why is that hugetlb_change_protection uses huge_pte_offset,
> even if it's being called for a UFFDIO_WRITEPROTECT with
> UFFDIO_WRITEPROTECT_MODE_WP. In that particular case, I'm pretty sure
> we should be using huge_pte_alloc, but even so, it's not trivial to
> get an allocation failure back up to userspace. The non-hugetlb
> implementation of UFFDIO_WRITEPROTECT seems to also have this problem.
> 
> Peter, what do you think?

Indeed.  Thanks for spotting that, James.

Non-hugetlb should be fine with having empty pgtable entries. Anon doesn't
need to care about no-pgtable-populated ranges so far. Shmem does it with a
few change_prepare() calls to populate the entries so the markers can be
installed later on.

However I think the fault handling is still not well handled as you pointed
out even for shmem: that's the path I probably never triggered myself yet
before and the code stayed there since a very early version:

#define  change_pmd_prepare(vma, pmd, cp_flags)				\
	do {								\
		if (unlikely(uffd_wp_protect_file(vma, cp_flags))) {	\
			if (WARN_ON_ONCE(pte_alloc(vma->vm_mm, pmd)))	\
				break;					\
		}							\
	} while (0)

I think a better thing we can do here (instead of warning and stop the
UFFDIO_WRITEPROTECT at the current stage) is returning with -ENOMEM
properly so the user can know the error.  We'll need to touch the stacks up
to uffd_wp_range() as it's the only one that can trigger the -ENOMEM so
far, so as to not ignore retval from change_protection().

Meanwhile, I'd also wonder whether we should call pagefault_out_of_memory()
because it should be the same as when pgtable allocation failure happens in
page faults, we may want to OOM already.  I can take care of hugetlb part
too along the way.

Man page of UFFDIO_WRITEPROTECT may need a fixup too to introduce -ENOMEM.

I can quickly prepare some patches for this, and hopefully it doesn't need
to block the current fix on split.

Any thoughts?

> 
> >
> > To me, the fact it was somewhat difficult to come up with this scenario is an
> > argument what we should just unshare at split time as you propose.  Who
> > knows what other issues may exist.
> >
> > > 60dfaad65a ("mm/hugetlb: allow uffd wr-protect none ptes") is the
> > > commit that introduced the WARN_ON_ONCE; perhaps it's a good choice
> > > for a Fixes: tag (if above is indeed true).
> >
> > If the key issue in your above scenario is indeed the failure of
> > hugetlb_unshare_all_pmds in the non-PUD_SIZE vma, then perhaps we tag?
> >
> > 6dfeaff93be1 ("hugetlb/userfaultfd: unshare all pmds for hugetlbfs when
> > register wp")
> 
> SGTM. Thanks Mike.

Looks good here too.

Thanks,

-- 
Peter Xu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ