Date:   Wed, 28 Apr 2021 14:03:05 -0700 (PDT)
From:   Hugh Dickins <hughd@...gle.com>
To:     Axel Rasmussen <axelrasmussen@...gle.com>
cc:     Peter Xu <peterx@...hat.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Hugh Dickins <hughd@...gle.com>,
        Lokesh Gidra <lokeshgidra@...gle.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] userfaultfd: release page in error path to avoid
 BUG_ON

On Wed, 28 Apr 2021, Peter Xu wrote:
> On Wed, Apr 28, 2021 at 11:01:09AM -0700, Axel Rasmussen wrote:
> > Consider the following sequence of events (described from the point of
> > view of the commit that introduced the bug - see "Fixes:" below):
> > 
> > 1. Userspace issues a UFFD ioctl, which ends up calling into
> >    shmem_mcopy_atomic_pte(). We successfully account the blocks, we
> >    shmem_alloc_page(), but then the copy_from_user() fails. We return
> >    -EFAULT. We don't release the page we allocated.
> > 2. Our caller detects this error code, tries the copy_from_user() after
> >    dropping the mmap_sem, and retries, calling back into
> >    shmem_mcopy_atomic_pte().
> > 3. Meanwhile, let's say another process filled up the tmpfs being used.
> > 4. So shmem_mcopy_atomic_pte() fails to account blocks this time, and
> >    immediately returns - without releasing the page. This triggers a
> >    BUG_ON in our caller, which asserts that the page should always be
> >    consumed, unless -EFAULT is returned.
> > 
> > (Later on in the commit history, -EFAULT became -ENOENT, mmap_sem became
> > mmap_lock, and shmem_inode_acct_block() was added.)
> 
> I suggest you do s/EFAULT/ENOENT/ directly in above.

I haven't looked into the history, but it would be best for this to
describe the situation in v5.12, never mind the details which were
different at the time of the commit tagged with Fixes.  But we should
stay alert: when it's backported to stable, something may need
adjusting to suit those releases (depending on how much else has
been backported to them meanwhile).

> 
> > 
> > A malicious user (even an unprivileged one) could trigger this
> > intentionally without too much trouble.

I regret having suggested that. Maybe. Opinions differ as to whether
it's helpful to call attention like that. I'd say delete that paragraph.

> > 
> > To fix this, detect if we have a "dangling" page when accounting fails,
> > and if so, release it before returning.
> > 
> > Fixes: cb658a453b93 ("userfaultfd: shmem: avoid leaking blocks and used blocks in UFFDIO_COPY")
> > Reported-by: Hugh Dickins <hughd@...gle.com>
> > Signed-off-by: Axel Rasmussen <axelrasmussen@...gle.com>

Thanks for getting on to this so quickly, Axel.
But Peter is right, that unlock_page() needs removing.
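
For illustration only, a minimal userspace sketch of the sequence in the
commit message, with the fix applied and without the unlock_page().
Every name in it (fake_page, model_mfill, model_caller, tmpfs_fills_up)
is a hypothetical stand-in, not kernel API, and the real function does
considerably more than this:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct page; not kernel code. */
struct fake_page { int refcount; };

static bool tmpfs_full;       /* models shmem_inode_acct_block() failing */
static bool user_copy_faults; /* models copy_from_user() failing */
static int pages_live;        /* pages allocated and not yet released */

static struct fake_page *fake_alloc_page(void)
{
	struct fake_page *p = calloc(1, sizeof(*p));

	if (p) {
		p->refcount = 1;
		pages_live++;
	}
	return p;
}

static void fake_put_page(struct fake_page *p)
{
	if (--p->refcount == 0) {
		pages_live--;
		free(p);
	}
}

/* Models the callee: returns 0, -ENOENT (retry, *pagep set) or -ENOMEM. */
static int model_mfill(struct fake_page **pagep)
{
	struct fake_page *page;

	if (tmpfs_full) {
		/*
		 * The fix: drop a page left over from an earlier -ENOENT.
		 * No unlock here: per the review, unlock_page() at this
		 * point would trigger VM_BUG_ON_PAGE().
		 */
		if (*pagep) {
			fake_put_page(*pagep);
			*pagep = NULL;
		}
		return -ENOMEM;
	}

	if (!*pagep) {
		page = fake_alloc_page();
		if (!page)
			return -ENOMEM;
		if (user_copy_faults) {
			*pagep = page;	/* hand the page back for the retry */
			return -ENOENT;
		}
	} else {
		page = *pagep;
		*pagep = NULL;
	}

	/* Success: the page is consumed (this model simply releases it). */
	fake_put_page(page);
	return 0;
}

/* Step 3 of the sequence: someone else fills the tmpfs between attempts. */
static void tmpfs_fills_up(void)
{
	tmpfs_full = true;
}

/* Models the caller: one retry on -ENOENT, then the BUG_ON condition. */
static int model_caller(void (*between_attempts)(void))
{
	struct fake_page *page = NULL;
	int err = model_mfill(&page);

	if (err == -ENOENT) {
		if (between_attempts)
			between_attempts();
		err = model_mfill(&page);
	}
	assert(page == NULL);	/* stands in for the caller's BUG_ON */
	return err;
}
```

With user_copy_faults set and tmpfs_fills_up passed as the hook,
model_caller() returns -ENOMEM with no page left behind; removing the
release in the tmpfs_full branch makes the assert fire instead.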

> > ---
> >  mm/shmem.c | 13 ++++++++++++-
> >  1 file changed, 12 insertions(+), 1 deletion(-)
> > 
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 26c76b13ad23..46766c9d7151 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -2375,8 +2375,19 @@ static int shmem_mfill_atomic_pte(struct mm_struct *dst_mm,
> >  	pgoff_t offset, max_off;
> >  
> >  	ret = -ENOMEM;
> > -	if (!shmem_inode_acct_block(inode, 1))
> > +	if (!shmem_inode_acct_block(inode, 1)) {
> > +		/*
> > +		 * We may have got a page, returned -ENOENT triggering a retry,
> > +		 * and now we find ourselves with -ENOMEM. Release the page, to
> > +		 * avoid a BUG_ON in our caller.
> > +		 */
> > +		if (unlikely(*pagep)) {
> > +			unlock_page(*pagep);
> 
> Not necessary?

Worse than not necessary: would trigger a VM_BUG_ON_PAGE(). Delete!

> 
> > +			put_page(*pagep);
> > +			*pagep = NULL;
> > +		}
> >  		goto out;
> 
> Every "goto out" in this function looks weird, as it returns directly... so if
> you're touching this after all, I suggest we do "return -ENOMEM" directly and
> drop the "ret = -ENOMEM".

No strong feeling either way from me on that: whichever looks best
to you.  But I suspect the "ret = -ENOMEM" cannot be dropped,
because it's relied on further down too?
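
A sketch of the pattern in question, assuming (as suspected above, not
checked against the source here) that a later failure path also jumps
out with the preset value. The function and label names are
hypothetical, not those of mm/shmem.c:

```c
#include <errno.h>
#include <stdbool.h>

/* Current shape: one preset error code shared by several early exits. */
static int model_with_default_ret(bool acct_ok, bool alloc_ok)
{
	int ret;

	ret = -ENOMEM;
	if (!acct_ok)
		goto out;
	if (!alloc_ok)
		goto out_unacct;	/* also relies on ret == -ENOMEM */
	ret = 0;
	goto out;
out_unacct:
	/* the real code would undo the block accounting here */
out:
	return ret;
}

/*
 * Suggested shape: return directly at the first site.  The preset
 * "ret = -ENOMEM" still has to stay if a later goto depends on it.
 */
static int model_return_directly(bool acct_ok, bool alloc_ok)
{
	int ret;

	if (!acct_ok)
		return -ENOMEM;

	ret = -ENOMEM;		/* kept for the later failure path */
	if (!alloc_ok)
		goto out_unacct;
	ret = 0;
out_unacct:
	/* accounting would be undone here on the failure path only */
	return ret;
}
```

Both variants return the same codes for the same inputs; the only
point is that the direct return and the preset value are independent
choices.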

> 
> Thanks,
> 
> > +	}
> >  
> >  	if (!*pagep) {
> >  		page = shmem_alloc_page(gfp, info, pgoff);
> > -- 
> > 2.31.1.498.g6c1eba8ee3d-goog
> > 
> 
> -- 
> Peter Xu
