linux-kernel - Re: mm: mmap_sem lock assertion failure in __mlock_vma_pages


Message-ID: <1394569297.2786.36.camel@buesod1.americas.hpqcorp.net>
Date:	Tue, 11 Mar 2014 13:21:37 -0700
From:	Davidlohr Bueso <davidlohr@...com>
To:	Sasha Levin <sasha.levin@...cle.com>
Cc:	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Michel Lespinasse <walken@...gle.com>,
	Rik van Riel <riel@...hat.com>,
	Vlastimil Babka <vbabka@...e.cz>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: mm: mmap_sem lock assertion failure in __mlock_vma_pages_range

On Tue, 2014-03-11 at 16:12 -0400, Sasha Levin wrote:
> On 03/11/2014 04:07 PM, Davidlohr Bueso wrote:
> > On Tue, 2014-03-11 at 15:39 -0400, Sasha Levin wrote:
> >> Hi all,
> >>
> >> I ended up deleting the log file by mistake, but this bug does seem to be important,
> >> so I'd rather not wait until the same issue is triggered again.
> >>
> >> The call chain is:
> >>
> >> 	mlock (mm/mlock.c:745)
> >> 		__mm_populate (mm/mlock.c:700)
> >> 			__mlock_vma_pages_range (mm/mlock.c:229)
> >> 				VM_BUG_ON(!rwsem_is_locked(&mm->mmap_sem));
> >
> > So __mm_populate() is only called by mlock(2) and this VM_BUG_ON seems
> > wrong as we call it without the lock held:
> >
> > 	up_write(&current->mm->mmap_sem);
> > 	if (!error)
> > 		error = __mm_populate(start, len, 0);
> > 	return error;
> > }
> >
> >>
> >> It seems to be a rather simple trace triggered from userspace. The only recent patch
> >> in the area (that I've noticed) was "mm/mlock: prepare params outside critical region".
> >> I've reverted it and am trying to test without it.
> >
> > Odd, this patch should definitely *not* cause this. In any case every
> > operation removed from the critical region is local to the function:
> >
> > 	lock_limit = rlimit(RLIMIT_MEMLOCK);
> > 	lock_limit >>= PAGE_SHIFT;
> > 	locked = len >> PAGE_SHIFT;
> >
> > 	down_write(&current->mm->mmap_sem);
> 
> Yeah, this patch doesn't look like it's causing it, I guess it was more of a "you touched this
> code last - do you still remember what's going on here?" :).

How frequently do you trigger this issue? Could you verify whether it
still occurs with my patch reverted?

> It's semi-odd because it seems like an obvious issue to hit with trinity but it's the first time
> I've seen it and it's probably been there for a while (that BUG_ON is there from 2009).

Actually that VM_BUG_ON is correct, because we do in fact take the
mmap_sem (for reading) inside __mm_populate(), which in turn calls
__mlock_vma_pages_range() with the lock held. Now, the lock is taken
within the for loop, which does the whole "if (!locked) down_read()"
dance, but it's just making sure that we take the lock upon the first
iteration. So besides doing the locking outside of the loop, which is
just a cleanup, I don't really see how it could be triggered.
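
To make the locking pattern above concrete, here is a rough userspace sketch
of the "if (!locked) down_read()" dance. It is not the kernel code: every
toy_* name is hypothetical, the semaphore is just a reader counter, and
assert() stands in for VM_BUG_ON. The drop_lock flag, which lets the callee
release the lock and clear "locked", is an assumption added here to show why
the check would sit inside the loop; the description above only covers the
first-iteration case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for mm->mmap_sem: it only counts readers. */
struct toy_rwsem {
	int readers;
};

static void toy_down_read(struct toy_rwsem *sem) { sem->readers++; }
static void toy_up_read(struct toy_rwsem *sem)   { sem->readers--; }

static bool toy_rwsem_is_locked(const struct toy_rwsem *sem)
{
	return sem->readers > 0;
}

/*
 * Models the callee (__mlock_vma_pages_range() in the thread): it asserts
 * that the lock is held on entry. Dropping the lock and clearing *locked
 * is an assumption of this sketch, controlled by drop_lock.
 */
static int toy_populate_range(struct toy_rwsem *sem, bool drop_lock,
			      int *locked)
{
	assert(toy_rwsem_is_locked(sem)); /* stands in for the VM_BUG_ON */

	if (drop_lock) {
		toy_up_read(sem);
		*locked = 0;
	}
	return 0;
}

/*
 * Models the caller's loop: the lock is (re)taken whenever it is not
 * currently held, so the callee always runs with it held. Returns the
 * number of ranges processed.
 */
static int toy_mm_populate(struct toy_rwsem *sem, const bool *drop_lock,
			   size_t nranges)
{
	int locked = 0;
	int calls = 0;

	for (size_t i = 0; i < nranges; i++) {
		if (!locked) {
			locked = 1;
			toy_down_read(sem);
		}
		toy_populate_range(sem, drop_lock[i], &locked);
		calls++;
	}

	if (locked)
		toy_up_read(sem);

	return calls;
}
```

With this structure the assertion in the callee cannot fire as long as
"locked" accurately tracks the lock state, which matches the reasoning above
that the check looks correct and is hard to trigger from this path alone.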

Thanks,
Davidlohr
