linux-kernel - Re: [PATCH] mm: rearrange exit_mmap() to unlock before arch_exit

Message-Id: <1234302305.6200.117.camel@lts-notebook>
Date:	Tue, 10 Feb 2009 16:45:05 -0500
From:	Lee Schermerhorn <Lee.Schermerhorn@...com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	linux-kernel@...r.kernel.org, stable@...nel.org, jeremy@...p.org,
	keir.fraser@...citrix.com, christophe@...ut.de,
	alex.williamson@...com, npiggin@...e.de
Subject: Re: [PATCH] mm: rearrange exit_mmap() to unlock before
	arch_exit_mmap

On Tue, 2009-02-10 at 13:31 -0800, Andrew Morton wrote:
> On Mon, 09 Feb 2009 12:29:48 -0500
> Lee Schermerhorn <Lee.Schermerhorn@...com> wrote:
> 
> > From: Jeremy Fitzhardinge <jeremy@...p.org>
> > 
> > Subject: mm: rearrange exit_mmap() to unlock before arch_exit_mmap
> > 
> > Applicable to 29-rc4 and 28-stable
> > 
> > Christophe Saout reported [in precursor to:
> > http://marc.info/?l=linux-kernel&m=123209902707347&w=4]:
> > 
> > > Note that I also saw a different issue with CONFIG_UNEVICTABLE_LRU.
> > > Seems like Xen tears down current->mm early on process termination, so
> > > that __get_user_pages in exit_mmap causes nasty messages when the
> > > process had any mlocked pages.  (in fact, it somehow manages to get into
> > > the swapping code and produces a null pointer dereference trying to get
> > > a swap token)
> > 
> > Jeremy explained:
> > 
> > Yes.  In the normal case under Xen, an in-use pagetable is "pinned", 
> > meaning that it is RO to the kernel, and all updates must go via 
> > hypercall (or writes are trapped and emulated, which is much the same 
> > thing).  An unpinned pagetable is not currently in use by any process, 
> > and can be directly accessed as normal RW pages.
> > 
> > As an optimisation at process exit time, we unpin the pagetable as early 
> > as possible (switching the process to init_mm), so that all the normal 
> > pagetable teardown can happen with direct memory accesses.
> > 
> > This happens in exit_mmap() -> arch_exit_mmap().  The munlocking happens 
> > a few lines below.  The obvious thing to do would be to move 
> > arch_exit_mmap() to below the munlock code, but I think we'd want to 
> > call it even if mm->mmap is NULL, just to be on the safe side.
> > 
> > Thus, this patch:
> > 
> > exit_mmap() needs to unlock any locked vmas before calling
> > arch_exit_mmap, as the latter may switch the current mm to init_mm,
> > which would cause the former to fail.
> > 
> > Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@...rix.com>
> > Acked-by:  Lee Schermerhorn <lee.schermerhorn@...com>
> > 
> > ---
> >  mm/mmap.c |   10 ++++++----
> >  1 file changed, 6 insertions(+), 4 deletions(-)
> > 
> > ===================================================================
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -2078,12 +2078,8 @@
> >  	unsigned long end;
> >  
> >  	/* mm's last user has gone, and its about to be pulled down */
> > -	arch_exit_mmap(mm);
> >  	mmu_notifier_release(mm);
> >  
> > -	if (!mm->mmap)	/* Can happen if dup_mmap() received an OOM */
> > -		return;
> > -
> >  	if (mm->locked_vm) {
> >  		vma = mm->mmap;
> >  		while (vma) {
> > @@ -2092,7 +2088,13 @@
> >  			vma = vma->vm_next;
> >  		}
> >  	}
> > +
> > +	arch_exit_mmap(mm);
> > +
> >  	vma = mm->mmap;
> > +	if (!vma)	/* Can happen if dup_mmap() received an OOM */
> > +		return;
> > +
> >  	lru_add_drain();
> >  	flush_cache_mm(mm);
> >  	tlb = tlb_gather_mmu(mm, 1);
> 
> The patch as it stands doesn't apply cleanly to 2.6.28.  I didn't look
> into what needs to be done to fix it up.  Presumably the stable beavers
> would like a fixed-up and tested version for backporting sometime.

The only difference I can see in this area, comparing a 2.6.28.2 tree I
have, the 2.6.28.4 patch on kernel.org, and the context in the patch, is
the "flush_cache_mm(mm);" line.
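
For anyone following the ordering argument in the changelog, here is a toy
userspace sketch of it.  This is not kernel code: struct mm_model and every
model_* function are invented for illustration, and the rule that munlocking
fails once the current mm has been switched away is a simplifying assumption
drawn from the description above, not a statement of what __get_user_pages()
actually checks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; names echo the patch but the bodies are hypothetical. */
struct mm_model {
	unsigned long locked_vm;	/* number of mlocked pages */
};

static struct mm_model init_mm_model;	/* stands in for init_mm */
static struct mm_model *current_mm;	/* stands in for current->mm */

/* Models the Xen hook: unpin early by switching the task to init_mm. */
static void model_arch_exit_mmap(struct mm_model *mm)
{
	if (current_mm == mm)
		current_mm = &init_mm_model;
}

/* Models munlocking; assumed to work only while mm is still current. */
static bool model_munlock_all(struct mm_model *mm)
{
	if (!mm->locked_vm)
		return true;		/* nothing to unlock */
	if (current_mm != mm)
		return false;		/* assumed failure mode */
	mm->locked_vm = 0;
	return true;
}

/* Ordering before the patch: arch hook first, then munlock. */
static bool model_exit_mmap_old(struct mm_model *mm)
{
	model_arch_exit_mmap(mm);
	return model_munlock_all(mm);
}

/* Ordering after the patch: munlock first, then the arch hook. */
static bool model_exit_mmap_new(struct mm_model *mm)
{
	bool ok = model_munlock_all(mm);

	model_arch_exit_mmap(mm);
	return ok;
}
```

With current_mm pointing at an mm that has locked_vm set, the old ordering
leaves the pages locked and reports failure, while the new ordering unlocks
them before the switch to init_mm_model.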

Lee


