linux-kernel - Re: [patch] mm: recheck lock rlim after f

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Pine.LNX.4.64.0707131341230.17487@blonde.wat.veritas.com>
Date:	Fri, 13 Jul 2007 14:02:34 +0100 (BST)
From:	Hugh Dickins <hugh@...itas.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
cc:	Dmitry Monakhov <dmonakhov@...ru>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Christoph Hellwig <hch@...radead.org>,
	Nick Piggin <nickpiggin@...oo.com.au>
Subject: Re: [patch] mm: recheck lock rlim after f_op->mmap() method

On Fri, 13 Jul 2007, Andrew Morton wrote:
> On Mon, 9 Jul 2007 22:49:17 +0400 Dmitry Monakhov <dmonakhov@...ru> wrote:
> 
> > Some device drivers can change vm_flags in their f_op->mmap
> > method. In order to be on the safe side we have to recheck
> > lock rlimit. Now we have to check lock rlimit from two places,
> > let's move this common code to helper functon.

I think you've made a mistake putting this one into your -mm tree:
I thought we'd replaced it all by the one-liner to mspec.c
[PATCH] mspec_mmap don't set VM_LOCKED

There is no justification for a driver to set VM_LOCKED these days
(even in 2.4 they were supposed to set VM_RESERVED instead).  So
all this to handle a driver setting VM_LOCKED is inappropriate.

mspec was the only driver doing it in tree, but ATI's fglrx is the
one Dmitry spotted (and it takes care to adjust the locked_vm count,
so that won't even wrap negative later - the worst that happens is
they prevent the process using their driver from locking as much
memory as it might wish to - or that's how it looked to me).

The only debate is what we might do in the way of extra checks after
coming back from a driver's ->mmap.  WARN_ON(vma->vm_flags & VM_LOCKED)?
And which other flags?  hch is keen to put a check in there, I'm less
interested, and haven't begun to think of all the things we might wish
to check if we're going to distrust the driver here, and wherever else.

But that would be the way to go, not Dmitry's check_lock_limit patch:
please remove it from -mm after all.

One more comment below...

> > 
> > Signed-off-by: Dmitry Monakhov <dmonakhov@...nvz.org>
> > ---
> >  mm/mmap.c |   33 ++++++++++++++++++++++++++-------
> >  1 files changed, 26 insertions(+), 7 deletions(-)
> > 
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 906ed40..5c89f1d 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -885,6 +885,18 @@ void vm_stat_account(struct mm_struct *mm, unsigned long flags,
> >  }
> >  #endif /* CONFIG_PROC_FS */
> >  
> > +static int check_lock_limit(unsigned long delta, struct mm_struct* mm)
> > +{
> > +	unsigned long locked, lock_limit;
> > +	locked = delta >> PAGE_SHIFT;
> > +	locked += mm->locked_vm;
> > +	lock_limit = current->signal->rlim[RLIMIT_MEMLOCK].rlim_cur;
> > +	lock_limit >>= PAGE_SHIFT;
> > +	if (locked > lock_limit && !capable(CAP_IPC_LOCK))
> > +		return -EAGAIN;
> > +	return 0;
> > +}
> > +
> >  /*
> >   * The caller must hold down_write(current->mm->mmap_sem).
> >   */
> > @@ -954,13 +966,9 @@ unsigned long do_mmap_pgoff(struct file * file, unsigned long addr,
> >  	}
> >  	/* mlock MCL_FUTURE? */
> >  	if (vm_flags & VM_LOCKED) {
> > -		unsigned long locked, lock_limit;
> > -		locked = len >> PAGE_SHIFT;
> > -		locked += mm->locked_vm;
> > -		lock_limit = current->signal->rlim[RLIMIT_MEMLOCK].rlim_cur;
> > -		lock_limit >>= PAGE_SHIFT;
> > -		if (locked > lock_limit && !capable(CAP_IPC_LOCK))
> > -			return -EAGAIN;
> > +		error = check_lock_limit(len, mm);
> > +		if (error)
> > +			return error;
> >  	}
> >  
> >  	inode = file ? file->f_path.dentry->d_inode : NULL;
> > @@ -1101,6 +1109,17 @@ munmap_back:
> >  		error = file->f_op->mmap(file, vma);
> >  		if (error)
> >  			goto unmap_and_free_vma;
> > +		
> > +		if (vma->vm_flags & VM_LOCKED
> > +				&& !(vm_flags & VM_LOCKED)) {
> > +		/*
> > +		 * VM_LOCKED was added in f_op->mmap() method,
> > +		 * so we have to recheck limit.
> > +		 */
> > +			error = check_lock_limit(len, mm);
> > +			if (error)
> > +				goto unmap_and_free_vma;
> > +		}
> 
> Worried.  As far as the filesytem is concerned, its mmap has succeeded.
> 
> But now we're taking the unmap_and_free_vma path _after_ ->mmap() has
> "succeeded".  So we will now tell userspace that the mmap syscall has
> failed, even though the fs thinks it succeeded, if you follow me.  And this
> is a new thing.  
> 
> Could it cause bad things to happen?  Well, if filesystems had a
> file_operations.munmap() then yeah, we should have called that in your new
> code.  But filesystems don't have a ->munmap() method.
> 
> Still.  Can we think of any way in which this change could lead to resource
> leaks or to any other such problems?

All very good thoughts, but we don't even need to think them!

Hugh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/