Date:	Thu, 14 Apr 2011 15:28:52 -0400
From:	Paul Gortmaker <paul.gortmaker@...driver.com>
To:	Oleg Nesterov <oleg@...hat.com>
CC:	stable@...nel.org, linux-kernel@...r.kernel.org,
	stable-review@...nel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [34-longterm 136/209] exec: make argv/envp memory visible to
 oom-killer

On 11-04-14 02:19 PM, Oleg Nesterov wrote:
> On 04/14, Paul Gortmaker wrote:
>>
>>   =====================================================================
>>   | This is a commit scheduled for the next v2.6.34 longterm release. |
>>   | If you see a problem with using this for longterm, please comment.|
>>   =====================================================================
>> ...
>>
>> With this patch get_arg_page() increments current's MM_ANONPAGES
>> counter every time we allocate the new page for argv/envp. When
>> do_execve() succeeds or fails, we change this counter back.
> 
> This only works starting from 2.6.36.
> 
> Before 2.6.36, the oom-killer's badness() uses mm->total_vm. Please
> see the patch for pre-2.6.36 kernels below.

Thanks Oleg -- I see the earlier discussion about this from December
now, and will swap in the pre-36 version -- I appreciate the review.

Paul.

> 
> Oleg.
> 
> --------------------------------------------------------------------
> commit 3c77f845722158206a7209c45ccddc264d19319c upstream.
> 
> Brad Spengler published a local memory-allocation DoS that
> evades the OOM-killer (though not the virtual memory RLIMIT):
> http://www.grsecurity.net/~spender/64bit_dos.c
> 
> execve()->copy_strings() can allocate a lot of memory, but
> this is not visible to oom-killer, nobody can see the nascent
> bprm->mm and take it into account.
> 
> With this patch get_arg_page() increments current's MM_ANONPAGES
> counter every time we allocate the new page for argv/envp. When
> do_execve() succeeds or fails, we change this counter back.
> 
> Technically this is not 100% correct, we can't know if the new
> page is swapped out and turn MM_ANONPAGES into MM_SWAPENTS, but
> I don't think this really matters and everything becomes correct
> once exec changes ->mm or fails.
> 
> Compared to upstream:
> 
> 	before 2.6.36 kernel, oom-killer's badness() takes
> 	mm->total_vm into account and nothing else. So
> 	acct_arg_size() has to play with this counter too.
> 
> Reported-by: Brad Spengler <spender@...ecurity.net>
> Reviewed-and-discussed-by: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
> Signed-off-by: Oleg Nesterov <oleg@...hat.com>
> Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>
> 
> ---
> 
>  include/linux/binfmts.h |    1 +
>  fs/exec.c               |   28 ++++++++++++++++++++++++++--
>  2 files changed, 27 insertions(+), 2 deletions(-)
> 
> --- 2.6.35/include/linux/binfmts.h~1_acct_exec_mem	2010-03-11 13:11:50.000000000 +0100
> +++ 2.6.35/include/linux/binfmts.h	2010-12-13 12:01:22.000000000 +0100
> @@ -29,6 +29,7 @@ struct linux_binprm{
>  	char buf[BINPRM_BUF_SIZE];
>  #ifdef CONFIG_MMU
>  	struct vm_area_struct *vma;
> +	unsigned long vma_pages;
>  #else
>  # define MAX_ARG_PAGES	32
>  	struct page *page[MAX_ARG_PAGES];
> --- 2.6.35/fs/exec.c~1_acct_exec_mem	2010-05-28 13:41:40.000000000 +0200
> +++ 2.6.35/fs/exec.c	2010-12-13 12:00:51.000000000 +0100
> @@ -158,6 +158,21 @@ out:
>  
>  #ifdef CONFIG_MMU
>  
> +static void acct_arg_size(struct linux_binprm *bprm, unsigned long pages)
> +{
> +	struct mm_struct *mm = current->mm;
> +	long diff = (long)(pages - bprm->vma_pages);
> +
> +	if (!mm || !diff)
> +		return;
> +
> +	bprm->vma_pages = pages;
> +
> +	down_write(&mm->mmap_sem);
> +	mm->total_vm += diff;
> +	up_write(&mm->mmap_sem);
> +}
> +
>  static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
>  		int write)
>  {
> @@ -180,6 +195,8 @@ static struct page *get_arg_page(struct 
>  		unsigned long size = bprm->vma->vm_end - bprm->vma->vm_start;
>  		struct rlimit *rlim;
>  
> +		acct_arg_size(bprm, size / PAGE_SIZE);
> +
>  		/*
>  		 * We've historically supported up to 32 pages (ARG_MAX)
>  		 * of argument strings even with small stacks
> @@ -270,6 +287,10 @@ static bool valid_arg_len(struct linux_b
>  
>  #else
>  
> +static inline void acct_arg_size(struct linux_binprm *bprm, unsigned long pages)
> +{
> +}
> +
>  static struct page *get_arg_page(struct linux_binprm *bprm, unsigned long pos,
>  		int write)
>  {
> @@ -977,6 +998,7 @@ int flush_old_exec(struct linux_binprm *
>  	/*
>  	 * Release all of the old mmap stuff
>  	 */
> +	acct_arg_size(bprm, 0);
>  	retval = exec_mmap(bprm->mm);
>  	if (retval)
>  		goto out;
> @@ -1401,8 +1423,10 @@ int do_execve(char * filename,
>  	return retval;
>  
>  out:
> -	if (bprm->mm)
> -		mmput (bprm->mm);
> +	if (bprm->mm) {
> +		acct_arg_size(bprm, 0);
> +		mmput(bprm->mm);
> +	}
>  
>  out_file:
>  	if (bprm->file) {
> 
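
For readers following the accounting logic in the patch above, here is a
minimal user-space sketch of the delta bookkeeping that acct_arg_size()
performs. It is not kernel code: struct toy_mm, struct toy_bprm,
toy_acct_arg_size() and toy_exec_roundtrip() are hypothetical stand-ins
invented for illustration, and the mmap_sem locking is left out. It only
tries to show that charging the difference against vma_pages keeps
total_vm consistent, and that the final call with 0 pages returns the
counter to its starting value whether exec succeeds or fails.

```c
#include <assert.h>

/* Hypothetical stand-in for the part of mm_struct the patch touches. */
struct toy_mm {
	unsigned long total_vm;   /* pages charged to the task */
};

/* Hypothetical stand-in for the new field added to linux_binprm. */
struct toy_bprm {
	unsigned long vma_pages;  /* pages already charged for argv/envp */
};

/*
 * Same arithmetic as the patch's acct_arg_size(), minus the locking:
 * charge only the difference between the new size and what was
 * already accounted, so repeated calls never double-count.
 */
static void toy_acct_arg_size(struct toy_mm *mm, struct toy_bprm *bprm,
			      unsigned long pages)
{
	long diff = (long)(pages - bprm->vma_pages);

	if (!mm || !diff)
		return;

	bprm->vma_pages = pages;
	mm->total_vm += diff;
}

/*
 * Grow the argument area one page at a time, as repeated
 * get_arg_page() calls would, then undo the charge the way
 * flush_old_exec() or the do_execve() error path does.
 * Returns the final total_vm.
 */
static unsigned long toy_exec_roundtrip(unsigned long start_vm,
					unsigned long arg_pages)
{
	struct toy_mm mm = { .total_vm = start_vm };
	struct toy_bprm bprm = { .vma_pages = 0 };
	unsigned long i;

	for (i = 1; i <= arg_pages; i++) {
		toy_acct_arg_size(&mm, &bprm, i);
		/* While exec is in flight the pages are visible in total_vm. */
		assert(mm.total_vm == start_vm + i);
	}

	/* A repeated call with an unchanged size is a no-op. */
	toy_acct_arg_size(&mm, &bprm, arg_pages);
	assert(mm.total_vm == start_vm + arg_pages);

	/* Exec finished (either way): drop the temporary charge. */
	toy_acct_arg_size(&mm, &bprm, 0);
	return mm.total_vm;
}
```

Calling toy_exec_roundtrip(100, 32) from a test program should hand back
the starting value, since every page charged during the loop is released
by the final call with 0.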