Message-ID: <alpine.LFD.2.00.0810130908150.3288@nehalem.linux-foundation.org>
Date:	Mon, 13 Oct 2008 09:08:55 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Ingo Molnar <mingo@...e.hu>
cc:	Karel Zak <kzak@...hat.com>,
	Arjan van de Ven <arjan@...radead.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [kerneloops] regression in 2.6.27 wrt "lock_page" and the
 "hwclock" program



On Mon, 13 Oct 2008, Ingo Molnar wrote:
> 
> hm, i think the 64-bit case is the correct code, because in this 'init 
> task OOMs' case we do:
> 
> out_of_memory:
>         up_read(&mm->mmap_sem);
>         if (is_global_init(tsk)) {
>                 yield();
>                 down_read(&mm->mmap_sem);
> 
> note that we drop the mmap_sem, so in theory another thread of this same 
> MM could change the vma tree, and our 'vma' might not be valid anymore.

Hmm. Looks about right.

> It's probably not a real issue in practice because this is about PID 1, 
> so i doubt it really matters, but still.
> 
> So how about the patch below?

Ack. As long as we don't have two versions and the code is impossible to 
look at.

			Linus

> 
> 	Ingo
> 
> ---------------->
> From 7b87da331b6ada44ccd5ffeedba76880c825d4fc Mon Sep 17 00:00:00 2001
> From: Ingo Molnar <mingo@...e.hu>
> Date: Mon, 13 Oct 2008 17:49:02 +0200
> Subject: [PATCH] x86/mm: unify init task OOM handling
> 
> Linus noticed that the "again:" versus "survive:" OOM logic for
> the init task was arbitrarily different.
> 
> The 64-bit codepath is the better one, because it correctly re-lookups
> the vma after having dropped the ->mmap_sem.
> 
> Signed-off-by: Ingo Molnar <mingo@...e.hu>
> ---
>  arch/x86/mm/fault.c |   15 ++++++---------
>  1 files changed, 6 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index ac2ad78..8bc5956 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -671,7 +671,8 @@ void __kprobes do_page_fault(struct pt_regs *regs, unsigned long error_code)
>  		goto bad_area_nosemaphore;
>  
>  again:
> -	/* When running in the kernel we expect faults to occur only to
> +	/*
> +	 * When running in the kernel we expect faults to occur only to
>  	 * addresses in user space.  All other faults represent errors in the
>  	 * kernel and should generate an OOPS.  Unfortunately, in the case of an
>  	 * erroneous fault occurring in a code path which already holds mmap_sem
> @@ -734,9 +735,6 @@ good_area:
>  			goto bad_area;
>  	}
>  
> -#ifdef CONFIG_X86_32
> -survive:
> -#endif
>  	/*
>  	 * If for any reason at all we couldn't handle the fault,
>  	 * make sure we exit gracefully rather than endlessly redo
> @@ -871,12 +869,11 @@ out_of_memory:
>  	up_read(&mm->mmap_sem);
>  	if (is_global_init(tsk)) {
>  		yield();
> -#ifdef CONFIG_X86_32
> -		down_read(&mm->mmap_sem);
> -		goto survive;
> -#else
> +		/*
> +		 * Re-lookup the vma - in theory the vma tree might
> +		 * have changed:
> +		 */
>  		goto again;
> -#endif
>  	}
>  
>  	printk("VM: killing process %s\n", tsk->comm);
> 