lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 16 Apr 2008 15:05:13 +0100
From:	Mel Gorman <mel@....ul.ie>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Pekka Enberg <penberg@...helsinki.fi>,
	Christoph Lameter <clameter@....com>,
	linux-kernel@...r.kernel.org, Nick Piggin <npiggin@...e.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"Rafael J. Wysocki" <rjw@...k.pl>, Yinghai.Lu@....com,
	apw@...dowen.org,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Subject: Re: [patch] mm: sparsemem memory_present() memory corruption fix

On (16/04/08 02:03), Ingo Molnar didst pronounce:
> 
> finally found it ... the patch below solves the sparsemem crash and the 
> testsystem boots up fine now:
> 
>   mars:~> uname -a
>   Linux mars 2.6.25-rc9-sched-devel.git-x86-latest.git #985 SMP Wed Apr 16 
>   01:37:37 CEST 2008 i686 i686 i386 GNU/Linux
> 
> yay! :-)
> 

Very cool :) This fixed the silent lock-up that I was getting when using
your config as well.

At a bit of a loss yesterday to explain what was going wrong, I had started
putting together patches to sanity check memory initialisation at various
different stages trying to catch where things were going pear-shaped. You
found the bug before it was done but I finished the basics anyway and posted
it as "[RFC] Verification and debugging of memory initialisation". Something
like it may help avoid similar headaches for people who tend to run into
(or cause) boot problems.

> ps. anyone who can correctly guess the method with which i found the 
>     exact place that corrupted memory will get a free beer next time we 
>     meet :-)
> 
> ------------------------->
> Subject: mm: sparsemem memory_present() memory corruption fix
> From: Ingo Molnar <mingo@...e.hu>
> Date: Wed Apr 16 01:40:00 CEST 2008
> 
> fix memory corruption and crash on 32-bit x86 systems.
> 
> if a !PAE x86 kernel is booted on a 32-bit system with more than
> 4GB of RAM, then we call memory_present() with a start/end that
> goes outside the scope of MAX_PHYSMEM_BITS.
> 
> that causes this loop to happily walk over the limit of the
> sparse memory section map:
> 
>     for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
>                 unsigned long section = pfn_to_section_nr(pfn);
>                 struct mem_section *ms;
> 
>                 sparse_index_init(section, nid);
>                 set_section_nid(section, nid);
> 
>                 ms = __nr_to_section(section);
>                 if (!ms->section_mem_map)
>                         ms->section_mem_map = sparse_encode_early_nid(nid) |
> 
> 'ms' will be out of bounds and we'll corrupt a small amount of memory by
> encoding the node ID. Depending on what that memory is, we might crash, 
> misbehave or just not notice the bug.
> 
> the fix is to sanity check anything the architecture passes to sparsemem.
> 
> this bug seems to be rather old (as old as sparsemem support itself),
> but the exact incarnation depended on random details like configs,
> which made this bug more prominent in v2.6.25-to-be.
> 
> an additional enhancement might be to print a warning about ignored
> or trimmed memory ranges.
> 
> Signed-off-by: Ingo Molnar <mingo@...e.hu>
> ---
>  mm/sparse.c |   10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> Index: linux/mm/sparse.c
> ===================================================================
> --- linux.orig/mm/sparse.c
> +++ linux/mm/sparse.c
> @@ -149,8 +149,18 @@ static inline int sparse_early_nid(struc
>  /* Record a memory area against a node. */
>  void __init memory_present(int nid, unsigned long start, unsigned long end)
>  {
> +	unsigned long max_arch_pfn = 1ULL << (MAX_PHYSMEM_BITS-PAGE_SHIFT);
>  	unsigned long pfn;
>  
> +	/*
> +	 * Sanity checks - do not allow an architecture to pass
> +	 * in larger pfns than the maximum scope of sparsemem:
> +	 */
> +	if (start >= max_arch_pfn)
> +		return;
> +	if (end >= max_arch_pfn)
> +		end = max_arch_pfn;
> +
>  	start &= PAGE_SECTION_MASK;
>  	for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
>  		unsigned long section = pfn_to_section_nr(pfn);
> 

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ