Date:	Wed, 16 Apr 2008 17:03:53 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Pekka Enberg <penberg@...helsinki.fi>,
	Christoph Lameter <clameter@....com>,
	linux-kernel@...r.kernel.org, Mel Gorman <mel@....ul.ie>,
	Nick Piggin <npiggin@...e.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"Rafael J. Wysocki" <rjw@...k.pl>, Yinghai.Lu@....com,
	apw@...dowen.org,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Arjan van de Ven <arjan@...radead.org>
Subject: Re: [patch] mm: sparsemem memory_present() memory corruption fix


* Ingo Molnar <mingo@...e.hu> wrote:

> ps. anyone who can correctly guess the method with which i found the
>     exact place that corrupted memory will get a free beer next time 
>     we meet :-)

the method was to notice that the slub_debug_slabs SLUB variable got 
corrupted from an expected value of 0 to a value of 0x1.

Then i added a simple brute-force function-tracer hook (in sched-devel) 
that checked when slub_debug_slabs went from 0 to 1, and which then 
printed a backtrace.
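
A minimal userspace sketch of this kind of brute-force check, not the actual sched-devel hook. It assumes a hook that is called by hand at the entry of every function (the kernel's function tracer does this automatically), and all names here (watched_value, trace_hook, find_corruptor and the three demo functions) are hypothetical. Because the hook only runs at function entry, the corruption is noticed on entry to the next function, and the function entered just before it is the suspect:

```c
#include <stdio.h>

/* Hypothetical stand-in for the watched kernel variable (slub_debug_slabs in the mail). */
static unsigned long watched_value;

static unsigned long last_seen;          /* value observed at the previous hook call */
static const char *prev_func = "(none)"; /* function entered before the current one */
static const char *suspect;              /* set once, when corruption is first noticed */

/* Called at the entry of every function; reports the first 0 -> non-zero change. */
static void trace_hook(const char *func)
{
    if (last_seen == 0 && watched_value != 0 && suspect == NULL) {
        suspect = prev_func;
        fprintf(stderr, "BUG: watched_value: %08lx (noticed entering %s, last entered: %s)\n",
                watched_value, func, prev_func);
    }
    last_seen = watched_value;
    prev_func = func;
}

static void harmless(void)   { trace_hook(__func__); }
static void corrupting(void) { trace_hook(__func__); watched_value = 1; /* the stray write */ }
static void later(void)      { trace_hook(__func__); }

/* Run the three functions in order and return the name of the suspected culprit. */
const char *find_corruptor(void)
{
    watched_value = 0;
    last_seen = 0;
    prev_func = "(none)";
    suspect = NULL;

    harmless();
    corrupting();
    later();        /* the hook fires here, one function entry after the stray write */
    return suspect;
}
```

Calling find_corruptor() from a main() names the function whose body performed the stray write. The sketch tracks the previously entered function explicitly; the kernel hook printed a backtrace instead.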

Since under CONFIG_FTRACE=y every kernel function calls this callback, 
it triggered immediately after the value got corrupted:

[    0.000000] console [earlyser0] enabled
[    0.000000] BUG: slub_debug_slabs: 00000001
[    0.000000] Pid: 0, comm: swapper Not tainted 2.6.25-rc9-sched-devel.git-x86-latest.git #982
[    0.000000]  [<c0177fba>] print_slub_debug_slabs+0x3a/0x40
[    0.000000]  [<c01050f7>] trace+0x8/0x11
[    0.000000]  [<c0cc929e>] ? mtrr_bp_init+0xe/0x320
[    0.000000]  [<c01050f7>] ? trace+0x8/0x11
[    0.000000]  [<c0cd7369>] ? memory_present+0x9/0x50
[    0.000000]  [<c0cc7a09>] ? find_max_pfn+0x99/0xb0
[    0.000000]  [<c0cc6af7>] setup_arch+0x217/0x470
[    0.000000]  [<c012c59b>] ? printk+0x1b/0x20
[    0.000000]  [<c0cc2b46>] start_kernel+0x96/0x3f0
[    0.000000]  [<c0cc22fd>] i386_start_kernel+0xd/0x10
[    0.000000]  =======================
[    0.000000] x86: PAT support disabled.

and the backtrace had all the guilty parties on stack - memory_present() 
[which was just called] and find_max_pfn()/setup_arch() - thanks to the 
new fuzzy "?" backtrace entries we print out in v2.6.25.

(i could also have printed out the current ftrace buffer, showing the 
history of all recent function calls that the kernel executed.)
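
To illustrate the idea of a buffer holding the history of recent function calls, here is a small userspace sketch of a fixed-size ring buffer. It is not the ftrace buffer implementation; HISTORY_LEN, record_call and recent_calls are hypothetical names chosen for this example, and the size is deliberately tiny:

```c
#define HISTORY_LEN 4   /* illustrative size only */

static const char *history[HISTORY_LEN];
static unsigned int history_count;   /* total number of entries ever recorded */

/* Record one function entry, overwriting the oldest slot once the buffer is full. */
void record_call(const char *func)
{
    history[history_count % HISTORY_LEN] = func;
    history_count++;
}

/* Copy the retained entries into out[], oldest first; returns how many were copied. */
unsigned int recent_calls(const char *out[HISTORY_LEN])
{
    unsigned int n = history_count < HISTORY_LEN ? history_count : HISTORY_LEN;
    unsigned int start = history_count - n;

    for (unsigned int i = 0; i < n; i++)
        out[i] = history[(start + i) % HISTORY_LEN];
    return n;
}
```

A hook like the one described above could call record_call() on every function entry and dump recent_calls() alongside the backtrace when the corruption is detected.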

	Ingo