Message-ID: <AANLkTikKT9dE2FVDxroo1y5iEcsvx54e4JFbjFlWxcPM@mail.gmail.com>
Date:	Fri, 16 Jul 2010 09:29:42 +0200
From:	Zeno Davatz <zdavatz@...il.com>
To:	Pekka Enberg <penberg@...helsinki.fi>
Cc:	Damien Wyart <damien.wyart@...e.fr>,
	Catalin Marinas <catalin.marinas@....com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"x86@...nel.org" <x86@...nel.org>, "mingo@...e.hu" <mingo@...e.hu>,
	"yinghai@...nel.org" <yinghai@...nel.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: kmemleak, cpu usage jump out of nowhere

On Thu, Jul 15, 2010 at 10:50 PM, Pekka Enberg <penberg@...helsinki.fi> wrote:
> On Thu, Jul 15, 2010 at 11:38 PM, Zeno Davatz <zdavatz@...il.com> wrote:
>> On 15.07.2010 at 22:00, Damien Wyart <damien.wyart@...e.fr> wrote:
>>
>>>>> For now, I can't reproduce the problem with CONFIG_NO_BOOTMEM disabled;
>>>>> with the option and rc5 the problem was happening quite quickly after
>>>>> boot and normal use of the machine. So it seems I can confirm what Zeno
>>>>> has seen and I hope this will give a hint to debug the problem. I guess
>>>>> this has not been reported that much because many testers might not have
>>>>> enabled CONFIG_NO_BOOTMEM... Maybe the scheduler folks could test their
>>>>> benchmark with a kernel having this option enabled?
>>>
>>> * Pekka Enberg <penberg@...helsinki.fi> [2010-07-15 22:50]:
>>>> To be honest, the bug is a bit odd. It's related to boot-time memory
>>>> allocator changes, yet it seems to manifest itself as a scheduling
>>>> problem. So if you have some spare time and want to speed up the
>>>> debugging process, please test v2.6.34 and v2.6.35-rc1 with
>>>> CONFIG_NO_BOOTMEM and, if the former is good and the latter is bad,
>>>> try to identify the offending commit with "git bisect".
>>>
>>> Not sure I will have enough time in the coming days (doing that remotely
>>> is fishy since ssh access is almost stuck when the problem occurs); if
>>> Zeno can and would like to do it, maybe this could be done faster.
>>>
>>> As the scheduler is now very well instrumented (many debugging features
>>> are available), reproducing the bug on a test platform (it happens quite
>>> quickly for me) might also give some hints. So testers, if you have
>>> time, please test 2.6.35-rc5 with CONFIG_NO_BOOTMEM on a Core i7 and see
>>> if you can reproduce the problem!
>>
>> Will try to do so. Can you point me to the git bisect howto, with the versions you want me to test?
>
> Cool. So like I said, you first want to test 2.6.34 to find a known
> good version. Please remember to make sure you have CONFIG_NO_BOOTMEM
> enabled. You can also try to speed up the process by testing
> 2.6.35-rc1 which is likely to include the offending commit. That's not
> strictly necessary as long as you are sure that you have some
> 2.6.35-rc kernel that's bad.
>
> After that, bisecting is as simple as:
>
>  git bisect start
>  git bisect good v2.6.34
>  git bisect bad v2.6.35-rc1 # or some other kernel you know to be bad
>  <compile, boot, and try to trigger the problem>
>
> then
>
>  git bisect bad # if you were able to trigger the problem
>
> or
>
>  git bisect good # if the problem doesn't exist
>
> git will then check out the next revision to test, after which you do
>
>  <compile, boot, and try to trigger the problem>
>
> and repeat the "git bisect good/bad" step until git tells you it has
> found the offending commit.
>
> There's more information on the git bisect man pages:
>
> http://www.kernel.org/pub/software/scm/git/docs/git-bisect.html
>
> Let me know if you need more help with this.
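
The good/bad loop described above is essentially a binary search over the
commits between the known-good and the known-bad kernel. The following is
only a minimal sketch of that idea, not git's actual implementation: it
assumes a strictly linear history (real kernel history has merges, which
git handles differently), and the names bisect_first_bad, history and the
"c700" culprit are hypothetical, chosen purely for illustration.

```python
def bisect_first_bad(commits, is_bad):
    """Find the first bad commit in a linear history.

    commits[0] must be known good and commits[-1] known bad.
    is_bad(commit) stands in for "compile, boot, and try to trigger
    the problem". Returns (first_bad_commit, number_of_tests_run).
    """
    lo = 0                 # index of the newest commit known to be good
    hi = len(commits) - 1  # index of the oldest commit known to be bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid       # corresponds to "git bisect bad"
        else:
            lo = mid       # corresponds to "git bisect good"
    return commits[hi], tests


# Hypothetical history of 1025 commits in which commit 700 introduced the bug.
history = [f"c{i}" for i in range(1025)]
first_bad, tests = bisect_first_bad(history, lambda c: int(c[1:]) >= 700)
print(first_bad, tests)
```

The point of the sketch is that the number of compile-and-boot cycles grows
only with the logarithm of the number of commits in the range, so even a
range of about a thousand commits needs roughly ten test kernels.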

This one also causes a panic:

http://www.flickr.com/photos/zrr/4798092747/in/photostream/

but this version boots just fine again:

Linux zenogentoo 2.6.34-05459-gac3ee84 #102 SMP Fri Jul 16 09:22:25
CEST 2010 i686 Intel(R) Core(TM) i7 CPU 960 @ 3.20GHz GenuineIntel
GNU/Linux

Best
Zeno