Message-ID: <m1pqqqfpzh.fsf@fess.ebiederm.org>
Date: Thu, 17 Feb 2011 10:57:54 -0800
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Ingo Molnar <mingo@...e.hu>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Michal Hocko <mhocko@...e.cz>, linux-mm@...ck.org,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: BUG: Bad page map in process udevd (anon_vma: (null)) in 2.6.38-rc4

Ingo Molnar <mingo@...e.hu> writes:
> * Linus Torvalds <torvalds@...ux-foundation.org> wrote:
>
>> And in addition, I don't see why others wouldn't see it (I've got
>> DEBUG_PAGEALLOC and SLUB_DEBUG_ON turned on myself, and I know others
>> do too).
>
> I've done extensive randconfig testing and no crash triggers for typical
> workloads on a typical dual-core PC. If there are generic crashes in
> there, my tests tend to trigger them at least 10x as often as regular
> testers ;-) But the tests are still only statistical, so the race could
> simply be special and missed by the tests.
>
>> So I'm wondering what triggers it. Must be something subtle.
>
> I think what Michal did before he got the corruption seemed somewhat
> atypical: suspend/resume and udevd wifi twiddling, right?
>
> Now, Eric's crashes look similar - and he does not seem to have done
> anything special to trigger the crashes.
>
> Eric, could you possibly describe your system in a bit more detail:
> does it do suspend, and does the box use wifi actively? Anything
> atypical in your setup or usage that doesn't match a bog-standard
> whitebox PC with LAN? Swap to file? NFS? FUSE? Anything that is even
> just borderline atypical.

10G RAM
2G swap
Dual-socket system
4 cores per socket
No hyperthreading
Fedora 14
ext4 on all filesystems

The biggest difference is that I beat the system to death with automated
builds.

I was about to say this happens with DEBUG_PAGEALLOC enabled, but that
option keeps eluding my fingers when I have a few minutes to play with
it. Perhaps this time will be the charm.
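
For reference, the debug options in question are these .config switches
(names as in the 2.6.38-era tree; the exact Kconfig menu locations may
differ):

    CONFIG_DEBUG_PAGEALLOC=y
    CONFIG_SLUB_DEBUG=y
    CONFIG_SLUB_DEBUG_ON=y

CONFIG_SLUB_DEBUG_ON is equivalent to booting an already-built kernel
with slub_debug on the command line.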

Another big difference may be that I am constantly stressing the system
to the edge of triggering the OOM killer; my builds and tests are
greedy when it comes to memory.
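
That isn't my actual workload, but a minimal sketch of that kind of
memory pressure looks something like this (the file name and chunk size
are made up for illustration):

/* memhog.c: allocate memory in 64M chunks, touching every page, until
 * malloc() fails or the OOM killer steps in.
 * Build with: gcc -O2 -o memhog memhog.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK	(64UL << 20)	/* 64 MiB per allocation */

int main(void)
{
	unsigned long total = 0;
	char *p;

	while ((p = malloc(CHUNK)) != NULL) {
		memset(p, 0xaa, CHUNK);	/* fault in every page */
		total += CHUNK;
		fprintf(stderr, "allocated %lu MiB\n", total >> 20);
	}
	pause();	/* sit on the memory until killed */
	return 0;
}

Run a couple of those alongside parallel builds and the box should sit
right at that edge the whole time.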

I guess I should also say that I only see the bad PMD on processes that
exit, so seeing it at all may be a matter of timing.
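
For what it's worth, the "bad pmd" report comes out of the generic
page-table walkers; roughly, quoting from memory from
include/asm-generic/pgtable.h of that era:

/* The unmap path calls this for each PMD it walks; a corrupt entry is
 * reported via pmd_clear_bad() -> pmd_ERROR() (the "bad pmd" message)
 * and then cleared so the walk can continue. */
static inline int pmd_none_or_clear_bad(pmd_t *pmd)
{
	if (pmd_none(*pmd))
		return 1;
	if (unlikely(pmd_bad(*pmd))) {
		pmd_clear_bad(pmd);
		return 1;
	}
	return 0;
}

A corrupted entry is only noticed when something finally walks that
range, and for these processes that is the teardown at exit.
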
Eric