Message-ID: <alpine.LFD.1.10.0808231257310.3363@nehalem.linux-foundation.org>
Date: Sat, 23 Aug 2008 13:10:45 -0700 (PDT)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: "Rafael J. Wysocki" <rjw@...k.pl>
cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Kernel Testers List <kernel-testers@...r.kernel.org>,
"Alan D. Brunelle" <Alan.Brunelle@...com>,
Andrew Morton <akpm@...ux-foundation.org>,
Arjan van de Ven <arjan@...ux.intel.com>,
Rusty Russell <rusty@...tcorp.com.au>
Subject: Re: [Bug #11342] Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected
On Sat, 23 Aug 2008, Rafael J. Wysocki wrote:
>
> The following bug entry is on the current list of known regressions
> from 2.6.26. Please verify if it still should be listed and let me know
> (either way).
>
>
> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=11342
> Subject : Linux 2.6.27-rc3: kernel BUG at mm/vmalloc.c - bisected
> Submitter : Alan D. Brunelle <Alan.Brunelle@...com>
> Date : 2008-08-13 23:03 (11 days old)
> References : http://marc.info/?l=linux-kernel&m=121866876027629&w=4
> Handled-By : Andrew Morton <akpm@...ux-foundation.org>
This one makes no sense. It's triggering a BUG_ON(in_interrupt()), but
then the call chain shows that there is no interrupt going on.
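
(For readers following along: a minimal userspace sketch of what that check looks at, assuming in_interrupt() is just a mask test on the per-task preempt counter. The model_ names and the bit widths are assumptions for illustration, not the kernel's actual headers. The point is that the check only inspects counter bits, so a nonzero softirq or hardirq field left behind by something else would trip it even with no interrupt anywhere in the call chain.)

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the preempt counter layout. The widths are
 * assumptions for illustration: low bits count preempt-disable
 * nesting, then softirq nesting, then hardirq nesting.
 */
#define MODEL_PREEMPT_BITS   8
#define MODEL_SOFTIRQ_BITS   8
#define MODEL_HARDIRQ_BITS  12

#define MODEL_SOFTIRQ_SHIFT  MODEL_PREEMPT_BITS
#define MODEL_HARDIRQ_SHIFT  (MODEL_SOFTIRQ_SHIFT + MODEL_SOFTIRQ_BITS)

#define MODEL_SOFTIRQ_MASK \
	(((1UL << MODEL_SOFTIRQ_BITS) - 1) << MODEL_SOFTIRQ_SHIFT)
#define MODEL_HARDIRQ_MASK \
	(((1UL << MODEL_HARDIRQ_BITS) - 1) << MODEL_HARDIRQ_SHIFT)

/*
 * Model of in_interrupt(): true whenever any softirq or hardirq bit
 * is set in the counter, regardless of how it came to be set.
 */
static inline bool model_in_interrupt(unsigned long preempt_count)
{
	return (preempt_count & (MODEL_HARDIRQ_MASK | MODEL_SOFTIRQ_MASK)) != 0;
}
```

In this model a counter of 0 or 1 (plain process context, or preemption disabled once) does not trip the check, while any value with a bit set at MODEL_SOFTIRQ_SHIFT or above does.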
Also, the bisection is senseless - there's a trivial change wrt
"do_one_initcall()" that got merged, but everything else is trivial about
lguest and has nothing to do with the whole CPU-init thing. But if it was
that initcall one, then "git bisect" would have pointed to it, not the
merge. And the merge itself had no conflicts or anything else going on..
The fact that it came and went later also implies that it's probably just
some timing-dependent thing or some subtle memory corruption, making the
bisection result even less likely to be exact.
But I'm adding Arjan and Rusty to the Cc, because that merge was taking
Rusty's branch, and the "do_one_initcall()" is Arjan's commit. Since
undoing that merge apparently does fix it, I'm wondering if something
there just does end up triggering the problem.
The do_one_initcall() thing _is_ in the path of sys_init_module(), so it
_is_ at least somewhat relevant from an oops standpoint.
One thing the "do_one_initcall()" thing does is to put more pressure on the
stack due to that whole buffer for the printk's going on.
Alan, can you try
 - seeing how consistent it is with one kernel (ie boot a known-bad kernel
   a few times just to see if it really is 100% consistent)
 - try enabling 'initcall_debug' on the kernel command line, to (a) see
   the new code actually do something and (b) see what it is actually
   calling just before.
Hmm..
Linus