linux-kernel - Re: 2.6.18-rc6-mm1: GPF loop on early boot

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20060910140214.GA1578@elte.hu>
Date:	Sun, 10 Sep 2006 16:02:14 +0200
From:	Ingo Molnar <mingo@...e.hu>
To:	Andi Kleen <ak@...e.de>
Cc:	Laurent Riffard <laurent.riffard@...e.fr>,
	Arjan van de Ven <arjan@...radead.org>,
	Andrew Morton <akpm@...l.org>,
	Kernel development list <linux-kernel@...r.kernel.org>,
	Jeremy Fitzhardinge <jeremy@...source.com>
Subject: Re: 2.6.18-rc6-mm1: GPF loop on early boot

* Andi Kleen <ak@...e.de> wrote:

> > ugh, "having a working current" is "so much infrastructure" ??
> 
> Together with stacktrace the infrastructure needed is quite 
> considerable.

Having the ability to produce a backtrace _anywhere within the kernel_ 
is a pretty fundamental thing too. So IMO the problems with stacktrace 
were the stack dumping code's weaknesses, independent of lockdep, and 
they would have triggered in one way or another as well. Not that i 
think that doing a dwarf unwinder is simple - and you certainly did a 
great job with it. Now with stacktrace we do have a pretty powerful 
infrastructure to do transparent execution tracing: for example the 
page-tracing code could be changed to use stacktrace.

> > the i686 PDA patches introduce tons of early_current() uses. While i
> > like the new PDA code, its bootstrap (like x86_64's PDA bootstrap) is
> > too fragile in my opinion, and it will regularly hit instrumenting
> > patches.
> 
> Or the instrumentation patches just always need to check some global 
> variable. Maybe system_state could be extended or something.

maybe, but i find that unrobust too. Example: i debugged many 
early-bootup hangs via the use of an early_printk() based function 
execution tracer (a feature of the latency tracing patchset) - the lack 
of working 'current' would hit that code too.

Yes, it can all be worked around, but the fundamental weakness IMO is 
that we dont have a working 'current' in all situations.

> > should be moved into 32-bit early boot userspace code (like
> > compressed/misc.c) and it will thus not depend on any kernel
> > infrastructure.
> 
> Ok I guess it would make sense to add a i386_start_kernel to i386 and 
> initialize the boot PDA there. I would also move early_cpu_init into 
> there because that also avoids quite some mess later.

yes. And move all of this _outside_ the normal kernel build process, so 
that whatever ad-hoc or less ad-hoc instrumentation is added (be that 
PROFILE_UNLIKELY, function tracing, 'get_current' based hacks, irqtrace, 
lockdep or whatever) it wont be included in that code. That gets rid of
most of the dependency and bootstrap problems. It's hard enough already
to debug early boot code.

	Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/