linux-kernel - Re: Regression in linux 2.6.37: failure on remount / (ext4) rw

Message-ID: <AANLkTimyKXfJ1x8tgwrr1hYnNLrPfgE1NTe4z7L6tUDm@mail.gmail.com>
Date:	Mon, 17 Jan 2011 07:32:10 -0500
From:	Brian Gerst <brgerst@...il.com>
To:	Matthias Merz <linux@...z-ka.de>
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
	LKML <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...e.hu>
Subject: Re: Regression in linux 2.6.37: failure on remount / (ext4) rw

On Fri, Jan 14, 2011 at 9:04 AM, Matthias Merz <linux@...z-ka.de> wrote:
> Hello,
>
> On Wed, 12.01.2011 09:03, Pekka Enberg wrote:
>> On Tue, Jan 11, 2011 at 3:09 PM, Matthias Merz <linux@...z-ka.de> wrote:
>> > On Tue, 11.01.2011 09:50, Pekka Enberg wrote:
>> >> On Tue, Jan 11, 2011 at 12:31 AM, Matthias Merz <linux@...z-ka.de> wrote:
>> >> > This morning I tried vanilla 2.6.37 on my desktop system, which
>> >> > failed to boot and kept printing debug messages too fast
>> >> > to read. Using netconsole I was then able to capture them [see
>> >> > below]. I was able to trigger this bug even with init=/bin/bash
>> >> > by a simple call of "mount -o remount,rw /", with my / being an
>> >> > ext4 filesystem.
>> > [erroneous bisecting] I assume some "hardware state" influences
>> > the triggering of this bug
>
>> Would it be possible for you to try to bisect it again? The oops you
>> report looks slightly obscure (at least to me) so it might be
>> difficult to find someone to fix it.
>
> Calling back after some time. I now seem to have worked out a way to
> tell which versions are bad: after booting a "good" version, a
> power-down of several minutes (about 15 or so) is needed, or
> every version will appear "good". So I checked by first booting a "known
> bad" 2.6.37. If that boot failed, I booted the version I wished to
> check, which seems to have produced usable results. So I am pretty
> convinced that something during "hardware setup" has changed which
> survives a normal reset due to capacitances not fully discharged or
> something like that.
>
>
> git bisect now told me "22d4cd4c4dce6d7b7d9a7e396aa4f87fe7a649b1 is the
> first bad commit", which is titled: "x86-32: Allocate irq stacks
> seperate from percpu area".
>
> I reverted this change (and following 47f19a0814 due to #defines) and
> waited over the night until this morning. That revert really seems to
> fix my problem. So maybe in my special case something goes wrong with
> the new method?

Does this patch fix the problem?

Subject: [PATCH] x86: Clear irqstack thread_info

Make sure that the thread_info part of the irqstack is initialized
to zeroes.

Signed-off-by: Brian Gerst <brgerst@...il.com>
---
 arch/x86/kernel/irq_32.c |    7 ++-----
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/irq_32.c b/arch/x86/kernel/irq_32.c
index 48ff6dc..9974d21 100644
--- a/arch/x86/kernel/irq_32.c
+++ b/arch/x86/kernel/irq_32.c
@@ -129,8 +129,7 @@ void __cpuinit irq_ctx_init(int cpu)
 	irqctx = page_address(alloc_pages_node(cpu_to_node(cpu),
 					       THREAD_FLAGS,
 					       THREAD_ORDER));
-	irqctx->tinfo.task		= NULL;
-	irqctx->tinfo.exec_domain	= NULL;
+	memset(&irqctx->tinfo, 0, sizeof(struct thread_info));
 	irqctx->tinfo.cpu		= cpu;
 	irqctx->tinfo.preempt_count	= HARDIRQ_OFFSET;
 	irqctx->tinfo.addr_limit	= MAKE_MM_SEG(0);
@@ -140,10 +139,8 @@ void __cpuinit irq_ctx_init(int cpu)
 	irqctx = page_address(alloc_pages_node(cpu_to_node(cpu),
 					       THREAD_FLAGS,
 					       THREAD_ORDER));
-	irqctx->tinfo.task		= NULL;
-	irqctx->tinfo.exec_domain	= NULL;
+	memset(&irqctx->tinfo, 0, sizeof(struct thread_info));
 	irqctx->tinfo.cpu		= cpu;
-	irqctx->tinfo.preempt_count	= 0;
 	irqctx->tinfo.addr_limit	= MAKE_MM_SEG(0);

 	per_cpu(softirq_ctx, cpu) = irqctx;
-- 
1.7.3.4