Message-ID: <m1d54r3cbk.fsf@ebiederm.dsl.xmission.com>
Date:	Sat, 03 Feb 2007 00:55:11 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Arjan van de Ven <arjan@...radead.org>
Cc:	Andrew Morton <akpm@...l.org>, linux-kernel@...r.kernel.org,
	"Lu, Yinghai" <yinghai.lu@....com>,
	Luigi Genoni <luigi.genoni@...elli.com>,
	Ingo Molnar <mingo@...e.hu>,
	Natalie Protasevich <protasnb@...il.com>,
	Andi Kleen <ak@...e.de>
Subject: Re: [PATCH 2/2] x86_64 irq:  Handle irqs pending in IRR during irq migration.

Arjan van de Ven <arjan@...radead.org> writes:

>> > Once the migration operation is complete we know we will receive
>> > no more interrupts on this vector so the irq pending state for
>> > this irq will no longer be updated.  If the irq is not pending and
>> > we are in the intermediate state we immediately free the vector,
>> > otherwise we free the vector in do_IRQ when the pending irq
>> > arrives.
>> 
>> So is this a for-2.6.20 thing?  The bug was present in 2.6.19, so
>> I assume it doesn't affect many people?
>
> I got a few reports of this; irqbalance may trigger this kernel bug it
> seems... I would suggest to consider this for 2.6.20 since it's a
> hard-hang case


Yes.  The bug I fixed will not happen if you don't migrate irqs.

At the very least we want the patch below (already in -mm)
that makes it not a hard hang case.

Subject: [PATCH] x86_64:  Survive having no irq mapping for a vector

Occasionally the kernel has bugs that result in no irq being
found for a given cpu vector.  If we acknowledge the irq
the system has a good chance of continuing even though we
dropped an irq message.  If we continue to simply print a
message and drop the irq without acknowledging it, the system
is likely to become non-responsive shortly thereafter.

Signed-off-by: Eric W. Biederman <ebiederm@...ssion.com>
---
 arch/x86_64/kernel/irq.c |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86_64/kernel/irq.c b/arch/x86_64/kernel/irq.c
index 0c06af6..648055a 100644
--- a/arch/x86_64/kernel/irq.c
+++ b/arch/x86_64/kernel/irq.c
@@ -120,9 +120,14 @@ asmlinkage unsigned int do_IRQ(struct pt_regs *regs)
 
 	if (likely(irq < NR_IRQS))
 		generic_handle_irq(irq);
-	else if (printk_ratelimit())
-		printk(KERN_EMERG "%s: %d.%d No irq handler for vector\n",
-			__func__, smp_processor_id(), vector);
+	else {
+		if (!disable_apic)
+			ack_APIC_irq();
+
+		if (printk_ratelimit())
+			printk(KERN_EMERG "%s: %d.%d No irq handler for vector\n",
+				__func__, smp_processor_id(), vector);
+	}
 
 	irq_exit();
 
-- 
1.4.4.1.g278f
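
To illustrate why the missing acknowledgement turns into a hang, here
is a rough userspace model, not kernel code.  It assumes the usual
local APIC behaviour: a vector stays "in service" until it is
acknowledged, and while it is in service no vector of the same or
lower priority class (vector >> 4) is delivered.  All names in it
(model_apic, model_do_irq, model_run, the vector numbers) are made up
for the illustration; only the shape of the if/else mirrors the patch.

```c
#include <stdbool.h>
#include <string.h>

#define NR_VECTORS 256
#define NR_IRQS    224          /* hypothetical value, for the model only */

/* Hypothetical stand-in for the local APIC's in-service state. */
struct model_apic {
	bool in_service[NR_VECTORS];
};

static int vector_irq[NR_VECTORS];      /* -1 means "no irq mapped" */
static unsigned int handled[NR_IRQS];

/* Models the acknowledgement: clears the in-service state. */
static void model_ack(struct model_apic *apic, int vector)
{
	apic->in_service[vector] = false;
}

/*
 * Models delivery plus the dispatch step.  Returns false if the vector
 * could not be delivered because an unacknowledged vector of the same
 * or higher priority class is still in service.
 */
static bool model_do_irq(struct model_apic *apic, int vector, bool ack_unmapped)
{
	int cls = vector >> 4;
	int irq;

	for (int v = 0; v < NR_VECTORS; v++)
		if (apic->in_service[v] && (v >> 4) >= cls)
			return false;

	apic->in_service[vector] = true;
	irq = vector_irq[vector];

	if (irq >= 0 && irq < NR_IRQS) {
		handled[irq]++;
		model_ack(apic, vector);        /* normal path: handler acks */
	} else if (ack_unmapped) {
		model_ack(apic, vector);        /* patched path: ack anyway */
	}
	/* else: old path, vector stays in service forever */

	return true;
}

/*
 * One stray vector (0x39, unmapped) followed by three interrupts on a
 * mapped vector (0x31 -> irq 1) in the same priority class.  Returns
 * how many of the three reached their handler.
 */
static unsigned int model_run(bool ack_unmapped)
{
	struct model_apic apic;

	memset(&apic, 0, sizeof(apic));
	memset(handled, 0, sizeof(handled));
	for (int v = 0; v < NR_VECTORS; v++)
		vector_irq[v] = -1;
	vector_irq[0x31] = 1;

	model_do_irq(&apic, 0x39, ack_unmapped);
	for (int i = 0; i < 3; i++)
		model_do_irq(&apic, 0x31, ack_unmapped);

	return handled[1];
}
```

In this model model_run(false) leaves irq 1 starved after the stray
vector, while model_run(true) lets all three interrupts through, which
is the difference between the old and the patched else branch.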
