Date:	Mon, 02 Aug 2010 23:00:13 -0700
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Dave Airlie <airlied@...il.com>
Cc:	Yinghai Lu <yinghai@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...e.hu>
Subject: Re: oops in ioapic_write_entry

Dave Airlie <airlied@...il.com> writes:

> On Tue, Aug 3, 2010 at 1:26 PM, Eric W. Biederman <ebiederm@...ssion.com> wrote:
>> Dave Airlie <airlied@...il.com> writes:

>>> Okay el6log is from a RHEL6 2.6.32 kernel, but it should give a good
>>> baseline, the 2.6.35 oops even earlier with all those options and is
>>> in the second attachment.
>>
>> It appears we have a smoking gun:
>>
>> For some reason setup_IO_APIC_irqs thinks we have at least 2 io_apics,
>> but we have only set up 1 io_apic.  Since io_apics need a kmap entry,
>> accessing an apic that hasn't been set up will definitely give a
>> page fault.  It sounds like something is stomping nr_ioapics.
>>
>> From: 2.6.35-debuglog
>> IOAPIC[0]: apic_id 8, version 17, address 0xfec00000, GSI 0-23
>> ....
>> IOAPIC[1]: Set routing entry (0-16 -> 0x51 -> IRQ 16 Mode:1 Active:1)
>>
>> Can we get your System.map of the failing kernel (so we can see what
>> is close to nr_ioapics), and could you add a print statement in
>> arch/x86/kernel/apic/io_apic.c:setup_IO_APIC_irqs to print nr_ioapics?
>>
>> I would be surprised if drm changes could have affected this.
>>
>
> Okay, from my debug addition it still only seems to have one ioapic

Thanks. I goofed reading that code.  I saw setup_IO_APIC_irq and made
the incorrect leap that said we came from setup_IO_APIC_irqs, when
in fact we are coming from io_apic_set_pci_routing.

So let's see if I can figure out why we are getting a bad apic_id.

For that I need to track back to pirq_enable_irq, which leads
me to IO_APIC_get_PCI_irq_vector.  The likely candidate is that we
simply are not finding the apicid that is present in the mp_irqs
entry that we decided to return.  The patch below should add
appropriate debugging and fix the lookup.
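
To make the suspected failure concrete, here is a minimal user-space
sketch of the lookup, not kernel code.  The struct, the table size and
the find_ioapic() helper are hypothetical names made up for this
illustration; only nr_ioapics, MP_APIC_ALL and the apic_id 8 from the
debug log come from the thread.  With one ioapic registered, a loop
that never hits "break" ends with apic == nr_ioapics == 1, which would
be consistent with the "IOAPIC[1]" line in the log.

```c
#include <assert.h>
#include <stdio.h>

#define MP_APIC_ALL  0xFF  /* wildcard destination id from the MP table */
#define MAX_IO_APICS 4     /* hypothetical table size for this model */

/* Hypothetical stand-in for the kernel's mp_ioapics[] entries. */
struct ioapic_model {
	unsigned int apicid;
};

/* One ioapic with apic_id 8, as in the 2.6.35 debug log. */
static struct ioapic_model ioapics[MAX_IO_APICS] = { { 8 } };
static int nr_ioapics = 1;

/*
 * Look up an ioapic by destination id.  Returns its index, or -1 when
 * the loop runs off the end without a match.  Without the explicit
 * range check the caller would go on to use index nr_ioapics, which
 * refers to an ioapic that was never set up.
 */
static int find_ioapic(unsigned int dstapic)
{
	int apic;

	assert(nr_ioapics <= MAX_IO_APICS);

	for (apic = 0; apic < nr_ioapics; apic++)
		if (ioapics[apic].apicid == dstapic ||
		    dstapic == MP_APIC_ALL)
			break;

	if (apic >= nr_ioapics) {
		fprintf(stderr, "APIC_ID %u not found\n", dstapic);
		return -1;
	}
	return apic;
}
```

Calling find_ioapic(8) returns 0, while an id that is absent from the
table returns -1 instead of silently yielding index 1.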

The real difference appears to be that acpi is disabled where it
is not disabled in your reference kernel.

Dave, can you verify this fixes the oops for you?

It would be nice if we didn't crash early in boot even without
acpi present.

Eric

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index e41ed24..e824e14 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1067,7 +1067,7 @@ static int pin_2_irq(int idx, int apic, int pin)
 int IO_APIC_get_PCI_irq_vector(int bus, int slot, int pin,
 				struct io_apic_irq_attr *irq_attr)
 {
-	int apic, i, best_guess = -1;
+	int i, best_guess = -1;
 
 	apic_printk(APIC_DEBUG,
 		    "querying PCI -> IRQ mapping bus:%d, slot:%d, pin:%d.\n",
@@ -1080,16 +1080,29 @@ int IO_APIC_get_PCI_irq_vector(int bus, int slot, int pin,
 	for (i = 0; i < mp_irq_entries; i++) {
 		int lbus = mp_irqs[i].srcbus;
 
-		for (apic = 0; apic < nr_ioapics; apic++)
-			if (mp_ioapics[apic].apicid == mp_irqs[i].dstapic ||
-			    mp_irqs[i].dstapic == MP_APIC_ALL)
-				break;
-
 		if (!test_bit(lbus, mp_bus_not_pci) &&
 		    !mp_irqs[i].irqtype &&
 		    (bus == lbus) &&
 		    (slot == ((mp_irqs[i].srcbusirq >> 2) & 0x1f))) {
-			int irq = pin_2_irq(i, apic, mp_irqs[i].dstirq);
+			int apic;
+			int irq;
+
+			/* Lookup the ioapic by id */
+			for (apic = 0; apic < nr_ioapics; apic++)
+				if (mp_ioapics[apic].apicid == mp_irqs[i].dstapic ||
+					mp_irqs[i].dstapic == MP_APIC_ALL)
+					break;
+
+			/* Verify we found the ioapic */
+			if (apic >= nr_ioapics) {
+				printk(KERN_ERR 
+					"%02x:%02x.%c: APIC_ID %u pin: %u not found BIOS bug?\n",
+					bus, slot, 'A' + pin - 1,
+					mp_irqs[i].dstapic, mp_irqs[i].dstirq);
+				continue;
+			}
+
+			irq = pin_2_irq(i, apic, mp_irqs[i].dstirq);
 
 			if (!(apic || IO_APIC_IRQ(irq)))
 				continue;
@@ -1099,7 +1112,8 @@ int IO_APIC_get_PCI_irq_vector(int bus, int slot, int pin,
 						     mp_irqs[i].dstirq,
 						     irq_trigger(i),
 						     irq_polarity(i));
-				return irq;
+				best_guess = irq;
+				goto out;
 			}
 			/*
 			 * Use the first all-but-pin matching entry as a
@@ -1114,6 +1128,12 @@ int IO_APIC_get_PCI_irq_vector(int bus, int slot, int pin,
 			}
 		}
 	}
+out:
+	if (best_guess >= 0)
+		apic_printk(APIC_DEBUG,
+			"%02x:%02x.%c: IRQ %u IOAPIC: %u pin: %u\n",
+			bus, slot, 'A' + pin - 1,
+			best_guess, irq_attr->ioapic, irq_attr->ioapic_pin);
 	return best_guess;
 }
 EXPORT_SYMBOL(IO_APIC_get_PCI_irq_vector);