Message-ID: <20061006190039.GN2365@n6014avq19270.qlogic.org>
Date: Fri, 6 Oct 2006 12:00:39 -0700
From: Andrew Vasquez <andrew.vasquez@...gic.com>
To: Muli Ben-Yehuda <muli@...ibm.com>
Cc: "Eric W. Biederman" <ebiederm@...ssion.com>,
Ingo Molnar <mingo@...e.hu>,
Thomas Gleixner <tglx@...utronix.de>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Rajesh Shah <rajesh.shah@...el.com>, Andi Kleen <ak@....de>,
"Protasevich, Natalie" <Natalie.Protasevich@...SYS.com>,
"Luck, Tony" <tony.luck@...el.com>, Andrew Morton <akpm@...l.org>,
Linus Torvalds <torvalds@...l.org>,
Linux-Kernel <linux-kernel@...r.kernel.org>,
Badari Pulavarty <pbadari@...il.com>
Subject: Re: 2.6.19-rc1 genirq causes either boot hang or "do_IRQ: cannot handle IRQ -1"
On Fri, 06 Oct 2006, Muli Ben-Yehuda wrote:
> On Fri, Oct 06, 2006 at 05:50:21PM +0200, Muli Ben-Yehuda wrote:
>
> > > What happens if you boot with max_cpus=1?
> >
> > Trying it now... woohoo, it boots all the way and stays up!
>
> Ok, after verifying that maxcpus=1 causes the problematic changeset to
> boot, I also tried maxcpus=1 with the tip of the tree. I hit this NULL
> pointer dereference in profile_tick, with and without
> maxcpus=1. Disassembly says that get_irq_regs() is returning NULL,
> which may or may not be related to the genirq issue.
>
> kernel (hd0,1)/boot/calgary/bzImage root=/dev/sda2 console=tty0 console=ttyS1,19200 maxcpus=1
> [Linux-bzImage, setup=0x1c00, size=0x2e44df]
> initrd (hd0,1)/boot/calgary/aic94xxfw.initramfs.gz
> [Linux-initrd @ 0x37e3f000, 0x1b0188 bytes] savedefault
>
> [ 0.000000] Linux version 2.6.19-rc1mx (muli@...n) (gcc version 3.4.1) #154 SMP Fri Oct 6 17:57:51 IST 2006
> [ 0.000000] Command line: root=/dev/sda2 console=tty0 console=ttyS1,19200 maxcpus=1
> [ 0.000000] BIOS-provided physical RAM map:
...
> [ 169.111284] Memory: 6096436k/6684672k available (3788k kernel code, 193708k reserved, 2727k data, 276k init)
> [ 169.249201] Calibrating delay using timer specific routine.. 6346.40 BogoMIPS (lpj=12692802)
> [ 169.300193] Mount-cache hash table entries: 256
> [ 169.329043] CPU: Trace cache: 12K uops, L1 D cache: 16K
> [ 169.360565] CPU: L2 cache: 1024K
> [ 169.379968] using mwait in idle threads.
> [ 169.403574] CPU: Physical Processor ID: 0
> [ 169.427697] CPU: Processor Core ID: 0
> [ 169.449745] CPU0: Thermal monitoring enabled (TM1)
> [ 169.478556] Freeing SMP alternatives: 32k freed
> [ 169.505811] ACPI: Core revision 20060707
> [ 169.576566] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> [ 169.651600] Using local APIC timer interrupts.
> [ 169.709847] result 10425453
> [ 169.726643] Detected 10.425 MHz APIC timer.
> [ 169.753344] Brought up 1 CPUs
> [ 169.771342] Unable to handle kernel NULL pointer dereference at 0000000000000088 RIP:
> [ 169.804240] [<ffffffff8022de57>] profile_tick+0x34/0x6a
> [ 169.851061] PGD 0
> [ 169.863259] Oops: 0000 [1] SMP
> [ 169.882391] CPU 0
> [ 169.894607] Modules linked in:
> [ 169.913117] Pid: 1, comm: swapper Not tainted 2.6.19-rc1mx #154
> [ 169.948655] RIP: 0010:[<ffffffff8022de57>] [<ffffffff8022de57>] profile_tick+0x34/0x6a
> [ 169.996876] RSP: 0000:ffffffff808d8f78 EFLAGS: 00010046
> [ 170.028766] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [ 170.071615] RDX: ffff8100893f5f00 RSI: 0000000000000000 RDI: 0000000000000001
> [ 170.114451] RBP: ffffffff808d8f88 R08: 0000000000000002 R09: ffffffff8022d24a
> [ 170.157290] R10: ffffffff8022d24a R11: ffffffff80732780 R12: 0000000000000001
> [ 170.200134] R13: ffffffff808e8d75 R14: 0000000000000246 R15: 0000000000000012
> [ 170.242975] FS: 0000000000000000(0000) GS:ffffffff8085e000(0000) knlGS:0000000000000000
> [ 170.291576] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [ 170.326102] CR2: 0000000000000088 CR3: 0000000000201000 CR4: 00000000000006e0
> [ 170.368947] Process swapper (pid: 1, threadinfo ffff810197c7c000, task ffff810197c67040)
> [ 170.417545] Stack: ffffffff807327c0 0000000000000012 ffffffff808d8f98 ffffffff8021507e
> [ 170.466126] upt+0xe/0x54
> [ 170.616246] [<ffffffff80215108>] smp_apic_timer_interrupt+0x44/0x4b
> [ 170.654411] [<ffffffff8020a4fb>] apic_timer_interrupt+0x6b/0x70
> [ 170.690486] <EOI> [<ffffffff8022d24a>] release_console_sem+0x47/0x200
> [ 170.730351] [<ffffffff8022d24a>] release_console_sem+0x47/0x200
> [ 18a65>] atomic_notifier_chain_register+0x33/0x3e
> [ 170.937284] [<ffffffff808a3359>] spawn_softlockup_task+0x6a/0x6f
> [ 170.973876] [<ffffffff80207116>] init+0xce/0x30c
> [ 171.002152] [<ffffffff805afd00>] trace_hardirqs_on_thunk+0x35/0x37
> [ 171.039796] [<ffffffff80244952>] trace_hardirqs_on+0xf6/0x11a
> [ 171.074833] [<ffffffff8020a6e5>] child_rip+0xa/0x15
> [ 171.104669] [<ffffffff805b0484>] _spin_unlock_irq+0x29/0x2f
> [ 171.138667] [<ffffffff80209e5d>] restore_args+0x0/0x30
> [ 171.170065] [<ffffffff80207048>] init+0x0/0x30c
> [ 171.197821] [<ffffffff8020a6db>] child_rip+0x0/0x15
Hmm, I'm seeing a similar boot-up panic on my x86_64 box. Here's the
boot-output (.config is attached).
[ 0.000000] Linux version 2.6.19-rc1 (root@spe) (gcc version 4.1.0 (SUSE Linux)) #5 SMP Fri Oct 6 09:49:14 PDT 2006
[ 0.000000] Command line: root=/dev/sda2 vga=1 resume=/dev/sda1 console=ttyS0,115200 console=tty0 nmi_watchdog=1
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009b800 (usable)
[ 0.000000] BIOS-e820: 000000000009b800 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000d0000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 000000003ff10000 (usable)
[ 0.000000] BIOS-e820: 000000003ff10000 - 000000003ff17000 (ACPI data)
[ 0.000000] BIOS-e820: 000000003ff17000 - 000000003ff80000 (ACPI NVS)
[ 0.000000] BIOS-e820: 000000003ff80000 - 0000000040000000 (reserved)
[ 0.000000] BIOS-e820: 00000000e0000000 - 00000000e8000000 (reserved)
[ 0.000000] BIOS-e820: 00000000fec00000 - 00000000fec00400 (reserved)
[ 0.000000] BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
[ 0.000000] BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
[ 0.000000] end_pfn_map = 1048576
[ 0.000000] DMI present.
[ 0.000000] Zone PFN ranges:
[ 0.000000] DMA 0 -> 4096
[ 0.000000] DMA32 4096 -> 1048576
[ 0.000000] Normal 1048576 -> 1048576
[ 0.000000] early_node_map[2] active PFN ranges
[ 0.000000] 0: 0 -> 155
[ 0.000000] 0: 256 -> 261904
[ 0.000000] Nvidia board detected. Ignoring ACPI timer override.
[ 0.000000] ACPI: PM-Timer IO Port: 0x8008
[ 0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[ 0.000000] Processor #0 (Bootup-CPU)
[ 0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
[ 0.000000] Processor #1
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
[ 0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
[ 0.000000] ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
[ 0.000000] IOAPIC[0]: apic_id 2, address 0xfec00000, GSI 0-23
[ 0.000000] ACPI: IOAPIC (id[0x03] address[0xdf300000] gsi_base[24])
[ 0.000000] IOAPIC[1]: apic_id 3, address 0xdf300000, GSI 24-27
[ 0.000000] ACPI: IOAPIC (id[0x04] address[0xdf301000] gsi_base[28])
[ 0.000000] IOAPIC[2]: apic_id 4, address 0xdf301000, GSI 28-31
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
[ 0.000000] ACPI: BIOS IRQ0 pin2 override ignored.
[ 0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
[ 0.000000] Setting APIC routing to flat
[ 0.000000] Using ACPI (MADT) for SMP configuration information
[ 0.000000] Nosave address range: 000000000009b000 - 000000000009c000
[ 0.000000] Nosave address range: 000000000009c000 - 00000000000a0000
[ 0.000000] Nosave address range: 00000000000a0000 - 00000000000d0000
[ 0.000000] Nosave address range: 00000000000d0000 - 0000000000100000
[ 0.000000] Allocating PCI resources starting at 50000000 (gap: 40000000:a0000000)
[ 0.000000] PERCPU: Allocating 32512 bytes of per cpu data
[ 0.000000] Built 1 zonelists. Total pages: 256711
[ 0.000000] Kernel command line: root=/dev/sda2 vga=1 resume=/dev/sda1 console=ttyS0,115200 console=tty0 nmi_watchdog=1
[ 0.000000] Initializing CPU#0
[ 0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes)
[ 26.923435] Console: colour VGA+ 80x50
[ 27.230350] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
[ 27.238291] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
[ 27.245305] Checking aperture...
[ 27.248584] CPU 0: aperture @ 0 size 32 MB
[ 27.255250] No AGP bridge found
[ 27.276817] Memory: 1024988k/1047616k available (2405k kernel code, 22060k reserved, 955k data, 224k init)
[ 27.366087] Calibrating delay using timer specific routine.. 4423.17 BogoMIPS (lpj=8846344)
[ 27.374753] Mount-cache hash table entries: 256
[ 27.379624] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
[ 27.386804] CPU: L2 Cache: 1024K (64 bytes/line)
[ 27.391486] Freeing SMP alternatives: 24k freed
[ 27.396100] ACPI: Core revision 20060707
[ 27.445859] activating NMI Watchdog ... done.
[ 27.450314] Using local APIC timer interrupts.
[ 27.500022] result 12557931
[ 27.502863] Detected 12.557 MHz APIC timer.
[ 27.510539] Booting processor 1/2 APIC 0x1
[ 27.514684] Unable to handle kernel NULL pointer dereference at 0000000000000088 RIP:
[ 27.520204] [<ffffffff80225fb0>] profile_tick+0x40/0x90
[ 27.528118] PGD 0
[ 27.530222] Oops: 0000 [1] SMP
[ 27.533505] CPU 0
[ 27.535610] Modules linked in:
[ 27.538755] Pid: 1, comm: swapper Not tainted 2.6.19-rc1 #5
[ 27.544367] RIP: 0010:[<ffffffff80225fb0>] [<ffffffff80225fb0>] profile_tick+0x40/0x90
[ 27.552483] RSP: 0000:ffffffff8059ff78 EFLAGS: 00010046
[ 27.557842] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000001
[ 27.565024] RDX: ffff810081a77f40 RSI: 0000000000000046 RDI: 0000000000000001
[ 27.572203] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000007
[ 27.579383] R10: 0000000000000002 R11: ffffffff8032c8a0 R12: ffffffff805ae6a1
[ 27.586563] R13: 0000000000000012 R14: 0000000000000031 R15: 0000000000000246
[ 27.593744] FS: 0000000000000000(0000) GS:ffffffff80549000(0000) knlGS:0000000000000000
[ 27.601892] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[ 27.607678] CR2: 0000000000000088 CR3: 0000000000201000 CR4: 00000000000006e0
[ 27.614859] Process swapper (pid: 1, threadinfo ffff8100021fe000, task ffff8100021ee740)
[ 27.623008] Stack: ffffffff8059a6e0 ffffffff804e2700 ffff8100021ffb30 ffffffff80215c7e
[ 27.631313] 0000000000bf9e6b ffffffff802161b5 0000000000000000 ffffffff8020a6f6
[ 27.638953] ffff8100021ffb30 <EOI> 0000000000000000 ffffffff8032c8a0 0000000000000002
[ 27.647016] Call Trace:
[ 27.649745] <IRQ> [<ffffffff80215c7e>] smp_local_timer_interrupt+0xe/0x60
[ 27.656803] [<ffffffff802161b5>] smp_apic_timer_interrupt+0x35/0x40
[ 27.663205] [<ffffffff8020a6f6>] apic_timer_interrupt+0x66/0x70
[ 27.669256] <EOI> [<ffffffff8032c8a0>] vgacon_cursor+0x0/0x1c8
[ 27.675364] [<ffffffff802252fd>] vprintk+0x2fd/0x350
[ 27.680462] [<ffffffff805755a5>] init_idle+0x95/0xb0
[ 27.685558] [<ffffffff8022539e>] printk+0x4e/0x60
[ 27.690391] [<ffffffff8021ce7f>] complete+0x3f/0x60
[ 27.695396] [<ffffffff8056ed8f>] __cpu_up+0x40f/0x7d0
[ 27.700575] [<ffffffff80215a70>] do_fork_idle+0x0/0x20
[ 27.705843] [<ffffffff804573cf>] __mutex_lock_slowpath+0x1df/0x1f0
[ 27.712157] [<ffffffff802407f2>] cpu_up+0xa2/0x120
[ 27.717082] [<ffffffff802070bb>] init+0x9b/0x330
[ 27.721830] [<ffffffff804585d9>] _spin_unlock_irq+0x9/0x10
[ 27.727451] [<ffffffff802209bc>] schedule_tail+0x4c/0xc0
[ 27.732897] [<ffffffff8020a8e5>] child_rip+0xa/0x15
[ 27.737906] [<ffffffff8033171e>] acpi_ds_init_one_object+0x0/0x82
[ 27.744131] [<ffffffff80207020>] init+0x0/0x330
[ 27.748791] [<ffffffff8020a8db>] child_rip+0x0/0x15
[ 27.753795]
[ 27.755327]
[ 27.755328] Code: f6 83 88 00 00 00 03 75 37 65 8b 04 25 24 00 00 00 0f a3 05
[ 27.765157] RIP [<ffffffff80225fb0>] profile_tick+0x40/0x90
[ 27.770908] RSP <ffffffff8059ff78>
[ 27.774442] CR2: 0000000000000088
[ 27.777803] <1>Unable to handle kernel NULL pointer dereference at 0000000000000088 RIP:
[ 27.783718] [<ffffffff80225fb0>] profile_tick+0x40/0x90
Going to bisect now... Again, not sure if it's related to the IRQ
code.
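
A note on the faulting address: both oopses report CR2 = 0x88 with RBX = 0, and the Code: bytes in the second trace (f6 83 88 00 00 00 03) decode to "testb $0x3,0x88(%rbx)". That looks like a read of a field at offset 0x88 through a NULL struct pt_regs pointer, which would fit the observation above that get_irq_regs() returns NULL, and would be consistent with a user_mode(regs)-style test of the low two bits of regs->cs. The sketch below checks the offset arithmetic only. The struct is a stand-in that assumes the x86_64 pt_regs field order of this kernel generation (not copied from the tree, so verify against include/asm-x86_64/ptrace.h), and pt_regs_sketch and cs_access_address are hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stand-in for the x86_64 struct pt_regs of the 2.6.19 era.
 * ASSUMPTION: the field order below is reproduced from memory and
 * should be checked against the tree actually being debugged.
 */
struct pt_regs_sketch {
	uint64_t r15, r14, r13, r12, rbp, rbx;
	uint64_t r11, r10, r9, r8, rax, rcx, rdx, rsi, rdi;
	uint64_t orig_rax;
	uint64_t rip, cs, eflags, rsp, ss;
};

/* Under the assumed layout, cs is the 18th 8-byte slot: 17 * 8 = 0x88. */
static_assert(offsetof(struct pt_regs_sketch, cs) == 0x88,
	      "cs expected at offset 0x88 in the assumed layout");

/*
 * Address that a load of regs->cs would touch for a given value of the
 * regs pointer. With regs == NULL this is the offset of cs itself.
 */
static inline uintptr_t cs_access_address(uintptr_t regs_base)
{
	return regs_base + offsetof(struct pt_regs_sketch, cs);
}
```

With a NULL base, cs_access_address(0) gives 0x88 under this layout, matching the CR2 value in both reports. That only says the fault fits a NULL regs pointer; it does not say why the pointer is NULL at that point in boot.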
View attachment "conf2618" of type "text/plain" (24847 bytes)