Date:	Fri, 6 Oct 2006 12:00:39 -0700
From:	Andrew Vasquez <andrew.vasquez@...gic.com>
To:	Muli Ben-Yehuda <muli@...ibm.com>
Cc:	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Rajesh Shah <rajesh.shah@...el.com>, Andi Kleen <ak@....de>,
	"Protasevich, Natalie" <Natalie.Protasevich@...SYS.com>,
	"Luck, Tony" <tony.luck@...el.com>, Andrew Morton <akpm@...l.org>,
	Linus Torvalds <torvalds@...l.org>,
	Linux-Kernel <linux-kernel@...r.kernel.org>,
	Badari Pulavarty <pbadari@...il.com>
Subject: Re: 2.6.19-rc1 genirq causes either boot hang or "do_IRQ: cannot handle IRQ -1"

On Fri, 06 Oct 2006, Muli Ben-Yehuda wrote:

> On Fri, Oct 06, 2006 at 05:50:21PM +0200, Muli Ben-Yehuda wrote:
> 
> > > What happens if you boot with max_cpus=1?
> > 
> > Trying it now... woohoo, it boots all the way and stays up!
> 
> Ok, after verifying that maxcpus=1 causes the problematic changeset to
> boot, I also tried maxcpus=1 with the tip of the tree. I hit this NULL
> pointer dereference in profile_tick, with and without
> maxcpus=1. Disassembly says that get_irq_regs() is returning NULL,
> which may or may not be related to the genirq issue.
> 
> kernel (hd0,1)/boot/calgary/bzImage root=/dev/sda2 console=tty0 console=ttyS1,19200 maxcpus=1
>    [Linux-bzImage, setup=0x1c00, size=0x2e44df]
> initrd (hd0,1)/boot/calgary/aic94xxfw.initramfs.gz
>    [Linux-initrd @ 0x37e3f000, 0x1b0188 bytes] savedefault
>                                                                                 
> [    0.000000] Linux version 2.6.19-rc1mx (muli@...n) (gcc version 3.4.1) #154 SMP Fri Oct 6 17:57:51 IST 2006
> [    0.000000] Command line: root=/dev/sda2 console=tty0 console=ttyS1,19200 maxcpus=1
> [    0.000000] BIOS-provided physical RAM map:
...
> [  169.111284] Memory: 6096436k/6684672k available (3788k kernel code, 193708k reserved, 2727k data, 276k init)
> [  169.249201] Calibrating delay using timer specific routine.. 6346.40 BogoMIPS (lpj=12692802)
> [  169.300193] Mount-cache hash table entries: 256
> [  169.329043] CPU: Trace cache: 12K uops, L1 D cache: 16K
> [  169.360565] CPU: L2 cache: 1024K
> [  169.379968] using mwait in idle threads.
> [  169.403574] CPU: Physical Processor ID: 0
> [  169.427697] CPU: Processor Core ID: 0
> [  169.449745] CPU0: Thermal monitoring enabled (TM1)
> [  169.478556] Freeing SMP alternatives: 32k freed
> [  169.505811] ACPI: Core revision 20060707
> [  169.576566] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
> [  169.651600] Using local APIC timer interrupts.
> [  169.709847] result 10425453
> [  169.726643] Detected 10.425 MHz APIC timer.
> [  169.753344] Brought up 1 CPUs
> [  169.771342] Unable to handle kernel NULL pointer dereference at 0000000000000088 RIP:
> [  169.804240]  [<ffffffff8022de57>] profile_tick+0x34/0x6a
> [  169.851061] PGD 0
> [  169.863259] Oops: 0000 [1] SMP
> [  169.882391] CPU 0
> [  169.894607] Modules linked in:
> [  169.913117] Pid: 1, comm: swapper Not tainted 2.6.19-rc1mx #154
> [  169.948655] RIP: 0010:[<ffffffff8022de57>]  [<ffffffff8022de57>] profile_tick+0x34/0x6a
> [  169.996876] RSP: 0000:ffffffff808d8f78  EFLAGS: 00010046
> [  170.028766] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [  170.071615] RDX: ffff8100893f5f00 RSI: 0000000000000000 RDI: 0000000000000001
> [  170.114451] RBP: ffffffff808d8f88 R08: 0000000000000002 R09: ffffffff8022d24a
> [  170.157290] R10: ffffffff8022d24a R11: ffffffff80732780 R12: 0000000000000001
> [  170.200134] R13: ffffffff808e8d75 R14: 0000000000000246 R15: 0000000000000012
> [  170.242975] FS:  0000000000000000(0000) GS:ffffffff8085e000(0000) knlGS:0000000000000000
> [  170.291576] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
> [  170.326102] CR2: 0000000000000088 CR3: 0000000000201000 CR4: 00000000000006e0
> [  170.368947] Process swapper (pid: 1, threadinfo ffff810197c7c000, task ffff810197c67040)
> [  170.417545] Stack:  ffffffff807327c0 0000000000000012 ffffffff808d8f98 ffffffff8021507e
> [  170.466126] upt+0xe/0x54
> [  170.616246]  [<ffffffff80215108>] smp_apic_timer_interrupt+0x44/0x4b
> [  170.654411]  [<ffffffff8020a4fb>] apic_timer_interrupt+0x6b/0x70
> [  170.690486]  <EOI>  [<ffffffff8022d24a>] release_console_sem+0x47/0x200
> [  170.730351]  [<ffffffff8022d24a>] release_console_sem+0x47/0x200
> [  18a65>] atomic_notifier_chain_register+0x33/0x3e
> [  170.937284]  [<ffffffff808a3359>] spawn_softlockup_task+0x6a/0x6f
> [  170.973876]  [<ffffffff80207116>] init+0xce/0x30c
> [  171.002152]  [<ffffffff805afd00>] trace_hardirqs_on_thunk+0x35/0x37
> [  171.039796]  [<ffffffff80244952>] trace_hardirqs_on+0xf6/0x11a
> [  171.074833]  [<ffffffff8020a6e5>] child_rip+0xa/0x15
> [  171.104669]  [<ffffffff805b0484>] _spin_unlock_irq+0x29/0x2f
> [  171.138667]  [<ffffffff80209e5d>] restore_args+0x0/0x30
> [  171.170065]  [<ffffffff80207048>] init+0x0/0x30c
> [  171.197821]  [<ffffffff8020a6db>] child_rip+0x0/0x15

Hmm, I'm seeing a similar boot-up panic on my x86_64 box.  Here's the
boot-output (.config is attached).

[    0.000000] Linux version 2.6.19-rc1 (root@spe) (gcc version 4.1.0 (SUSE Linux)) #5 SMP Fri Oct 6 09:49:14 PDT 2006
[    0.000000] Command line: root=/dev/sda2 vga=1 resume=/dev/sda1 console=ttyS0,115200 console=tty0 nmi_watchdog=1
[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009b800 (usable)
[    0.000000]  BIOS-e820: 000000000009b800 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000d0000 - 0000000000100000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 000000003ff10000 (usable)
[    0.000000]  BIOS-e820: 000000003ff10000 - 000000003ff17000 (ACPI data)
[    0.000000]  BIOS-e820: 000000003ff17000 - 000000003ff80000 (ACPI NVS)
[    0.000000]  BIOS-e820: 000000003ff80000 - 0000000040000000 (reserved)
[    0.000000]  BIOS-e820: 00000000e0000000 - 00000000e8000000 (reserved)
[    0.000000]  BIOS-e820: 00000000fec00000 - 00000000fec00400 (reserved)
[    0.000000]  BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
[    0.000000]  BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
[    0.000000] end_pfn_map = 1048576
[    0.000000] DMI present.
[    0.000000] Zone PFN ranges:
[    0.000000]   DMA             0 ->     4096
[    0.000000]   DMA32        4096 ->  1048576
[    0.000000]   Normal    1048576 ->  1048576
[    0.000000] early_node_map[2] active PFN ranges
[    0.000000]     0:        0 ->      155
[    0.000000]     0:      256 ->   261904
[    0.000000] Nvidia board detected. Ignoring ACPI timer override.
[    0.000000] ACPI: PM-Timer IO Port: 0x8008
[    0.000000] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
[    0.000000] Processor #0 (Bootup-CPU)
[    0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
[    0.000000] Processor #1
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
[    0.000000] ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
[    0.000000] IOAPIC[0]: apic_id 2, address 0xfec00000, GSI 0-23
[    0.000000] ACPI: IOAPIC (id[0x03] address[0xdf300000] gsi_base[24])
[    0.000000] IOAPIC[1]: apic_id 3, address 0xdf300000, GSI 24-27
[    0.000000] ACPI: IOAPIC (id[0x04] address[0xdf301000] gsi_base[28])
[    0.000000] IOAPIC[2]: apic_id 4, address 0xdf301000, GSI 28-31
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
[    0.000000] ACPI: BIOS IRQ0 pin2 override ignored.
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
[    0.000000] Setting APIC routing to flat
[    0.000000] Using ACPI (MADT) for SMP configuration information
[    0.000000] Nosave address range: 000000000009b000 - 000000000009c000
[    0.000000] Nosave address range: 000000000009c000 - 00000000000a0000
[    0.000000] Nosave address range: 00000000000a0000 - 00000000000d0000
[    0.000000] Nosave address range: 00000000000d0000 - 0000000000100000
[    0.000000] Allocating PCI resources starting at 50000000 (gap: 40000000:a0000000)
[    0.000000] PERCPU: Allocating 32512 bytes of per cpu data
[    0.000000] Built 1 zonelists.  Total pages: 256711
[    0.000000] Kernel command line: root=/dev/sda2 vga=1 resume=/dev/sda1 console=ttyS0,115200 console=tty0 nmi_watchdog=1
[    0.000000] Initializing CPU#0
[    0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes)
[   26.923435] Console: colour VGA+ 80x50
[   27.230350] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes)
[   27.238291] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes)
[   27.245305] Checking aperture...
[   27.248584] CPU 0: aperture @ 0 size 32 MB
[   27.255250] No AGP bridge found
[   27.276817] Memory: 1024988k/1047616k available (2405k kernel code, 22060k reserved, 955k data, 224k init)
[   27.366087] Calibrating delay using timer specific routine.. 4423.17 BogoMIPS (lpj=8846344)
[   27.374753] Mount-cache hash table entries: 256
[   27.379624] CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
[   27.386804] CPU: L2 Cache: 1024K (64 bytes/line)
[   27.391486] Freeing SMP alternatives: 24k freed
[   27.396100] ACPI: Core revision 20060707
[   27.445859] activating NMI Watchdog ... done.
[   27.450314] Using local APIC timer interrupts.
[   27.500022] result 12557931
[   27.502863] Detected 12.557 MHz APIC timer.
[   27.510539] Booting processor 1/2 APIC 0x1
[   27.514684] Unable to handle kernel NULL pointer dereference at 0000000000000088 RIP: 
[   27.520204]  [<ffffffff80225fb0>] profile_tick+0x40/0x90
[   27.528118] PGD 0 
[   27.530222] Oops: 0000 [1] SMP 
[   27.533505] CPU 0 
[   27.535610] Modules linked in:
[   27.538755] Pid: 1, comm: swapper Not tainted 2.6.19-rc1 #5
[   27.544367] RIP: 0010:[<ffffffff80225fb0>]  [<ffffffff80225fb0>] profile_tick+0x40/0x90
[   27.552483] RSP: 0000:ffffffff8059ff78  EFLAGS: 00010046
[   27.557842] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000001
[   27.565024] RDX: ffff810081a77f40 RSI: 0000000000000046 RDI: 0000000000000001
[   27.572203] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000007
[   27.579383] R10: 0000000000000002 R11: ffffffff8032c8a0 R12: ffffffff805ae6a1
[   27.586563] R13: 0000000000000012 R14: 0000000000000031 R15: 0000000000000246
[   27.593744] FS:  0000000000000000(0000) GS:ffffffff80549000(0000) knlGS:0000000000000000
[   27.601892] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[   27.607678] CR2: 0000000000000088 CR3: 0000000000201000 CR4: 00000000000006e0
[   27.614859] Process swapper (pid: 1, threadinfo ffff8100021fe000, task ffff8100021ee740)
[   27.623008] Stack:  ffffffff8059a6e0 ffffffff804e2700 ffff8100021ffb30 ffffffff80215c7e
[   27.631313]  0000000000bf9e6b ffffffff802161b5 0000000000000000 ffffffff8020a6f6
[   27.638953]  ffff8100021ffb30 <EOI>  0000000000000000 ffffffff8032c8a0 0000000000000002
[   27.647016] Call Trace:
[   27.649745]  <IRQ>  [<ffffffff80215c7e>] smp_local_timer_interrupt+0xe/0x60
[   27.656803]  [<ffffffff802161b5>] smp_apic_timer_interrupt+0x35/0x40
[   27.663205]  [<ffffffff8020a6f6>] apic_timer_interrupt+0x66/0x70
[   27.669256]  <EOI>  [<ffffffff8032c8a0>] vgacon_cursor+0x0/0x1c8
[   27.675364]  [<ffffffff802252fd>] vprintk+0x2fd/0x350
[   27.680462]  [<ffffffff805755a5>] init_idle+0x95/0xb0
[   27.685558]  [<ffffffff8022539e>] printk+0x4e/0x60
[   27.690391]  [<ffffffff8021ce7f>] complete+0x3f/0x60
[   27.695396]  [<ffffffff8056ed8f>] __cpu_up+0x40f/0x7d0
[   27.700575]  [<ffffffff80215a70>] do_fork_idle+0x0/0x20
[   27.705843]  [<ffffffff804573cf>] __mutex_lock_slowpath+0x1df/0x1f0
[   27.712157]  [<ffffffff802407f2>] cpu_up+0xa2/0x120
[   27.717082]  [<ffffffff802070bb>] init+0x9b/0x330
[   27.721830]  [<ffffffff804585d9>] _spin_unlock_irq+0x9/0x10
[   27.727451]  [<ffffffff802209bc>] schedule_tail+0x4c/0xc0
[   27.732897]  [<ffffffff8020a8e5>] child_rip+0xa/0x15
[   27.737906]  [<ffffffff8033171e>] acpi_ds_init_one_object+0x0/0x82
[   27.744131]  [<ffffffff80207020>] init+0x0/0x330
[   27.748791]  [<ffffffff8020a8db>] child_rip+0x0/0x15
[   27.753795] 
[   27.755327] 
[   27.755328] Code: f6 83 88 00 00 00 03 75 37 65 8b 04 25 24 00 00 00 0f a3 05 
[   27.765157] RIP  [<ffffffff80225fb0>] profile_tick+0x40/0x90
[   27.770908]  RSP <ffffffff8059ff78>
[   27.774442] CR2: 0000000000000088
[   27.777803]  <1>Unable to handle kernel NULL pointer dereference at 0000000000000088 RIP: 
[   27.783718]  [<ffffffff80225fb0>] profile_tick+0x40/0x90
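
A note on reading the fault address above: CR2 is 0x88 and RBX is 0, and the
first bytes of the Code: line (f6 83 88 00 00 00 03) decode as a byte-wide
TEST of the immediate 3 against memory at 0x88(%rbx). The sketch below checks
which field sits at offset 0x88 if the NULL pointer is a struct pt_regs. The
field order used here is an assumption about the x86_64 layout of that era and
is not taken from this mail; the helper names are hypothetical, and the
decoder handles only this one instruction form.

```python
import ctypes

# ASSUMPTION: x86_64 struct pt_regs of that era is 21 consecutive 64-bit
# slots in this order. This list is not taken from the mail.
PT_REGS_FIELDS = [
    "r15", "r14", "r13", "r12", "rbp", "rbx", "r11", "r10", "r9", "r8",
    "rax", "rcx", "rdx", "rsi", "rdi", "orig_rax", "rip", "cs", "eflags",
    "rsp", "ss",
]


class PtRegs(ctypes.Structure):
    _fields_ = [(name, ctypes.c_uint64) for name in PT_REGS_FIELDS]


def field_at(offset):
    """Return the name of the field that starts at this byte offset, or None."""
    for name in PT_REGS_FIELDS:
        if getattr(PtRegs, name).offset == offset:
            return name
    return None


def decode_test_imm8(code):
    """Decode 'F6 /0 ib' (TEST r/m8, imm8) with a disp32 operand and no SIB.

    Returns (base register number, displacement, immediate).
    """
    opcode, modrm = code[0], code[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0xF6 or reg != 0 or mod != 2 or rm == 4:
        raise ValueError("not the instruction form handled by this sketch")
    disp = int.from_bytes(code[2:6], "little", signed=True)
    return rm, disp, code[6]


# Leading bytes of the Code: line in the oops above.
code = bytes.fromhex("f6 83 88 00 00 00 03")
base, disp, imm = decode_test_imm8(code)

print("base register number:", base)      # 3 is rbx in the x86 encoding
print("displacement:", hex(disp))
print("immediate:", imm)
print("pt_regs field at that offset:", field_at(disp))
```

Under that assumed layout the offset lands on the saved cs, which would be
consistent with a test of its low two bits (a user-mode check) on a NULL regs
pointer, and with the quoted observation that get_irq_regs() returns NULL.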

Going to bisect now...  Again, not sure if it's related to the irq
code.
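
For readers unfamiliar with bisecting, the idea can be sketched as a binary
search over an ordered list of commits. This is a minimal illustration only:
first_bad, the commit names, and the is_bad predicate are hypothetical, and
it assumes a linear history with a single good-to-bad transition, which a
real bisection tool does not require.

```python
def first_bad(commits, is_bad):
    """Return the first bad commit.

    commits is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. is_bad(commit) stands in for "build, boot,
    and see whether it oopses".
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid              # the regression is at mid or earlier
        else:
            lo = mid              # the regression is after mid
    return commits[hi]


# Hypothetical history c0..c9 in which everything from c6 on fails to boot.
history = [f"c{i}" for i in range(10)]
tested = []


def boots_badly(commit):
    tested.append(commit)         # record each test boot
    return int(commit[1:]) >= 6


culprit = first_bad(history, boots_badly)
print(culprit, "found after testing", tested)
```

Each step halves the remaining range, so the number of test boots grows with
the logarithm of the number of commits between the good and bad endpoints.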

View attachment "conf2618" of type "text/plain" (24847 bytes)
