Message-ID: <20150408183333.493607a3@goldlack.schiller106>
Date:	Wed, 8 Apr 2015 18:33:33 +0200
From:	Torsten Luettgert <ml-lkml@...a.eu>
To:	linux-kernel@...r.kernel.org
Cc:	Christoph Hellwig <hch@....de>
Subject: BUG: unable to handle kernel NULL pointer deref, bisected to
 746650160

Hello,

I have been getting NULL pointer dereference BUGs on one of my
Supermicro machines since 3.17. They occur at random uptimes, often a
few hours after booting (the longest uptime so far was 2 days).

I bisected the problem (took a while); the problematic commit seems
to be 746650160866 (scsi: convert host_busy to atomic_t) by
Christoph Hellwig.

Here's one of the logs (it's always the same trace):

BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
IP: [<ffffffff8133af60>] swiotlb_unmap_sg_attrs+0x30/0x80
PGD 0
Oops: 0000 [#1] SMP
Modules linked in: iTCO_wdt iTCO_vendor_support lpc_ich mfd_core usb_storage
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-74665016086615bb+ #1
Hardware name: Supermicro X8DTT/X8DTT, BIOS 080016  10/05/2010
task: ffffffff81c16480 ti: ffffffff81c00000 task.ti: ffffffff81c00000
RIP: 0010:[<ffffffff8133af60>]  [<ffffffff8133af60>] swiotlb_unmap_sg_attrs+0x30/0x80
RSP: 0018:ffff88063fc03e08  EFLAGS: 00010002
RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000002
RDX: 0000000000000000 RSI: 000000090e2ef000 RDI: ffff880c14e61a00
RBP: ffff88063fc03e38 R08: 0000000000000000 R09: ffff8806209cc098
R10: ffff88063f400120 R11: 0000000000001268 R12: 0000000000000002
R13: 0000000000000002 R14: ffff8806209cc098 R15: ffff880c200fcc70
FS:  0000000000000000(0000) GS:ffff88063fc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000010 CR3: 0000000001c11000 CR4: 00000000000027e0
Stack:
 0000000000000094 0000000000000094 ffff880c200f8718 0000000000000094
 0000000000000094 0000000000000094 ffff88063fc03e48 ffffffff8146a0b4
 ffff88063fc03e88 ffffffff81477c1d ffff88063fc03e78 ffff880c213a57c0
Call Trace:
 <IRQ> 
 [<ffffffff8146a0b4>] scsi_dma_unmap+0x54/0x70
 [<ffffffff81477c1d>] twl_interrupt+0x26d/0x420
 [<ffffffff810fe2fd>] handle_irq_event_percpu+0x5d/0x1c0
 [<ffffffff810fe4a2>] handle_irq_event+0x42/0x70
 [<ffffffff8110165b>] handle_fasteoi_irq+0x5b/0x100
 [<ffffffff81053fdc>] handle_irq+0x5c/0x150
 [<ffffffff810c8f72>] ? __atomic_notifier_call_chain+0x12/0x20
 [<ffffffff810c8f96>] ? atomic_notifier_call_chain+0x16/0x20
 [<ffffffff81776f6e>] do_IRQ+0x5e/0x110
 [<ffffffff817754ea>] common_interrupt+0x6a/0x6a
 <EOI> 
 [<ffffffff815de8c3>] ? cpuidle_enter_state+0x53/0xd0
 [<ffffffff815de8bf>] ? cpuidle_enter_state+0x4f/0xd0
 [<ffffffff815de957>] cpuidle_enter+0x17/0x20
 [<ffffffff810e95a4>] cpuidle_idle_call+0xc4/0x250
 [<ffffffff810e9855>] cpu_idle_loop+0x125/0x1d0
 [<ffffffff810e9913>] cpu_startup_entry+0x13/0x20
 [<ffffffff81769597>] rest_init+0x77/0x80
 [<ffffffff81d74344>] start_kernel+0x39a/0x3a1
 [<ffffffff81d73dc8>] ? set_init_arg+0x5d/0x5d
 [<ffffffff8176f1ad>] ? memblock_reserve+0x4c/0x51
 [<ffffffff81d735ad>] x86_64_start_reservations+0x2a/0x2c
 [<ffffffff81d736f0>] x86_64_start_kernel+0x141/0x148
Code: 56 49 89 fe 41 55 41 89 cd 41 54 41 89 d4 53 48 83 ec 10 83 f9 03 74 5e 31 db 85 d2 48 89 f0 7e 48 66 2e 0f 1f 84 00 00 00 00 00 <48> 8b 70 10 48 3b 35 d5 16 e0 00 8b 50 18 72 1e 48 3b 35 d1 16
RIP  [<ffffffff8133af60>] swiotlb_unmap_sg_attrs+0x30/0x80
 RSP <ffff88063fc03e08>
CR2: 0000000000000010
---[ end trace 4e21be7f8b16aadd ]---
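
As a reading aid for the trace, here is a minimal sketch that decodes the
faulting instruction bytes marked in the Code: line (<48> 8b 70 10) and
combines them with RAX from the register dump. The helper name
decode_mov_load is hypothetical, and it only handles the single encoding
seen here (REX.W, opcode 8B, 8-bit displacement, no SIB byte); it is not
a general disassembler. The result is a load from offset 0x10 of the
pointer in RAX. With RAX = 0 that gives the reported CR2 of 0x10. Reading
this as a NULL scatterlist entry is an assumption, not something the
trace proves on its own.

```python
# Register names indexed by the 3-bit ModRM fields (no REX.R/REX.B extension).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def decode_mov_load(code):
    """Decode 'REX.W 8B modrm disp8', i.e. mov disp8(%base),%dst.

    Hypothetical helper: handles only this one form and rejects the rest.
    """
    rex, opcode, modrm, disp = code[:4]
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("not a plain 64-bit mov load")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:
        raise ValueError("only the disp8, no-SIB form is handled")
    if disp >= 0x80:  # disp8 is a signed byte
        disp -= 0x100
    return REG64[reg], REG64[rm], disp


# Bytes at the faulting RIP, taken from the Code: line of the oops.
faulting = bytes.fromhex("488b7010")
dst, base, disp = decode_mov_load(faulting)

rax = 0x0  # RAX value from the register dump
fault_addr = rax + disp

print(f"mov {disp:#x}(%{base}),%{dst} -> faulting address {fault_addr:#018x}")
```

The computed address can be compared directly with the CR2 value in the
oops.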

The same problem was reported by Kui Zhang last October with the
subject "3.17.0-rc7 kernel NULL pointer dereference (3ware 9650SE)".
Regrettably (for me), nobody replied.

We have a 3ware controller, too, but ours is a 9750. Controller
firmware and BIOS are current.

Any help with this is greatly appreciated.

Regards,
Torsten
