Message-Id: <1221796156.4019.204.camel@sys05.in.vpac.org>
Date:	Fri, 19 Sep 2008 13:49:16 +1000
From:	Brett Pemberton <brett@...c.org>
To:	linux-kernel@...r.kernel.org
Subject: BUG: soft lockup in 2.6.25.5

Hey,

About 3-5 machines per week in a cluster of 95 are hanging with:

BUG: soft lockup - CPU#7 stuck for 61s! [pdflush:321]

There is nothing in common each time: different users running
different jobs on different nodes.

The most recent trace is at the end of this email; the .config is attached.

Googling is scary.  Many people report these, but there is never any
response.  It's happening on enough separate nodes that I can't believe
it's hardware, although they are identical machines:

- 2x Quad-Core AMD Opteron(tm) Processor 2356
- 32 GB RAM
- 4 x SATA drives

The nodes run CentOS 5.2 with a kernel.org kernel.
This has been happening with a variety of kernels from 2.6.25 to the
present; the trace below is from 2.6.26.5.

I'd love any advice on where to turn next and what avenues to pursue.
Please CC me, as I am not subscribed.

thanks,

	/ Brett

BUG: soft lockup - CPU#7 stuck for 61s! [pdflush:321]
Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler autofs4 nfs
lockd nfs_acl sunrpc ipv6 ib_ipoib rdma_ucm ib_ucm ib_uverbs
ib_umad rdma_cm ib_cm iw_cm ib_addr ib_sa xfs dm_mirror dm_log
dm_multipath dm_mod sbs sbshc battery backlight ac ib_mthca sg
serio_raw button ib_mad ib_core forcedeth usb_storage sata_nv libata
raid0 sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU 7:
Modules linked in: ipmi_devintf ipmi_si ipmi_msghandler autofs4 nfs
lockd nfs_acl sunrpc ipv6 ib_ipoib rdma_ucm ib_ucm ib_uverbs
ib_umad rdma_cm ib_cm iw_cm ib_addr ib_sa xfs dm_mirror dm_log
dm_multipath dm_mod sbs sbshc battery backlight ac ib_mthca sg
serio_raw button ib_mad ib_core forcedeth usb_storage sata_nv libata
raid0 sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 321, comm: pdflush Not tainted 2.6.26.5 #3
RIP: 0010:[<ffffffff80218c45>] [<ffffffff80218c45>] native_flush_tlb_others+0x61/0x84
RSP: 0000:ffff81041d699b70 EFLAGS: 00000202
RAX: 00000000000008f7 RBX: ffff81042e233600 RCX: 00000000007fec47
RDX: 00000000000008f7 RSI: 00000000000000f7 RDI: 0000000000000020
RBP: ffff81080e887cc0 R08: ffff81041d699b88 R09: 000000000000003a
R10: 0000000000000002 R11: ffff81080858dd80 R12: ffff81041e4562c0
R13: 000000000000003c R14: ffff81081e68c800 R15: ffffffff802ef47a
FS: 00002b16b0eb3630(0000) GS:ffff81081e63a540(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000000006c8bc8 CR3: 0000000000201000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400

Call Trace:
[<ffffffff80218d31>] ? flush_tlb_page+0x5f/0x66
[<ffffffff8027255e>] ? page_mkclean+0xfb/0x14d
[<ffffffff80262b61>] ? clear_page_dirty_for_io+0x4e/0xb2
[<ffffffff80262e8f>] ? write_cache_pages+0x165/0x2b7
[<ffffffff80262a2d>] ? __writepage+0x0/0x23
[<ffffffff8026301d>] ? do_writepages+0x20/0x2d
[<ffffffff8029fb25>] ? __writeback_single_inode+0x147/0x277
[<ffffffff802248e3>] ? update_curr+0x5d/0x88
[<ffffffff80226580>] ? dequeue_entity+0x1b/0xed
[<ffffffff8029ffb0>] ? sync_sb_inodes+0x1a1/0x26f
[<ffffffff802a03b4>] ? writeback_inodes+0x62/0xb3
[<ffffffff8026394c>] ? wb_kupdate+0x9e/0x108
[<ffffffff80263cfa>] ? pdflush+0x0/0x1bc
[<ffffffff80263e14>] ? pdflush+0x11a/0x1bc
[<ffffffff802638ae>] ? wb_kupdate+0x0/0x108
[<ffffffff802417c9>] ? kthread+0x47/0x76
[<ffffffff8022be11>] ? schedule_tail+0x28/0x5c
[<ffffffff8020cba8>] ? child_rip+0xa/0x12
[<ffffffff8021d2c3>] ? flat_send_IPI_mask+0x0/0x4c
[<ffffffff80241782>] ? kthread+0x0/0x76
[<ffffffff8020cb9e>] ? child_rip+0x0/0x12
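
One way to check whether the hangs really have nothing in common is to
compare the traces from each node. The following is a minimal sketch, not
a tested tool: it assumes each report has been saved as plain text in the
same format as the trace above, and the function names (parse_lockup,
tally_top_frames) are hypothetical. It extracts the CPU, task and call
trace symbols from a report, then counts how often the same leading frames
appear across reports.

```python
import re
from collections import Counter

# Header line, e.g. "BUG: soft lockup - CPU#7 stuck for 61s! [pdflush:321]"
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]+?):(?P<pid>\d+)\]"
)

# Call trace frame, e.g. "[<ffffffff80218d31>] ? flush_tlb_page+0x5f/0x66"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?(?P<sym>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def parse_lockup(text):
    """Return CPU, duration, task and trace symbols, or None if no lockup."""
    header = LOCKUP_RE.search(text)
    if header is None:
        return None
    # Only look at frames after "Call Trace:" so the RIP line is skipped.
    trace = text.partition("Call Trace:")[2]
    frames = [m.group("sym") for m in FRAME_RE.finditer(trace)]
    return {
        "cpu": int(header.group("cpu")),
        "secs": int(header.group("secs")),
        "comm": header.group("comm"),
        "pid": int(header.group("pid")),
        "frames": frames,
    }


def tally_top_frames(reports, depth=3):
    """Count how many reports share the same first `depth` trace symbols."""
    counts = Counter()
    for text in reports:
        info = parse_lockup(text)
        if info is not None:
            counts[tuple(info["frames"][:depth])] += 1
    return counts
```

If most reports share the same leading frames (here flush_tlb_page,
page_mkclean, clear_page_dirty_for_io), that points at a common code path
rather than at per-node hardware faults.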

-- 
Brett Pemberton - VPAC Senior Systems Administrator
http://www.vpac.org/ - (03) 9925 4899

View attachment "config" of type "text/plain" (42820 bytes)

Download attachment "signature.asc" of type "application/pgp-signature" (198 bytes)
