Message-ID: <20110428082625.GA23293@pcnci.linuxbox.cz>
Date:	Thu, 28 Apr 2011 10:26:25 +0200
From:	Nikola Ciprich <nikola.ciprich@...uxbox.cz>
To:	linux-kernel mlist <linux-kernel@...r.kernel.org>
Cc:	linux-stable mlist <stable@...nel.org>
Subject: 2.6.32.21 - uptime related crashes?

Hello everybody,

I'm trying to track down a strange issue: today my fourth machine running 2.6.32.21 crashed. Apart from the same kernel version, what the cases have in common is that all boxes had very similar uptimes: 214, 216, 216, and 224 days. This might just be a coincidence, but I think it might be important.
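
One way to test whether the uptimes are more than a coincidence is to compare them with the wrap-around points of time counters. The following is only a rough sketch: the counter widths and HZ values are illustrative assumptions, not anything confirmed for these boxes, and the 54-bit entry is a purely hypothetical case (a 64-bit nanosecond value that loses 10 bits to a fixed-point shift).

```python
# Rough check: after how many days would various time counters wrap?
# All candidates below are hypothetical examples, not a diagnosis.

SECONDS_PER_DAY = 86400
UPTIMES_DAYS = [214, 216, 216, 224]  # uptimes of the four crashed boxes


def days_until_wrap(bits, ticks_per_second):
    """Days until a counter `bits` wide, incremented `ticks_per_second`
    times per second, reaches 2**bits."""
    return (2 ** bits) / ticks_per_second / SECONDS_PER_DAY


# (label, counter width in bits, ticks per second) -- assumed candidates
CANDIDATES = [
    ("32-bit jiffies, HZ=100", 32, 100),
    ("32-bit jiffies, HZ=250", 32, 250),
    ("32-bit jiffies, HZ=1000", 32, 1000),
    ("54-bit nanoseconds (hypothetical)", 54, 10 ** 9),
    ("64-bit nanoseconds", 64, 10 ** 9),
]


def report():
    """Return (label, days until wrap) for every candidate counter."""
    return [(label, days_until_wrap(bits, rate))
            for label, bits, rate in CANDIDATES]


if __name__ == "__main__":
    for label, days in report():
        print(f"{label:36s} wraps after {days:12.1f} days")
    print("observed uptimes (days):", UPTIMES_DAYS)
```

A candidate whose wrap point lies shortly before the observed uptimes would be worth a closer look; one that lies far away can be ruled out.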

Unfortunately I only have backtraces of two of the crashes (and those are trimmed, sorry), and they do not look as similar as I'd like, but maybe there is still something in common:

[<ffffffff81120cc7>] pollwake+0x57/0x60 
[<ffffffff81046720>] ? default_wake_function+0x0/0x10 
[<ffffffff8103683a>] __wake_up_common+0x5a/0x90 
[<ffffffff8103a313>] __wake_up+0x43/0x70 
[<ffffffffa0321573>] process_masterspan+0x643/0x670 [dahdi] 
[<ffffffffa0326595>] coretimer_func+0x135/0x1d0 [dahdi] 
[<ffffffff8105d74d>] run_timer_softirq+0x15d/0x320 
[<ffffffffa0326460>] ? coretimer_func+0x0/0x1d0 [dahdi] 
[<ffffffff8105690c>] __do_softirq+0xcc/0x220 
[<ffffffff8100c40c>] call_softirq+0x1c/0x30 
[<ffffffff8100e3ba>] do_softirq+0x4a/0x80 
[<ffffffff810567c7>] irq_exit+0x87/0x90 
[<ffffffff8100d7b7>] do_IRQ+0x77/0xf0 
[<ffffffff8100bc53>] ret_from_intr+0x0/0xa
<EOI> [<ffffffffa019e556>] ? acpi_idle_enter_bm+0x273/0x2a1 [processor]
[<ffffffffa019e54c>] ? acpi_idle_enter_bm+0x269/0x2a1 [processor] 
[<ffffffff81280095>] ? cpuidle_idle_call+0xa5/0x150 
[<ffffffff8100a18f>] ? cpu_idle+0x4f/0x90 
[<ffffffff81323c95>] ? rest_init+0x75/0x80 
[<ffffffff81582d7f>] ? start_kernel+0x2ef/0x390 
[<ffffffff81582271>] ? x86_64_start_reservations+0x81/0xc0 
[<ffffffff81582386>] ? x86_64_start_kernel+0xd6/0x100 

This box (actually two of the crashed ones) uses the dahdi_dummy module to generate timing for the Asterisk software PBX, so maybe the crash is related to that.


[<ffffffff810a5063>] handle_IRQ_event+0x63/0x1c0
[<ffffffff810a71ae>] handle_edge_irq+0xce/0x160
[<ffffffff8100e1bf>] handle_irq+0x1f/0x30
[<ffffffff8100d7ae>] do_IRQ+0x6e/0xf0
[<ffffffff8100bc53>] ret_from_intr+0x0/0xa
<EOI> [<ffffffff8133?f?f>] ? _spin_unlock_irq+0xf/0x40
[<ffffffff81337f79>] ? _spin_unlock_irq+0x9/0x40
[<ffffffff81064b9a>] ? exit_signals+0x8a/0x130
[<ffffffff8105372e>] ? do_exit+0x7e/0x7d0
[<ffffffff8100f8a7>] ? oops_end+0xa7/0xb0
[<ffffffff8100faa6>] ? die+0x56/0x90
[<ffffffff8100c810>] ? do_trap+0x130/0x150
[<ffffffff8100ccca>] ? do_divide_error+0x8a/0xa0
[<ffffffff8103d227>] ? find_busiest_group+0x3d7/0xa00
[<ffffffff8104400b>] ? cpuacct_charge+0x6b/0x90
[<ffffffff8100c045>] ? divide_error+0x15/0x20
[<ffffffff8103d227>] ? find_busiest_group+0x3d7/0xa00
[<ffffffff8103cfff>] ? find_busiest_group+0x1af/0xa00
[<ffffffff81335483>] ? thread_return+0x4ce/0x7bb
[<ffffffff8133bec5>] ? do_nanosleep+0x75/0x30
[<ffffffff810?1?4e>] ? hrtimer_nanosleep+0x9e/0x120
[<ffffffff810?08f0>] ? hrtimer_wakeup+0x0/0x30
[<ffffffff810?183f>] ? sys_nanosleep+0x6f/0x80

The other two boxes don't use dahdi_dummy. The only similarity I see is that both crashes seem to be related to IRQ handling; otherwise the two traces have nothing in common.
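
The comparison of the two traces can also be done mechanically. A minimal sketch, assuming the frames follow the usual `[<address>] symbol+0xOFFSET/0xSIZE` layout; the names `FRAME_RE`, `symbols`, `TRACE_A` and `TRACE_B` are made up for this example, and only a few frames from each trace above are included:

```python
import re

# One frame looks like "[<addr>] symbol+0xOFF/0xLEN", optionally with "? "
# before the symbol. '?' is allowed inside the address because some digits
# in the transcribed traces are unreadable.
FRAME_RE = re.compile(r"\[<[0-9a-fA-F?]+>\]\s+(?:\?\s+)?([A-Za-z_]\w*)\+0x")


def symbols(trace):
    """Return the set of function names that appear in a backtrace."""
    return set(FRAME_RE.findall(trace))


# Excerpt of the first trace (the box running dahdi_dummy)
TRACE_A = """
[<ffffffff81120cc7>] pollwake+0x57/0x60
[<ffffffff8105d74d>] run_timer_softirq+0x15d/0x320
[<ffffffff8105690c>] __do_softirq+0xcc/0x220
[<ffffffff810567c7>] irq_exit+0x87/0x90
[<ffffffff8100d7b7>] do_IRQ+0x77/0xf0
[<ffffffff8100bc53>] ret_from_intr+0x0/0xa
"""

# Excerpt of the second trace (divide error in find_busiest_group)
TRACE_B = """
[<ffffffff810a5063>] handle_IRQ_event+0x63/0x1c0
[<ffffffff810a71ae>] handle_edge_irq+0xce/0x160
[<ffffffff8100e1bf>] handle_irq+0x1f/0x30
[<ffffffff8100d7ae>] do_IRQ+0x6e/0xf0
[<ffffffff8100bc53>] ret_from_intr+0x0/0xa
[<ffffffff8100ccca>] ? do_divide_error+0x8a/0xa0
[<ffffffff8103d227>] ? find_busiest_group+0x3d7/0xa00
"""

# Function names present in both excerpts
common = symbols(TRACE_A) & symbols(TRACE_B)

if __name__ == "__main__":
    print(sorted(common))
```

For these excerpts the only shared frames are the generic interrupt entry points do_IRQ and ret_from_intr, which matches the observation that the traces overlap only in IRQ handling.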
Does anybody have an idea where I should look? Of course I should update all those boxes to (at least) the latest 2.6.32.x, and I certainly will, but I'd first like to know what the problem was and whether it has been fixed, or how to fix it.
I'd be grateful for any help.
BR
nik


-- 
-------------------------------------
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:   +420 596 603 142
fax:    +420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: servis@...uxbox.cz
-------------------------------------
