Message-Id: <E109A827-1A6B-40AB-A0DE-CB713793E114@its.uq.edu.au>
Date:	Mon, 16 Jun 2008 09:07:09 +1000
From:	Christian Unger <c.unger@....uq.edu.au>
To:	linux-kernel@...r.kernel.org
Subject: kernel:<3>Bug: soft lockup - CPU#3 stuck for 10s! [bond1:<pid>]

Hello there

I'm seeing two seemingly related issues, both reported as a soft
lockup on CPU#3. Initially I didn't worry about them, because the two
hosts affected were having other issues, but now that those other
issues are resolved the lockups have gotten worse (most likely only
because the cluster is failing for other reasons). On the off chance
that the previous issue (which also involved bonding) is relevant, I'll
outline it as well:

System configuration etc:
CentOS 5.2, current, with Cluster Suite. I'm running a stack of RHEL
systems, so I'm updating all of them from the same repos (so there is
the first ugliness). The systems are Dell PowerEdge 1950s with onboard
Broadcom NICs and expansion Intel NICs.

`uname -r` gives: 2.6.18-92.1.1.el5 (which is the current Red Hat
kernel package as of today, but this has been happening on various
kernel versions for months).

The error message indicates that the bnx2 module does not taint the  
kernel, and overall the kernel is not tainted:
`cat /proc/sys/kernel/tainted` gives:
0

The `lspci` output is as follows:
00:00.0 Host bridge: Intel Corporation 5000X Chipset Memory Controller Hub (rev 12)
00:02.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x4 Port 2 (rev 12)
00:03.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x4 Port 3 (rev 12)
00:04.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x8 Port 4-5 (rev 12)
00:05.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x4 Port 5 (rev 12)
00:06.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x8 Port 6-7 (rev 12)
00:07.0 PCI bridge: Intel Corporation 5000 Series Chipset PCI Express x4 Port 7 (rev 12)
00:10.0 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers (rev 12)
00:10.1 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers (rev 12)
00:10.2 Host bridge: Intel Corporation 5000 Series Chipset FSB Registers (rev 12)
00:11.0 Host bridge: Intel Corporation 5000 Series Chipset Reserved Registers (rev 12)
00:13.0 Host bridge: Intel Corporation 5000 Series Chipset Reserved Registers (rev 12)
00:15.0 Host bridge: Intel Corporation 5000 Series Chipset FBD Registers (rev 12)
00:16.0 Host bridge: Intel Corporation 5000 Series Chipset FBD Registers (rev 12)
00:1c.0 PCI bridge: Intel Corporation 631xESB/632xESB/3100 Chipset PCI Express Root Port 1 (rev 09)
00:1d.0 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #1 (rev 09)
00:1d.1 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #2 (rev 09)
00:1d.2 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset UHCI USB Controller #3 (rev 09)
00:1d.7 USB Controller: Intel Corporation 631xESB/632xESB/3100 Chipset EHCI USB2 Controller (rev 09)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d9)
00:1f.0 ISA bridge: Intel Corporation 631xESB/632xESB/3100 Chipset LPC Interface Controller (rev 09)
00:1f.1 IDE interface: Intel Corporation 631xESB/632xESB IDE Controller (rev 09)
01:00.0 PCI bridge: Intel Corporation 80333 Segment-A PCI Express-to-PCI Express Bridge
01:00.2 PCI bridge: Intel Corporation 80333 Segment-B PCI Express-to-PCI Express Bridge
02:0e.0 RAID bus controller: Dell PowerEdge Expandable RAID controller 5
04:00.0 PCI bridge: Broadcom EPB PCI-Express to PCI-X Bridge (rev c3)
05:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
06:00.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Upstream Port (rev 01)
06:00.3 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express to PCI-X Bridge (rev 01)
07:00.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Downstream Port E1 (rev 01)
07:01.0 PCI bridge: Intel Corporation 6311ESB/6321ESB PCI Express Downstream Port E2 (rev 01)
08:00.0 PCI bridge: Broadcom EPB PCI-Express to PCI-X Bridge (rev c3)
09:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
0c:00.0 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 02)
0c:00.1 Fibre Channel: QLogic Corp. ISP2432-based 4Gb Fibre Channel to PCI Express HBA (rev 02)
0e:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
0e:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
10:0d.0 VGA compatible controller: ATI Technologies Inc ES1000 (rev 02)



__Original Problem__

The two cluster nodes could not talk to each other properly, certain
services that are part of the Cluster Suite (especially rgmanager)
were failing to run, and everything was a general mess. The fix came
with switching the bonding mode from mode=0 to mode=1 in
/etc/modprobe.conf:

alias eth0 bnx2
alias eth1 bnx2
alias eth2 e1000
alias eth3 e1000
alias bond0 bonding
options bond0 mode=1 miimon=100 use_carrier=0
alias bond1 bonding
options bond1 mode=1 miimon=100 use_carrier=0
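
One way to confirm that the mode change actually took effect is to read
the status file the bonding driver exposes under /proc/net/bonding/.
The sketch below is only an illustration: it assumes the usual
"Key: value" layout of that file, with field names such as
"Bonding Mode", "MII Status" and "Slave Interface", and the helper
name parse_bonding_status is made up for this example.

```python
import os


def parse_bonding_status(text):
    """Split a /proc/net/bonding/<bond> dump into bond-level and per-slave fields.

    Assumes "Key: value" lines, with each slave section introduced by a
    "Slave Interface: <name>" line (an assumption about the file layout).
    """
    bond, slaves, current = {}, {}, None
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            # Everything after this line belongs to the named slave.
            current = slaves.setdefault(value, {})
        elif current is None:
            bond[key] = value
        else:
            current[key] = value
    return bond, slaves


if __name__ == "__main__":
    path = "/proc/net/bonding/bond1"  # bond name taken from modprobe.conf above
    if os.path.exists(path):
        with open(path) as fh:
            bond, slaves = parse_bonding_status(fh.read())
        print("mode:", bond.get("Bonding Mode"))
        for name, info in slaves.items():
            print(name, "MII status:", info.get("MII Status"))
    else:
        print(path, "not found (is the bonding module loaded?)")
```

With mode=1 the reported mode should be the active-backup one, and each
slave's MII status shows what the miimon poll currently believes about
the link.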


__New Problem__

About two weeks ago I started putting a bit of load on the active
node, and found that after about 3-4 days at least one node will fail.
The errors I get come in two forms: the short kind, which just has a
soft lockup, and the long kind, which also includes an oom-killer
message.

First the short kind:

2008-06-15T04:25:58.134710+10:00 fiction kernel:<3>BUG: soft lockup - CPU#3 stuck for 10s! [bond1:3095]
2008-06-15T04:25:58.134718+10:00 fiction kernel:<4>CPU 3:
2008-06-15T04:25:58.134923+10:00 fiction kernel:<4>Modules linked in: nfsd exportfs lockd nfs_acl auth_rpcgss autofs4 lock_dlm gfs2 dlm configfs sunrpc bonding ip_conntrack_netbios_ns ipt_REJECT ipt_LOG xt_limit xt_tcpudp xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6_tables x_tables dm_round_robin dm_multipath video sbs backlight i2c_ec i2c_core button battery asus_acpi acpi_memhotplug ac parport_pc lp parport joydev ide_cd shpchp sr_mod e1000e cdrom i5000_edac bnx2 edac_mc pcspkr serio_raw sg dm_snapshot dm_zero dm_mirror dm_mod usb_storage qla2xxx scsi_transport_fc ata_piix libata megaraid_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
2008-06-15T04:25:58.134927+10:00 fiction kernel:<4>Pid: 3095, comm: bond1 Not tainted 2.6.18-92.1.1.el5 #1
2008-06-15T04:25:58.135186+10:00 fiction kernel:<4>RIP: 0010:[.text.lock.spinlock+41/48]  [.text.lock.spinlock+41/48] .text.lock.spinlock+0x29/0x30
2008-06-15T04:25:58.135191+10:00 fiction kernel:<4>RSP: 0018:ffff81012c7cdd20  EFLAGS: 00000286
2008-06-15T04:25:58.135195+10:00 fiction kernel:<4>RAX: ffff81012c7cdfd8 RBX: ffff81012b138000 RCX: ffff81012c7cdd80
2008-06-15T04:25:58.135199+10:00 fiction kernel:<4>RDX: 0000000000008948 RSI: ffff81012c7cdd70 RDI: ffff81012b138714
2008-06-15T04:25:58.135203+10:00 fiction kernel:<4>RBP: ffff81012d921a20 R08: ffff81012c7cdd50 R09: 000000000000003d
2008-06-15T04:25:58.135207+10:00 fiction kernel:<4>R10: ffff81012fc5c008 R11: 0000000000000003 R12: 000000000000000c
2008-06-15T04:25:58.135212+10:00 fiction kernel:<4>R13: ffff81012c7cdd70 R14: ffff81012b138000 R15: ffff81012fa64100
2008-06-15T04:25:58.135216+10:00 fiction kernel:<4>FS:  0000000000000000(0000) GS:ffff81010439c640(0000) knlGS:0000000000000000
2008-06-15T04:25:58.135221+10:00 fiction kernel:<4>CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
2008-06-15T04:25:58.135225+10:00 fiction kernel:<4>CR2: 0000000011a4e388 CR3: 0000000000201000 CR4: 00000000000006e0
2008-06-15T04:25:58.135228+10:00 fiction kernel:<4>
2008-06-15T04:25:58.135232+10:00 fiction kernel:<4>Call Trace:
2008-06-15T04:25:58.135457+10:00 fiction kernel:<4> [bnx2:bnx2_ioctl+105/255] :bnx2:bnx2_ioctl+0x69/0xff
2008-06-15T04:25:58.135574+10:00 fiction kernel:<4> [bonding:bond_check_dev_link+211/441] :bonding:bond_check_dev_link+0xd3/0x1b9
2008-06-15T04:25:58.135596+10:00 fiction kernel:<4> [thread_return+0/223] thread_return+0x0/0xdf
2008-06-15T04:25:58.135707+10:00 fiction kernel:<4> [bonding:__bond_mii_monitor+136/1092] :bonding:__bond_mii_monitor+0x88/0x444
2008-06-15T04:25:58.135815+10:00 fiction kernel:<4> [bonding:bond_mii_monitor+0/140] :bonding:bond_mii_monitor+0x0/0x8c
2008-06-15T04:25:58.135922+10:00 fiction kernel:<4> [bonding:bond_mii_monitor+45/140] :bonding:bond_mii_monitor+0x2d/0x8c
2008-06-15T04:25:58.135981+10:00 fiction kernel:<4> [run_workqueue+148/228] run_workqueue+0x94/0xe4
2008-06-15T04:25:58.135999+10:00 fiction kernel:<4> [worker_thread+0/290] worker_thread+0x0/0x122
2008-06-15T04:25:58.136025+10:00 fiction kernel:<4> [keventd_create_kthread+0/196] keventd_create_kthread+0x0/0xc4
2008-06-15T04:25:58.136042+10:00 fiction kernel:<4> [worker_thread+240/290] worker_thread+0xf0/0x122
2008-06-15T04:25:58.136064+10:00 fiction kernel:<4> [<ffffffff8008ad26>] default_wake_function+0x0/0xe
2008-06-15T04:25:58.136088+10:00 fiction kernel:<4> [keventd_create_kthread+0/196] keventd_create_kthread+0x0/0xc4
2008-06-15T04:25:58.136117+10:00 fiction kernel:<4> [keventd_create_kthread+0/196] keventd_create_kthread+0x0/0xc4
2008-06-15T04:25:58.136133+10:00 fiction kernel:<4> [kthread+254/306] kthread+0xfe/0x132
2008-06-15T04:25:58.136151+10:00 fiction kernel:<4> [child_rip+10/17] child_rip+0xa/0x11
2008-06-15T04:25:58.136176+10:00 fiction kernel:<4> [keventd_create_kthread+0/196] keventd_create_kthread+0x0/0xc4
2008-06-15T04:25:58.136191+10:00 fiction kernel:<4> [kthread+0/306] kthread+0x0/0x132
2008-06-15T04:25:58.136209+10:00 fiction kernel:<4> [child_rip+0/17] child_rip+0x0/0x11
2008-06-15T04:25:58.136214+10:00 fiction kernel:<4>


And now a longer one:

2008-06-15T05:04:38.294604+10:00 fiction kernel:<3>BUG: soft lockup - CPU#3 stuck for 10s! [bond1:3095]
2008-06-15T05:04:38.294947+10:00 fiction kernel:<4>CPU 3:
2008-06-15T05:04:38.304072+10:00 fiction kernel:<4>Modules linked in: nfsd exportfs lockd nfs_acl auth_rpcgss autofs4 lock_dlm gfs2 dlm configfs sunrpc bonding ip_conntrack_netbios_ns ipt_REJECT ipt_LOG xt_limit xt_tcpudp xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6_tables x_tables dm_round_robin dm_multipath video sbs backlight i2c_ec i2c_core button battery asus_acpi acpi_memhotplug ac parport_pc lp parport joydev ide_cd shpchp sr_mod e1000e cdrom i5000_edac bnx2 edac_mc pcspkr serio_raw sg dm_snapshot dm_zero dm_mirror dm_mod usb_storage qla2xxx scsi_transport_fc ata_piix libata megaraid_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
2008-06-15T05:04:38.304198+10:00 fiction kernel:<4>Pid: 3095, comm: bond1 Not tainted 2.6.18-92.1.1.el5 #1
2008-06-15T05:04:38.307077+10:00 fiction kernel:<4>RIP: 0010:[.text.lock.spinlock+38/48]  [.text.lock.spinlock+38/48] .text.lock.spinlock+0x26/0x30
2008-06-15T05:04:38.307089+10:00 fiction kernel:<4>RSP: 0018:ffff81012c7cdd20  EFLAGS: 00000286
2008-06-15T05:04:38.307094+10:00 fiction kernel:<4>RAX: ffff81012c7cdfd8 RBX: ffff81012b138000 RCX: ffff81012c7cdd80
2008-06-15T05:04:38.307098+10:00 fiction kernel:<4>RDX: 0000000000008948 RSI: ffff81012c7cdd70 RDI: ffff81012b138714
2008-06-15T05:04:38.307102+10:00 fiction kernel:<4>RBP: ffff81012d921a20 R08: ffff81012c7cdd50 R09: 000000000000003d
2008-06-15T05:04:38.307107+10:00 fiction kernel:<4>R10: ffff81012fc5c008 R11: 0000000000000003 R12: 000000000000000c
2008-06-15T05:04:38.307111+10:00 fiction kernel:<4>R13: ffff81012c7cdd70 R14: ffff81012b138000 R15: ffff81012fa64100
2008-06-15T05:04:38.307115+10:00 fiction kernel:<4>FS:  0000000000000000(0000) GS:ffff81010439c640(0000) knlGS:0000000000000000
2008-06-15T05:04:38.307119+10:00 fiction kernel:<4>CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
2008-06-15T05:04:38.307123+10:00 fiction kernel:<4>CR2: 0000000011a4e388 CR3: 0000000000201000 CR4: 00000000000006e0
2008-06-15T05:04:38.307126+10:00 fiction kernel:<4>
2008-06-15T05:04:38.307130+10:00 fiction kernel:<4>Call Trace:
2008-06-15T05:04:38.310086+10:00 fiction kernel:<4> [bnx2:bnx2_ioctl+105/255] :bnx2:bnx2_ioctl+0x69/0xff
2008-06-15T05:04:38.310229+10:00 fiction kernel:<4> [bonding:bond_check_dev_link+211/441] :bonding:bond_check_dev_link+0xd3/0x1b9
2008-06-15T05:04:38.310260+10:00 fiction kernel:<4> [thread_return+0/223] thread_return+0x0/0xdf
2008-06-15T05:04:38.310369+10:00 fiction kernel:<4> [bonding:__bond_mii_monitor+136/1092] :bonding:__bond_mii_monitor+0x88/0x444
2008-06-15T05:04:38.310476+10:00 fiction kernel:<4> [bonding:bond_mii_monitor+0/140] :bonding:bond_mii_monitor+0x0/0x8c
2008-06-15T05:04:38.310583+10:00 fiction kernel:<4> [bonding:bond_mii_monitor+45/140] :bonding:bond_mii_monitor+0x2d/0x8c
2008-06-15T05:04:38.310725+10:00 fiction kernel:<4> [run_workqueue+148/228] run_workqueue+0x94/0xe4
2008-06-15T05:04:38.310749+10:00 fiction kernel:<4> [worker_thread+0/290] worker_thread+0x0/0x122
2008-06-15T05:04:38.310778+10:00 fiction kernel:<4> [keventd_create_kthread+0/196] keventd_create_kthread+0x0/0xc4
2008-06-15T05:04:38.310795+10:00 fiction kernel:<4> [worker_thread+240/290] worker_thread+0xf0/0x122
2008-06-15T05:04:38.310820+10:00 fiction kernel:<4> [<ffffffff8008ad26>] default_wake_function+0x0/0xe
2008-06-15T05:04:38.310845+10:00 fiction kernel:<4> [keventd_create_kthread+0/196] keventd_create_kthread+0x0/0xc4
2008-06-15T05:04:38.310869+10:00 fiction kernel:<4> [keventd_create_kthread+0/196] keventd_create_kthread+0x0/0xc4
2008-06-15T05:04:38.310890+10:00 fiction kernel:<4> [kthread+254/306] kthread+0xfe/0x132
2008-06-15T05:04:38.310911+10:00 fiction kernel:<4> [child_rip+10/17] child_rip+0xa/0x11
2008-06-15T05:04:38.310936+10:00 fiction kernel:<4> [keventd_create_kthread+0/196] keventd_create_kthread+0x0/0xc4
2008-06-15T05:04:38.310954+10:00 fiction kernel:<4> [kthread+0/306] kthread+0x0/0x132
2008-06-15T05:04:38.310972+10:00 fiction kernel:<4> [child_rip+0/17] child_rip+0x0/0x11
2008-06-15T05:04:38.310975+10:00 fiction kernel:<4>
2008-06-15T05:04:46.176360+10:00 fiction kernel:<4>ip.sh invoked oom-killer: gfp_mask=0xd0, order=1, oomkilladj=0
2008-06-15T05:04:46.176688+10:00 fiction kernel:<4>
2008-06-15T05:04:46.187169+10:00 fiction kernel:<4>Call Trace:
2008-06-15T05:04:46.190577+10:00 fiction kernel:<4> [out_of_memory+142/757] out_of_memory+0x8e/0x2f5
2008-06-15T05:04:46.190969+10:00 fiction kernel:<4> [<ffffffff8009df05>] autoremove_wake_function+0x0/0x2e
2008-06-15T05:04:46.191630+10:00 fiction kernel:<4> [__alloc_pages+581/718] __alloc_pages+0x245/0x2ce
2008-06-15T05:04:46.192083+10:00 fiction kernel:<4> [ip_conntrack:__get_free_pages+14/11954] __get_free_pages+0xe/0x71
2008-06-15T05:04:46.192103+10:00 fiction kernel:<4> [copy_process+198/5445] copy_process+0xc6/0x1545
2008-06-15T05:04:46.192430+10:00 fiction kernel:<4> [alloc_pid+494/650] alloc_pid+0x1ee/0x28a
2008-06-15T05:04:46.192455+10:00 fiction kernel:<4> [do_fork+104/391] do_fork+0x68/0x187
2008-06-15T05:04:46.193414+10:00 fiction kernel:<4> [tracesys+213/224] tracesys+0xd5/0xe0
2008-06-15T05:04:46.193438+10:00 fiction kernel:<4> [ptregscall_common+103/172] ptregscall_common+0x67/0xac
2008-06-15T05:04:46.193442+10:00 fiction kernel:<4>
2008-06-15T05:04:46.193446+10:00 fiction kernel:<6>Mem-info:
2008-06-15T05:04:46.193450+10:00 fiction kernel:<4>Node 0 DMA per-cpu:
2008-06-15T05:04:46.193457+10:00 fiction kernel:<4>cpu 0 hot: high 0, batch 1 used:0
2008-06-15T05:04:46.193578+10:00 fiction kernel:<4>cpu 0 cold: high 0, batch 1 used:0
2008-06-15T05:04:46.193584+10:00 fiction kernel:<4>cpu 1 hot: high 0, batch 1 used:0
2008-06-15T05:04:46.193588+10:00 fiction kernel:<4>cpu 1 cold: high 0, batch 1 used:0
2008-06-15T05:04:46.193592+10:00 fiction kernel:<4>cpu 2 hot: high 0, batch 1 used:0
2008-06-15T05:04:46.193609+10:00 fiction kernel:<4>cpu 2 cold: high 0, batch 1 used:0
2008-06-15T05:04:46.193613+10:00 fiction kernel:<4>cpu 3 hot: high 0, batch 1 used:0
2008-06-15T05:04:46.193629+10:00 fiction kernel:<4>cpu 3 cold: high 0, batch 1 used:0
2008-06-15T05:04:46.193646+10:00 fiction kernel:<4>Node 0 DMA32 per-cpu:
2008-06-15T05:04:46.193650+10:00 fiction kernel:<4>cpu 0 hot: high 186, batch 31 used:169
2008-06-15T05:04:46.193668+10:00 fiction kernel:<4>cpu 0 cold: high 62, batch 15 used:54
2008-06-15T05:04:46.193702+10:00 fiction kernel:<4>cpu 1 hot: high 186, batch 31 used:172
2008-06-15T05:04:46.193733+10:00 fiction kernel:<4>cpu 1 cold: high 62, batch 15 used:49
2008-06-15T05:04:46.193737+10:00 fiction kernel:<4>cpu 2 hot: high 186, batch 31 used:157
2008-06-15T05:04:46.193740+10:00 fiction kernel:<4>cpu 2 cold: high 62, batch 15 used:49
2008-06-15T05:04:46.193777+10:00 fiction kernel:<4>cpu 3 hot: high 186, batch 31 used:20
2008-06-15T05:04:46.193781+10:00 fiction kernel:<4>cpu 3 cold: high 62, batch 15 used:0
2008-06-15T05:04:46.193784+10:00 fiction kernel:<4>Node 0 Normal per-cpu:
2008-06-15T05:04:46.193791+10:00 fiction kernel:<4>cpu 0 hot: high 186, batch 31 used:2
2008-06-15T05:04:46.193795+10:00 fiction kernel:<4>cpu 0 cold: high 62, batch 15 used:61
2008-06-15T05:04:46.193798+10:00 fiction kernel:<4>cpu 1 hot: high 186, batch 31 used:19
2008-06-15T05:04:46.193802+10:00 fiction kernel:<4>cpu 1 cold: high 62, batch 15 used:56
2008-06-15T05:04:46.193805+10:00 fiction kernel:<4>cpu 2 hot: high 186, batch 31 used:34
2008-06-15T05:04:46.193809+10:00 fiction kernel:<4>cpu 2 cold: high 62, batch 15 used:59
2008-06-15T05:04:46.193813+10:00 fiction kernel:<4>cpu 3 hot: high 186, batch 31 used:170
2008-06-15T05:04:46.193818+10:00 fiction kernel:<4>cpu 3 cold: high 62, batch 15 used:0
2008-06-15T05:04:46.193822+10:00 fiction kernel:<4>Node 0 HighMem per-cpu: empty
2008-06-15T05:04:46.193826+10:00 fiction kernel:<4>Free pages:      139240kB (0kB HighMem)
2008-06-15T05:04:46.193831+10:00 fiction kernel:<4>Active:4955 inactive:4809 dirty:3 writeback:5 unstable:0 free:34810 slab:342993 mapped-file:1503 mapped-anon:7239 pagetables:4338
2008-06-15T05:04:46.193836+10:00 fiction kernel:<4>Node 0 DMA free:11116kB min:20kB low:24kB high:28kB active:0kB inactive:0kB present:10768kB pages_scanned:0 all_unreclaimable? yes
2008-06-15T05:04:46.193840+10:00 fiction kernel:<4>lowmem_reserve[]: 0 3255 4013 4013
2008-06-15T05:04:46.193846+10:00 fiction kernel:<4>Node 0 DMA32 free:80292kB min:6564kB low:8204kB high:9844kB active:0kB inactive:60kB present:3334016kB pages_scanned:3384 all_unreclaimable? yes
2008-06-15T05:04:46.193852+10:00 fiction kernel:<4>lowmem_reserve[]: 0 0 757 757
2008-06-15T05:04:46.193857+10:00 fiction kernel:<4>Node 0 Normal free:47832kB min:1524kB low:1904kB high:2284kB active:19820kB inactive:19176kB present:775680kB pages_scanned:85543 all_unreclaimable? yes
2008-06-15T05:04:46.193861+10:00 fiction kernel:<4>lowmem_reserve[]: 0 0 0 0
2008-06-15T05:04:46.193866+10:00 fiction kernel:<4>Node 0 HighMem free:0kB min:128kB low:128kB high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
2008-06-15T05:04:46.193870+10:00 fiction kernel:<4>lowmem_reserve[]: 0 0 0 0
2008-06-15T05:04:46.193877+10:00 fiction kernel:<4>Node 0 DMA: 1*4kB 5*8kB 6*16kB 5*32kB 3*64kB 3*128kB 0*256kB 0*512kB 2*1024kB 0*2048kB 2*4096kB = 11116kB
2008-06-15T05:04:46.193882+10:00 fiction kernel:<4>Node 0 DMA32: 19253*4kB 0*8kB 1*16kB 0*32kB 1*64kB 1*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 80292kB
2008-06-15T05:04:46.193887+10:00 fiction kernel:<4>Node 0 Normal: 11774*4kB 0*8kB 0*16kB 1*32kB 1*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 47832kB
2008-06-15T05:04:46.193891+10:00 fiction kernel:<4>Node 0 HighMem: empty
2008-06-15T05:04:46.194038+10:00 fiction kernel:<4>9806 pagecache pages
2008-06-15T05:04:46.194053+10:00 fiction kernel:<4>Swap cache: add 125826, delete 118596, find 42649/62217, race 0+1
2008-06-15T05:04:46.194057+10:00 fiction kernel:<4>Free swap  = 1919084kB
2008-06-15T05:04:46.194060+10:00 fiction kernel:<4>Total swap = 2040212kB
2008-06-15T05:04:46.194064+10:00 fiction kernel:<6>Free swap:       1919084kB
2008-06-15T05:04:46.204652+10:00 fiction kernel:<6>1245184 pages of RAM
2008-06-15T05:04:46.204659+10:00 fiction kernel:<6>233218 reserved pages
2008-06-15T05:04:46.204662+10:00 fiction kernel:<6>28717 pages shared
2008-06-15T05:04:46.204666+10:00 fiction kernel:<6>7648 pages swap cached
2008-06-15T05:04:46.204967+10:00 fiction kernel:<3>Out of memory: Killed process 22213 (crond).
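
The per-zone free lists in the Mem-info dump above can be cross-checked
with a few lines of Python. This is only a sketch for reading the
numbers, not a diagnosis: it re-adds the "count*size" entries for each
zone and reports how much free memory sits in blocks of at least a
given order, assuming 4kB pages. The oom-killer line reports order=1,
so the example asks about blocks of 8kB and larger. The function names
are made up for this example.

```python
import re


def parse_buddy_line(line):
    """Return {block_size_kB: count} from a 'N*SIZEkB ...' free-list summary."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}


def total_kb(blocks):
    """Total free memory in kB across all block sizes."""
    return sum(size * count for size, count in blocks.items())


def free_kb_at_or_above(blocks, order, page_kb=4):
    """Free kB held in blocks of at least 2**order pages (4kB pages assumed)."""
    need = page_kb << order
    return sum(size * count for size, count in blocks.items() if size >= need)


# Free-list summaries and reported totals copied from the log above.
ZONES = {
    "DMA": ("1*4kB 5*8kB 6*16kB 5*32kB 3*64kB 3*128kB 0*256kB 0*512kB "
            "2*1024kB 0*2048kB 2*4096kB", 11116),
    "DMA32": ("19253*4kB 0*8kB 1*16kB 0*32kB 1*64kB 1*128kB 0*256kB "
              "0*512kB 1*1024kB 1*2048kB 0*4096kB", 80292),
    "Normal": ("11774*4kB 0*8kB 0*16kB 1*32kB 1*64kB 1*128kB 0*256kB "
               "1*512kB 0*1024kB 0*2048kB 0*4096kB", 47832),
}

if __name__ == "__main__":
    for zone, (line, reported) in ZONES.items():
        blocks = parse_buddy_line(line)
        print(zone,
              "sum:", total_kb(blocks), "kB",
              "reported:", reported, "kB",
              "in order>=1 blocks:", free_kb_at_or_above(blocks, 1), "kB")
```

The sums agree with the totals the kernel printed, and in the DMA32 and
Normal zones almost all of the free memory is in single 4kB pages.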

As far as I can tell there is no rhyme or reason as to which process
is killed. As I mentioned, there were bigger problems early on, and
while those were going on the kills appeared to hit purely random
processes (well, random-looking to me). Maybe a pattern will emerge
now.

The error messages occur every ten seconds in the logs, continuously.
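
To put numbers on that spacing, the gaps between "BUG: soft lockup"
lines can be measured straight from the syslog file. The sketch below
assumes the ISO 8601 timestamp format shown in the excerpts above and
Python 3.7 or later for datetime.fromisoformat; the function names are
made up for this example, and the two sample lines are the ones quoted
in this message (separate excerpts, not consecutive log entries).

```python
import re
from datetime import datetime

# Matches the first line of each report, e.g.
# "<timestamp> <host> kernel:<3>BUG: soft lockup - CPU#3 stuck for 10s! ..."
LOCKUP = re.compile(r"^(\S+) \S+ kernel:<3>BUG: soft lockup")


def lockup_times(lines):
    """Return the timestamps of all soft lockup reports, in input order."""
    times = []
    for line in lines:
        match = LOCKUP.match(line)
        if match:
            times.append(datetime.fromisoformat(match.group(1)))
    return times


def gaps_seconds(times):
    """Seconds elapsed between each report and the next one."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    sample = [
        "2008-06-15T04:25:58.134710+10:00 fiction kernel:<3>BUG: soft lockup - "
        "CPU#3 stuck for 10s! [bond1:3095]",
        "2008-06-15T04:25:58.134718+10:00 fiction kernel:<4>CPU 3:",
        "2008-06-15T05:04:38.294604+10:00 fiction kernel:<3>BUG: soft lockup - "
        "CPU#3 stuck for 10s! [bond1:3095]",
    ]
    print(gaps_seconds(lockup_times(sample)))
```

Feeding the whole log file to lockup_times() instead of the sample
lines would show whether the interval really stays at ten seconds.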

Any pointers on what I could check or look into would be appreciated.
