Message-ID: <47EB46BD.2080001@cosmosbay.com>
Date: Thu, 27 Mar 2008 08:03:25 +0100
From: Eric Dumazet <dada1@...mosbay.com>
To: David Miller <davem@...emloft.net>
CC: denys@...p.net.lb, netdev@...r.kernel.org, kaber@...sh.net,
netfilter-devel@...r.kernel.org
Subject: Re: kernel 2.6.25-rc7 highly unstable on high load
David Miller wrote:
> From: "Denys Fedoryshchenko" <denys@...p.net.lb>
> Date: Thu, 27 Mar 2008 08:35:06 +0200
>
>> It seems I am having very bad luck with 2.6.27. As Linus said, it has to be
>> released soon, but it is crashing like hell under high network load.
>
> That's amazing, you've taken a trip into the future and are running
> 2.6.27 already, please let me borrow your time machine :-)
>
> More seriously, there is obviously something very unique to your
> setup or else everyone would be reporting this crash, and we have
> to find out what that might be.
>
> There seems to be a bunch of netfilter stuff in your traces, but
> the top of the trace is somewhere totally unrelated. This is
> a common recurrence in your crash traces, making them less
> useful than they could be.
>
> I know you asked before what can be done to improve the traces,
> but I'm not an x86 expert so I have no idea how to help you
> in that area.
>
> Patrick, could you see if you can make any sense of his log?
> I see conntrack a lot in the backtraces.

I can see rt_garbage_collect() involved here. It might explain very long
delays in softirq processing, and eventually the crashes...

Denys, could you post the output of:

grep . /proc/sys/net/ipv4/route/*
rtstat -c1 -i10

so that we can check whether you should first change the route cache tunables :)

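If it is easier to collect the first set of values from a script, the following is a minimal Python sketch of roughly what the grep above prints. The function name read_tunables is hypothetical, and the sketch assumes each entry under /proc/sys/net/ipv4/route holds a short text value; entries that cannot be read are simply skipped.

```python
from pathlib import Path


def read_tunables(directory="/proc/sys/net/ipv4/route"):
    """Return {file name: contents} for every readable, non-empty file.

    Hypothetical helper: it mimics `grep . <directory>/*` by keeping only
    entries that contain some text.
    """
    base = Path(directory)
    values = {}
    if not base.is_dir():
        # Directory missing (different kernel or not Linux): nothing to report.
        return values
    for entry in sorted(base.iterdir()):
        if not entry.is_file():
            continue
        try:
            text = entry.read_text().strip()
        except OSError:
            # Assumption: some entries may be write-only and refuse reads.
            continue
        if text:
            values[entry.name] = text
    return values


if __name__ == "__main__":
    for name, value in read_tunables().items():
        print(f"{name}:{value}")
```

This only covers the sysctl values; the rtstat counters still have to be collected with the rtstat command itself.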
>
> Thanks.
>
>> Here is the message I got over syslog from the last crash (it was
>> 2.6.25-rc6-git6), also available at http://www.nuclearcat.com/files/crash_2.6.25.txt
>>
>> Mar 26 02:27:14 ROUTER [ 4698.694693] BUG: NMI Watchdog detected LOCKUP on CPU1, ip c02ad109, registers:
>> Mar 26 02:27:14 ROUTER [ 4698.694693] Process snmpd (pid: 2327, ti=c092e000 task=f7459080 task.ti=f70b7000)
>> Mar 26 02:27:14 ROUTER [ 4698.694693] Stack: c092eb14 c011991e f750d600 f750d600 c0378058 00000001 c092eb34 c0119b3b
>> Mar 26 02:27:14 ROUTER [ 4698.694693]        00000000 00000001 00000082 f708af88 c0378058 00000001 c092eb3c c0119bfe
>> Mar 26 02:27:14 ROUTER [ 4698.694693]        c092eb50 c012f19c 00000000 f708af88 c0378058 c092eb74 c011652a 00000000
>> Mar 26 02:27:14 ROUTER [ 4698.694693] Call Trace:
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c011991e>] ? task_rq_lock+0x31/0x58
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c0119b3b>] ? try_to_wake_up+0x19/0xd1
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c0119bfe>] ? default_wake_function+0xb/0xd
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c012f19c>] ? autoremove_wake_function+0xf/0x33
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c011652a>] ? __wake_up_common+0x2f/0x5a
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c01189b8>] ? __wake_up+0x28/0x3b
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c01201a3>] ? wake_up_klogd+0x2e/0x31
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c012033d>] ? release_console_sem+0x197/0x19f
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c0120747>] ? vprintk+0x295/0x2e5
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<f899634c>] ? death_by_timeout+0x8b/0xa3 [nf_conntrack]
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<f8999d08>] ? tcp_packet+0x931/0x9e5 [nf_conntrack]
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c01207ac>] ? printk+0x15/0x17
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c011fb65>] ? warn_on_slowpath+0x2a/0x51
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c011764a>] ? __update_rq_clock+0x1c/0x126
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c0116ab3>] ? update_curr+0x48/0x64
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<f89961ed>] ? nf_ct_invert_tuple+0x63/0x6f [nf_conntrack]
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<f8996cca>] ? nf_conntrack_tuple_taken+0xf8/0x100 [nf_conntrack]
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<f899850c>] ? __nf_ct_helper_find+0x2c/0x90 [nf_conntrack]
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<f8996b95>] ? nf_conntrack_alter_reply+0x4a/0x87 [nf_conntrack]
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<f8974976>] ? nf_nat_setup_info+0x3cc/0x55a [nf_nat]
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c011701c>] ? dequeue_rt_entity+0x88/0x171
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c0117127>] ? dequeue_rt_stack+0x22/0x27
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c0117425>] ? enqueue_task_rt+0x19/0x2c
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c011617f>] ? enqueue_task+0xd/0x18
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c01161c0>] ? activate_task+0x1e/0x2b
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c0119bb1>] ? try_to_wake_up+0x8f/0xd1
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c0119c1b>] ? wake_up_process+0xf/0x11
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c013dfa1>] ? softlockup_tick+0x9d/0x10b
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c0126f5c>] ? run_local_timers+0x17/0x19
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c01272fa>] ? update_process_times+0x24/0x49
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c0135f4c>] ? tick_periodic+0x62/0x6e
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c0135f71>] ? tick_handle_periodic+0x19/0x68
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c010e87b>] ? smp_apic_timer_interrupt+0x6c/0x81
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c0104344>] ? apic_timer_interrupt+0x28/0x30
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c02ad202>] ? _spin_lock_bh+0x20/0x22
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c02751fa>] ? rt_garbage_collect+0x132/0x27a
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c0262d95>] ? dst_alloc+0x19/0x63
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c0276eb1>] ? ip_route_input+0x6b9/0xbd9
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c0278898>] ? ip_rcv_finish+0x2c/0x29a
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c0278ef8>] ? ip_rcv+0x202/0x22c
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c025ee4e>] ? netif_receive_skb+0x33e/0x3a9
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c02612c2>] ? process_backlog+0x62/0xb5
>> Mar 26 02:27:14 ROUTER [ 4698.694693] [<c0260d27>] ? net_rx_action+0x8f/0x191
>> Mar 26 02:27:14 ROUTER [ 4698.694694] [<c01240a7>] ? __do_softirq+0x64/0xcd
>> Mar 26 02:27:14 ROUTER [ 4698.694694] [<c0105f0a>] ? do_softirq+0x55/0x89
>> Mar 26 02:27:14 ROUTER [ 4698.694694] [<c0123f88>] ? local_bh_enable+0x61/0x6d
>> Mar 26 02:27:14 ROUTER [ 4698.694694] [<c0257689>] ? lock_sock_nested+0x83/0x8b
>> Mar 26 02:27:14 ROUTER [ 4698.694694] [<c0292e58>] ? udp_destroy_sock+0xd/0x20
>> Mar 26 02:27:14 ROUTER [ 4698.694694] [<c0257b9e>] ? sk_common_release+0x15/0x60
>> Mar 26 02:27:14 ROUTER [ 4698.694694] [<c02924a4>] ? udp_lib_close+0x8/0xa
>> Mar 26 02:27:14 ROUTER [ 4698.694694] [<c0299006>] ? inet_release+0x42/0x48
>> Mar 26 02:27:14 ROUTER [ 4698.694694] [<c025625b>] ? sock_release+0x14/0x60
>> Mar 26 02:27:14 ROUTER [ 4698.694694] [<c02565d9>] ? sock_close+0x29/0x30
>> Mar 26 02:27:14 ROUTER [ 4698.694694] [<c015a6a2>] ? __fput+0x93/0x135
>> Mar 26 02:27:14 ROUTER [ 4698.694694] [<c015a8e2>] ? fput+0x17/0x19
>> Mar 26 02:27:14 ROUTER [ 4698.694694] [<c01583dc>] ? filp_close+0x47/0x51
>> Mar 26 02:27:14 ROUTER [ 4698.694694] [<c0159414>] ? sys_close+0x68/0x9d
>> Mar 26 02:27:14 ROUTER [ 4698.694694] [<c0103876>] ? sysenter_past_esp+0x5f/0x85
>> Mar 26 02:27:14 ROUTER [ 4698.694694] =======================
>> Mar 26 02:27:14 ROUTER [ 4698.694694] Code: 94 c0 84 c0 b9 01 00 00 00 75 09 f0 81 02 00 00 00 01 30 c9 5d 89 c8 c3 55 ba 00 01 00 00 89 e5 f0 66 0f c1 10 38 f2 74 06 f3 90 <8a> 10 eb f6 5d c3 55 89 e5 f0 81 28 00 00 00 01 74 05 e8 64 fd
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>