Message-id: <47EC399E.90804@sun.com>
Date: Thu, 27 Mar 2008 17:19:42 -0700
From: Matheos Worku <Matheos.Worku@....COM>
To: David Miller <davem@...emloft.net>
Cc: jesse.brandeburg@...el.com, jarkao2@...il.com,
netdev@...r.kernel.org
Subject: Re: 2.6.24 BUG: soft lockup - CPU#X
David Miller wrote:
> From: Matheos Worku <Matheos.Worku@....COM>
> Date: Thu, 27 Mar 2008 16:45:06 -0700
>
>
>> Brandeburg, Jesse wrote:
>>
>>> Jarek Poplawski wrote:
>>>
>>>
>>>> On Wed, Mar 26, 2008 at 01:26:00PM -0700, Matheos Worku wrote:
>>>> ...
>>>>
>>>>
>>>>> nsn57-110 login: BUG: soft lockup - CPU#2 stuck for 11s! ...
>>>>> Call Trace:
>>>>> [<ffffffff803ef5f6>] __skb_clone+0x24/0xdc
>>>>> [<ffffffff803f152e>] skb_realloc_headroom+0x30/0x63
>>>>> [<ffffffff882edd40>] :niu:niu_start_xmit+0x114/0x5af
>>>>> [<ffffffff80221995>] gart_map_single+0x0/0x70
>>>>> [<ffffffff803f5e2b>] dev_hard_start_xmit+0x1d2/0x246
>>>>> ...
>>>>>
>>>>>
>>>> Maybe I'm wrong with this again, but I wonder about this
>>>> gart_map_single on almost all traces, and probably not supposed to be
>>>> seen here. Did you try with some memory re-config/debugging?
>>>>
>>>>
>>> I have some more examples of this but with the ixgbe driver. We are
>>> running heavy bidirectional stress with multiple rx (non-napi, yeah I
>>> know) interrupts by default (and userspace irqbalance is probably on,
>>> I'll have the lab try it without)
>>>
>>>
>> I have seen the lockup on kernels 2.6.18 and newer mostly on TX traffic.
>> I have seen it on another 10G driver (off the tree niu driver sibling,
>> nxge). The nxge driver doesn't use any TX interrupts and I have seen it
>> with UDP TX, irqbalance disabled, with no irq activity at all. some
>> example traces included.
>>
>
> Interesting.
>
> Are you running uperf in a way such that there are multiple
> processors doing TX's in parallel? That might be a clue.
>
Dave,
Actually, I am running a version of the nxge driver that uses only one
TX ring with LLTX not enabled, so the driver does single-threaded TX. On
the other hand, uperf (or iperf, or netperf) is running multiple TX
connections in parallel, and the connections are bound to multiple
processors, so the senders themselves do run in parallel.
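
To make the setup concrete, here is a rough userspace model of it: several
senders running in parallel, all funnelling into one TX ring behind one
lock. This is only a sketch, not driver code. The names (run_senders,
NUM_SENDERS, PACKETS_PER_SENDER, tx_lock) are made up for illustration,
and the lock only stands in for whatever serializes TX when LLTX is not
enabled.

```python
import threading

NUM_SENDERS = 4            # hypothetical: one sender per bound processor
PACKETS_PER_SENDER = 1000  # hypothetical load per connection


def run_senders(num_senders=NUM_SENDERS, packets=PACKETS_PER_SENDER):
    """Model parallel senders sharing one TX ring behind one lock.

    Returns (packets_queued, peak_senders_inside_transmit_path).
    """
    tx_lock = threading.Lock()  # stand-in for the single TX serialization point
    ring = []                   # stand-in for the single TX ring
    state = {"inside": 0, "peak": 0}

    def start_xmit(sender_id, seq):
        # Only one sender at a time gets past this point; the others wait.
        with tx_lock:
            state["inside"] += 1
            state["peak"] = max(state["peak"], state["inside"])
            ring.append((sender_id, seq))
            state["inside"] -= 1

    def sender(sender_id):
        for seq in range(packets):
            start_xmit(sender_id, seq)

    threads = [threading.Thread(target=sender, args=(i,))
               for i in range(num_senders)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(ring), state["peak"]


if __name__ == "__main__":
    queued, peak = run_senders()
    print("packets queued:", queued, "peak senders in transmit path:", peak)
```

The point of the model is only that the senders are parallel while the
transmit path is not: every sender has to wait its turn on the one lock.
It says nothing about where the real lockup comes from.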
Regards
Matheos