Message-id: <47ED241F.9080003@sun.com>
Date: Fri, 28 Mar 2008 10:00:15 -0700
From: Matheos Worku <Matheos.Worku@....COM>
To: hadi@...erus.ca
Cc: Herbert Xu <herbert@...dor.apana.org.au>,
David Miller <davem@...emloft.net>, jesse.brandeburg@...el.com,
jarkao2@...il.com, netdev@...r.kernel.org
Subject: Re: 2.6.24 BUG: soft lockup - CPU#X
jamal wrote:
> On Thu, 2008-27-03 at 18:58 -0700, Matheos Worku wrote:
>
>
>> In general, while the TX serialization improves performance in terms of
>> lock contention, wouldn't it reduce throughput, since only one guy is
>> doing the actual TX at any given time? Wondering if it would be
>> worthwhile to have an enable/disable option, especially for multi-queue TX.
>>
>
> Empirical evidence so far says that at some point the bottleneck is
> going to be the wire, i.e. modern CPUs are "fast enough" that sooner or
> later they will fill up the DMA ring of the transmitting driver and go
> back to doing other things.
>
> It is hard to create the condition you seem to have come across. I had
> access to a dual-core Opteron but found it very hard, with parallel UDP
> sessions, to keep the TX CPU locked in that region (while the other 3
> were busy pumping packets). My folly could have been that I had a GigE
> wire; maybe a 10G would have recreated the condition.
> If you can reproduce this at will, can you try reducing the number of
> UDP iperf senders and see when it begins to happen?
> Are all the iperfs destined out of the same netdevice?
>
I am using a 10G NIC at this time. With the same driver, I haven't come
across the lockup on a 1G NIC, though I haven't really tried to reproduce
it there. Regarding the number of connections it takes to create the
situation, I have noticed the lockup with 3 or more UDP connections.
Also, with TSO disabled, I have come across it with lots of TCP connections.
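
To make the trade-off concrete, here is a tiny self-contained userspace
model of it (pthreads; all names are made up for illustration, this is
not kernel code): N senders funnel through a single TX lock, so only one
of them is ever "in the driver" at a time, just like the serialized TX
path we are discussing.

#include <pthread.h>
#include <stdio.h>

#define NSENDERS 4

static pthread_mutex_t tx_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long tx_count;  /* packets handed to the "driver" */

/* stands in for the driver placing one packet on the DMA ring */
static void fake_hard_start_xmit(void)
{
        tx_count++;
}

static void *sender(void *arg)
{
        int i;

        (void)arg;
        for (i = 0; i < 1000000; i++) {
                pthread_mutex_lock(&tx_lock);   /* the serialization point */
                fake_hard_start_xmit();
                pthread_mutex_unlock(&tx_lock);
        }
        return NULL;
}

int main(void)
{
        pthread_t t[NSENDERS];
        int i;

        for (i = 0; i < NSENDERS; i++)
                pthread_create(&t[i], NULL, sender, NULL);
        for (i = 0; i < NSENDERS; i++)
                pthread_join(t[i], NULL);
        printf("tx_count = %lu\n", tx_count);
        return 0;
}

On 1G the wire saturates before the lock does; the open question is
whether that still holds at 10G.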
> [Typically the TX path on the driver side is inefficient either because
> of coding (e.g. unnecessary locks) or because of expensive I/O. But this
> has not mattered much thus far (given fast enough CPUs).
>
That could be true, though oprofile is not providing obvious clues, at
least not yet.
> It all could be improved by reducing the per-packet operations the
> driver incurs - as an example, the CPU (via the driver) could batch a
> set of packets to the device, then kick the device DMA once for the
> whole batch, etc.]
>
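
If it helps, the batching idea above would look roughly like this in a
driver-shaped sketch (the ring layout and every name here are
illustrative, not any real driver's API):

#define RING_SIZE 256

struct fake_ring {
        unsigned int tail;
        unsigned int desc[RING_SIZE];
};

/* fill one TX descriptor; cheap, no device I/O */
static void post_descriptor(struct fake_ring *ring, unsigned int pkt)
{
        ring->desc[ring->tail++ % RING_SIZE] = pkt;
}

/* stands in for the expensive MMIO doorbell write */
static void kick_dma(struct fake_ring *ring)
{
        (void)ring;
}

/* one doorbell kick per batch instead of one per packet */
void xmit_batch(struct fake_ring *ring, const unsigned int *pkts, int n)
{
        int i;

        for (i = 0; i < n; i++)
                post_descriptor(ring, pkts[i]);
        kick_dma(ring);
}

The per-packet cost collapses to a descriptor write, and the expensive
doorbell I/O is amortized over the whole batch.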
Regards
matheos
> cheers,
> jamal
>
>