Message-ID: <478B943C.7080009@cosmosbay.com>
Date: Mon, 14 Jan 2008 17:56:28 +0100
From: Eric Dumazet <dada1@...mosbay.com>
To: Chris Friesen <cfriesen@...tel.com>
Cc: Ray Lee <ray-lk@...rabbit.org>, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: questions on NAPI processing latency and dropped network packets
Chris Friesen wrote:
> Ray Lee wrote:
>> On Jan 10, 2008 9:24 AM, Chris Friesen <cfriesen@...tel.com> wrote:
>
>>> After a recent userspace app change, we've started seeing packets being
>>> dropped by the ethernet hardware (e1000, NAPI is enabled). The
>>> error/dropped/fifo counts are going up in ethtool:
>
>> Can you reproduce it with a simple userspace cpu hog? (Two, really,
>> one per cpu.)
>> Can you reproduce it with the newer e1000?
>
> Hmm...good questions and I haven't checked either. The first one is
> relatively straightforward. The second is a bit trickier...last time
> I tried the latest e1000 driver the card wouldn't boot (we use netboot).
>
>> Can you reproduce it with git head?
>
> Unfortunately, I don't think I'll be able to try this. We require
> kernel mods for our userspace to run, and I doubt I'd be able to get
> the time to port all the changes forward to git head.
>
>> If the answer to the first one is yes, the last no, then bisect until
>> you get a kernel that doesn't show the problem. Backport the fix,
>> unless the fix happens to be CFS. However, I suspect that your
>> userspace app is just starving the system from time to time.
>
> It's conceivable that userspace is starving the kernel, but we do have
> about 45% idle on one cpu, and 7-10% idle on the other.
>
> We also have an odd situation where on an initial test run after
> bootup we have 18-24% idle on cpu1, but resetting the test tool drops
> that to the 7-10% I mentioned above.
>
> Based on profiling and instrumentation it seems like the cost of
> sctp_endpoint_lookup_assoc() more than triples, which means that the
> amount of time that bottom halves are disabled in that function also
> triples.
Any idea how large your SCTP hash tables are?
(Your dmesg probably includes a message starting with "SCTP: Hash tables
configured...".)
How many concurrent SCTP sockets are being handled?
Maybe sctp_assoc_hashfn() is too weak for your workload, and some chains are
*really* long.