Message-Id: <1272540952.4258.161.camel@bigi>
Date: Thu, 29 Apr 2010 07:35:52 -0400
From: jamal <hadi@...erus.ca>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: David Miller <davem@...emloft.net>, xiaosuo@...il.com,
therbert@...gle.com, shemminger@...tta.com, netdev@...r.kernel.org,
Eilon Greenstein <eilong@...adcom.com>,
Brian Bloniarz <bmb@...enacr.com>
Subject: Re: [PATCH net-next-2.6] net: speedup udp receive path
On Thu, 2010-04-29 at 06:09 +0200, Eric Dumazet wrote:
> I dont see in your results the number of pps, number of udp ports,
> number of flows.
My test scenario is still the same: send 1M packets across 8 flows,
round-robin, at 750Kpps. The test is repeated 4-6 times and the results
averaged. The 8 flows map to 8 cpus. At any rate above 750Kpps the
driver starts dropping. The flows are {fixed dst IP, fixed src IP,
fixed src port, 8 variable dst ports}. ip_rcv and friends show up in
the profile, as we have already discussed, but I don't want to change
the test characteristics because then I can't do a fair backward
comparison. Also, I use rps mask ee so that all the cpus except the
core doing the demux (core 0) handle the flows.
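For reference, a minimal sketch of how that mask gets applied: it is
just written into the per-rx-queue rps_cpus sysfs file. The device name
(eth0) and queue (rx-0) below are placeholders, not part of my setup
description.

/*
 * Sketch only: equivalent to "echo ee > .../rps_cpus".  eth0 and rx-0
 * stand in for whatever NIC/rx queue the test box actually uses.
 */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/sys/class/net/eth0/queues/rx-0/rps_cpus", "w");

	if (!f) {
		perror("rps_cpus");
		return 1;
	}
	/* hex bitmask of cpus allowed to do RPS processing for this queue */
	fputs("ee\n", f);
	fclose(f);
	return 0;
}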
In the results, when I say "udp sink 90%" it means 90% of 750Kpps was
successfully received by the app (spread across the multiple cpus).
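To be concrete about what the sink does: each instance just binds one
of the 8 dst ports and counts what it actually receives. A rough sketch
follows; the default port and the idle-timeout exit are illustrative
and not the actual harness behind the numbers above.

/*
 * Rough sketch of one udp sink instance: bind a dst port, count packets,
 * report once traffic stops.  Received count vs. packets sent to that
 * port gives the per-port "udp sink %".
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>

int main(int argc, char **argv)
{
	int port = argc > 1 ? atoi(argv[1]) : 5000;	/* one of the 8 dst ports */
	int fd = socket(AF_INET, SOCK_DGRAM, 0);
	struct sockaddr_in addr;
	struct timeval tv = { .tv_sec = 5, .tv_usec = 0 };
	char buf[2048];
	unsigned long count = 0;

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	addr.sin_port = htons(port);

	if (fd < 0 || bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		perror("udp sink");
		return 1;
	}
	/* stop counting after 5 seconds with no packets */
	setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

	while (recv(fd, buf, sizeof(buf), 0) >= 0)
		count++;

	printf("port %d: %lu packets received\n", port, count);
	return 0;
}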
> In my latest results, I can handle more pps than before, regardless of
> rps being on or off,
Same here - even in my worst-case scenario, 88.5% of 750Kpps > 600Kpps.
Attached are the historical results to make more sense of what I am
saying: we have net-next kernels from Apr 14, Apr 23, Apr 23 with
Changli's change, Apr 28, and Apr 28 with your change. What you'll see
is that non-rps (blue) keeps getting better, while rps (orange) gets
better more slowly and then by Apr 28 is worse.
> and with various number of udp ports (one user
> thread per port), number of flows (many src addr so that rps spread
> packets on many cpus)
>
This is true for me as well, except that non-rps gets relatively better
and rps gets worse in plain net-next as of Apr 28. Sorry, I don't have
time to dissect where things changed, but I figured that if I reported
it, it might point to something obvious.
> If/when contention windows are smaller, cpu can run uncontended, and can
> consume more cycles to process more frames ?
>
> With a non yet published patch, I even can reach 600.000 pps in DDOS
> situations, instead of 400.000.
So my tests are simpler. What I was hoping to see was, at minimum, rps
maintaining its gap of 6-7% more capacity. I don't mind seeing rps get
better; if both rps and non-rps get better, that's even more
interesting.
cheers,
jamal
Download attachment "rps-hist.pdf" of type "application/pdf" (212033 bytes)