Message-Id: <1271271222.4567.51.camel@bigi>
Date: Wed, 14 Apr 2010 14:53:42 -0400
From: jamal <hadi@...erus.ca>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Tom Herbert <therbert@...gle.com>, netdev@...r.kernel.org,
robert@...julf.net, David Miller <davem@...emloft.net>,
Changli Gao <xiaosuo@...il.com>,
Andi Kleen <andi@...stfloor.org>
Subject: Re: rps perfomance WAS(Re: rps: question
On Wed, 2010-04-14 at 20:04 +0200, Eric Dumazet wrote:
> Yes, multiqueue is far better of course, but in case of hardware lacking
> multiqueue, RPS can help many workloads, where application has _some_
> work to do, not only counting frames or so...
Agreed. So to enumerate, the benefits come in if:
a) you have many processors
b) you have a single-queue NIC
c) at sub-threshold traffic you don't care about a little latency
d) you have a specific cache hierarchy
e) the app is working hard to process incoming messages
> RPS overhead (IPI, cache misses, ...) must be amortized by
> parallelization or we lose.
Indeed.
How well they can be amortized seems very CPU- or board-specific.
I think the main challenge for my pedantic mind is the missing details.
Is there a paper on RPS? For example, on #d above, the commit log
mentions that RPS benefits if you have certain types of "cache
hierarchy". Probably some arch with a large shared L2/L3 (maybe
inclusive) cache will benefit. Example: it does well on Nehalem and
probably Opterons (as long as you don't start stacking these things on
some interconnect like QPI or HT). But what happens when you have FSB
sharing across cores (still a very common setup)? etc etc
Can I ask what hardware you run this on?
> A ping test is not an ideal candidate for RPS, since everything is done
> at softirq level, and should be faster without RPS...
ping won't do justice to the potential of RPS, mostly because it
generates very little traffic, i.e. part #c above. It does at least help
me boot a machine with a proper setup, and it is not totally useless
because I think the cost of an IPI can be deduced from the results.
I am going to put together a UDP app with variable think-time to see
what happens. Would that be a reasonable thing to test on?
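A minimal sketch of what such a test app could look like, assuming a
plain UDP sink that spins on the CPU for a configurable "think time"
after each datagram. The names busy_wait, udp_sink and make_sink_socket
are hypothetical and only for illustration; nothing here comes from the
RPS patches themselves.

```python
import socket
import time


def busy_wait(seconds):
    """Spin on the CPU for roughly `seconds` to imitate per-message work."""
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        pass


def udp_sink(sock, think_time, max_packets):
    """Receive up to `max_packets` datagrams, thinking `think_time` s on each.

    Returns (packets_received, elapsed_seconds). Stops early if the
    socket's timeout expires, i.e. the sender has gone quiet.
    """
    received = 0
    start = time.perf_counter()
    while received < max_packets:
        try:
            sock.recvfrom(2048)
        except socket.timeout:
            break
        busy_wait(think_time)  # the variable "think time" knob
        received += 1
    return received, time.perf_counter() - start


def make_sink_socket(host="0.0.0.0", port=0, timeout=1.0):
    """Hypothetical helper: a bound UDP socket with a receive timeout."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    sock.settimeout(timeout)
    return sock
```

Running the sink at several think_time values, with and without RPS
enabled on the receiving interface, and comparing received/elapsed
should give a rough idea of where the parallelization starts to pay
for the IPI and cache-miss overhead.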
It would be valuable to have something like Documentation/networking/rps
to detail things a little more.
cheers,
jamal