Message-ID: <Pine.LNX.4.64.0904231036400.4832@jens.its.uu.se>
Date: Thu, 23 Apr 2009 11:12:49 +0200 (CEST)
From: Jens Laas <jens.laas@....uu.se>
To: Jesper Dangaard Brouer <hawk@...u.dk>
cc: David Miller <davem@...emloft.net>, therbert@...gle.com,
shemminger@...tta.com, Eric Dumazet <dada1@...mosbay.com>,
andi@...stfloor.org, netdev <netdev@...r.kernel.org>,
Robert Olsson <Robert.Olsson@...a.slu.se>,
Jens Laas <jens.laas@....UU.SE>, hawk@...x.dk,
jens.axboe@...cle.com
Subject: Re: [PATCH] Software receive packet steering

(09.04.22 at 22:44) Jesper Dangaard Brouer wrote the following to David Miller:
> On Wed, 22 Apr 2009, David Miller wrote:
>
>> One thought I keep coming back to is the hack the block layer
>> is using right now. It remembers which CPU a block I/O request
>> comes in on, and it makes sure the completion runs on that
>> cpu too.
>
> This is also very important for routing performance.
>
> Experiences from practical 10GbE routing tests (done by Robert's team and
> myself) reveal that we can only achieve (close to) 10Gbit/s routing
> performance when carefully making sure that the rx-queue and tx-queue run on
> the same CPU. (Not doing so really kills performance.)
>
> Currently I'm using some patches by Jens Låås that allow userspace to set up
> the rx-queue to tx-queue mapping, plus manual smp_affinity tuning. The
> problem with this approach is that it requires way too much manual tuning
> from userspace to achieve good performance.

We have a C program for setting the affinity correctly. Note that
"correctly" very much depends on your setup and what you want to do.
We started with a script for doing this, but it's a bit easier to implement
some heuristics in a proper program.
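
To give an idea, here is a minimal sketch of the core operation such a
program performs (this is not our actual tool; the IRQ number and CPU below
are made up, and the real heuristics are not shown). It just writes a hex
CPU bitmask to /proc/irq/<irq>/smp_affinity:

#include <stdio.h>
#include <stdlib.h>

static int set_irq_affinity(int irq, int cpu)
{
    char path[64];
    FILE *f;

    snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity", irq);
    f = fopen(path, "w");
    if (!f) {
        perror(path);
        return -1;
    }
    /* smp_affinity takes a hexadecimal CPU bitmask */
    fprintf(f, "%x\n", 1 << cpu);
    fclose(f);
    return 0;
}

int main(void)
{
    /* hypothetical example: bind IRQ 60 to CPU 2 */
    return set_irq_affinity(60, 2) ? EXIT_FAILURE : EXIT_SUCCESS;
}
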
The patch (which implements a concept called "flowtrunks") also requires
setup from userspace (via an ethtool ioctl). We don't actually use this in
production yet.
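
For reference, the userspace side of such a setup would look roughly like
the sketch below. The command number ETHTOOL_SET_FLOWTRUNK and the use of
struct ethtool_value are purely illustrative; the real command and argument
layout come from the flowtrunk patch and are not shown in this mail:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <linux/types.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>

#define ETHTOOL_SET_FLOWTRUNK 0x7000    /* hypothetical command number */

int main(void)
{
    struct ethtool_value eval;
    struct ifreq ifr;
    int fd;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) {
        perror("socket");
        return 1;
    }

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, "eth0", IFNAMSIZ - 1);

    eval.cmd  = ETHTOOL_SET_FLOWTRUNK;  /* hypothetical flowtrunk command */
    eval.data = 1;                      /* hypothetical flowtrunk id */
    ifr.ifr_data = (void *)&eval;

    if (ioctl(fd, SIOCETHTOOL, &ifr) < 0)
        perror("SIOCETHTOOL");

    close(fd);
    return 0;
}
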
The natural way forward would be to implement a userspace program that can
tune smp_affinity and queue-mapping (maybe via flowtrunks) together. With
knowledge of the setup and the user's preferences, such a program should be
able to tune your system for you automatically.
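
A rough sketch of the kind of heuristic such a program could apply is shown
below (illustrative only): pair rx-queue i and tx-queue i of a multi-queue
NIC onto the same CPU, so that TX work runs where the RX work happened. The
IRQ numbers are hypothetical; a real tool would discover them from
/proc/interrupts instead of hard-coding them:

#include <stdio.h>

static void set_irq_affinity(int irq, int cpu)
{
    char path[64];
    FILE *f;

    snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity", irq);
    f = fopen(path, "w");
    if (!f)
        return;
    fprintf(f, "%x\n", 1 << cpu);       /* hexadecimal CPU bitmask */
    fclose(f);
}

int main(void)
{
    /* hypothetical IRQs for a 4-queue NIC: eth0-rx-0..3 and eth0-tx-0..3 */
    int rx_irq[4] = { 60, 61, 62, 63 };
    int tx_irq[4] = { 64, 65, 66, 67 };
    int i;

    for (i = 0; i < 4; i++) {
        set_irq_affinity(rx_irq[i], i); /* rx-queue i -> CPU i */
        set_irq_affinity(tx_irq[i], i); /* tx-queue i -> CPU i */
    }
    return 0;
}
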
One advantage of flowtrunks (a generic queue/NIC to/from flowtrunk mapping)
is that we would not have to patch every supported NIC. We could also tune
the system for more than one use case (e.g. forwarding between multi-queue
NICs).

The main purpose of the flowtrunk patch was to start a discussion and to
create something concrete to help our thinking. This problem space needs to
be explored.

>
> I would like to see an approach with less manual tuning, as we basically
> "just" need to make sure that TX completion is done on the same CPU as RX. I
> would like to see some effort in this area and am willing to participate
> actively.

I don't see a problem with tuning from userspace. I think it will be hard
for the kernel to automatically tune all types of setups for all use cases.
Maybe I'm just lacking in imagination, though.

Cheers,
Jens
>
> Cheers,
> Jesper Brouer
>
> --
> -------------------------------------------------------------------
> MSc. Master of Computer Science
> Dept. of Computer Science, University of Copenhagen
> Author of http://www.adsl-optimizer.dk
> -------------------------------------------------------------------
-----------------------------------------------------------------------
'In theory, there is no difference between theory and practice.
But, in practice, there is.'
-----------------------------------------------------------------------
Jens Låås Email: jens.laas@....uu.se
ITS Phone: +46 18 471 77 03
SWEDEN
-----------------------------------------------------------------------