Message-ID: <alpine.WNT.2.00.1011301152240.10832@jbrandeb-desk1.amr.corp.intel.com>
Date: Tue, 30 Nov 2010 12:01:53 -0800 (Pacific Standard Time)
From: "Brandeburg, Jesse" <jesse.brandeburg@...el.com>
To: David Miller <davem@...emloft.net>
cc: "eric.dumazet@...il.com" <eric.dumazet@...il.com>,
"therbert@...gle.com" <therbert@...gle.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"bhutchings@...arflare.com" <bhutchings@...arflare.com>
Subject: Re: [PATCH net-next-2.6] sched: use xps information for qdisc NUMA
affinity
On Tue, 30 Nov 2010, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@...il.com>
> Date: Tue, 30 Nov 2010 20:07:27 +0100
>
> [ Jesse CC:'d ]
>
> > netdev struct itself is shared by all cpus, so there is no real choice,
> > unless you know one netdev will be used by a restricted set of
> > cpus/nodes... Probably very unlikely in practice.
>
> Unfortunately Jesse has found non-trivial gains by NUMA localizing the
> netdev struct during routing tests in some configurations.
Thanks Dave. Given the results I see with current upstream (at least with
igb), I don't think netdev access is hurting performance unless the driver
is unwisely writing to netdev structures from multiple CPUs simultaneously.
I think the trick is for drivers that are concerned about this kind of
thing to keep a "hot path struct" that is used at runtime. Since the caches
on NUMA systems will still cache remote-node memory for reads, as long as
that memory is not written to, the read-only data ends up resident in each
CPU's L3.
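To make that concrete, here is a rough sketch of the kind of split I mean
(the struct and field names are made up, not actual igb code): keep the
fields the hot path only reads in one cacheline-aligned struct that every
CPU can safely keep in its cache, and put the fields the hot path writes
in a separate per-queue struct allocated on the local node.

    #include <linux/cache.h>
    #include <linux/io.h>
    #include <linux/types.h>

    /* Read-mostly at runtime: shared by all nodes, each CPU's cache
     * keeps its own clean copy because nothing writes it in the fast
     * path. */
    struct foo_hot {
    	void __iomem	*hw_addr;
    	u16		tx_ring_count;
    	u16		rx_ring_count;
    } ____cacheline_aligned_in_smp;

    /* Written in the fast path: one instance per queue, allocated on
     * the node that services that queue, so the dirty cachelines never
     * bounce between sockets. */
    struct foo_ring_stats {
    	u16		next_to_use;
    	u16		next_to_clean;
    	u64		packets;
    	u64		bytes;
    } ____cacheline_aligned_in_smp;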
> > We could change (only on NUMA setups maybe)
> >
> > struct netdev_queue *_tx;
> >
> > to a
> >
> > struct netdev_queue **_tx;
> >
> > and allocate each "struct netdev_queue" on appropriate node, but adding
> > one indirection level might be overkill...
I agree, probably overkill.
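If we did go that way, I assume the allocation side would look roughly like
this untested sketch (assuming for simplicity that queue i is serviced by
CPU i, and omitting the error unwind):

    #include <linux/netdevice.h>
    #include <linux/slab.h>
    #include <linux/topology.h>

    static int alloc_tx_queues_per_node(struct net_device *dev)
    {
    	unsigned int i;

    	/* _tx becomes an array of pointers instead of a flat array */
    	dev->_tx = kcalloc(dev->num_tx_queues, sizeof(*dev->_tx),
    			   GFP_KERNEL);
    	if (!dev->_tx)
    		return -ENOMEM;

    	for (i = 0; i < dev->num_tx_queues; i++) {
    		/* simplification: assume queue i runs on CPU i */
    		int node = cpu_to_node(i);
    		struct netdev_queue *txq;

    		txq = kzalloc_node(sizeof(*txq), GFP_KERNEL, node);
    		if (!txq)
    			return -ENOMEM;	/* error unwind omitted */
    		dev->_tx[i] = txq;
    	}
    	return 0;
    }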
> > For very hot small structures, (one or two cache lines), I am not sure
> > its worth the pain.
>
> Jesse, do you think this would help the case you were testing?
I would be glad to test, but I am currently seeing pretty good results
with upstream igb. I'll retest with the latest kernel and with
# taskset 1 insmod igb.ko
# echo 2 > /proc/irq/<igb irqs>/smp_affinity
(masks 1 and 2 select CPUs on different sockets on my machine, so the
driver's allocations land on one node while the interrupts are serviced
on the other)