Message-Id: <1259087601.2631.56.camel@ppwaskie-mobl2>
Date: Tue, 24 Nov 2009 10:33:21 -0800
From: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@...el.com>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: David Miller <davem@...emloft.net>,
"peterz@...radead.org" <peterz@...radead.org>,
"arjan@...ux.intel.com" <arjan@...ux.intel.com>,
"yong.zhang0@...il.com" <yong.zhang0@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"arjan@...ux.jf.intel.com" <arjan@...ux.jf.intel.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH] irq: Add node_affinity CPU masks for smarter irqbalance hints
On Tue, 2009-11-24 at 10:26 -0800, Eric Dumazet wrote:
> Peter P Waskiewicz Jr wrote:
> > > That's the kind of thing PJ is trying to make available.
> >
> > Yes, that's exactly what I'm trying to do. Even further, we want to
> > allocate the ring SW struct itself and descriptor structures on other
> > NUMA nodes, and make sure the interrupt lines up with those allocations.
> >
>
> Say you allocate ring buffers on NUMA node of the CPU handling interrupt
> on a particular queue.
>
> If irqbalance or an admin changes /proc/irq/{number}/smp_affinity,
> do you want to realloc the ring buffer on another NUMA node?
>
That's why I'm trying to add the node_affinity mechanism that irqbalance
can use to prevent the interrupt from being moved to another node.
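To make that concrete, here's a rough sketch of the driver side.
irq_set_node_affinity() is a made-up name standing in for the per-IRQ
hint this patch proposes, not the actual interface; cpumask_of_node()
is the real topology helper:

	/*
	 * Hypothetical driver-side sketch (not the actual patch API):
	 * publish a per-IRQ cpumask covering the NUMA node that owns
	 * the queue's memory, so irqbalance knows where to keep it.
	 */
	#include <linux/cpumask.h>
	#include <linux/topology.h>

	static void hint_irq_to_node(unsigned int irq, int node)
	{
		/* All CPUs on the node backing this queue's rings. */
		const struct cpumask *mask = cpumask_of_node(node);

		/* irq_set_node_affinity() is hypothetical; it stands
		 * in for storing the per-IRQ hint that irqbalance
		 * reads back and balances within.
		 */
		irq_set_node_affinity(irq, mask);
	}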
> It seems complex to me; maybe the optimal thing would be to use a NUMA policy to
> spread vmalloc() allocations across all nodes to get good bandwidth...
That's exactly what we're doing in our 10GbE driver right now (it isn't
pushed upstream yet; we're still finalizing our testing). We spread
across all NUMA nodes in a semi-intelligent fashion when allocating our
rings and buffers. The last piece is ensuring the interrupts tied to the
various queues all route to the NUMA nodes those CPUs belong to.
irqbalance needs some kind of hint to do the right thing here, which
today it does not have.
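For illustration, a minimal sketch of the kind of round-robin placement
I mean; the struct and function names are made up, but kzalloc_node()
and vmalloc_node() are the real node-aware allocators:

	#include <linux/errno.h>
	#include <linux/nodemask.h>
	#include <linux/slab.h>
	#include <linux/vmalloc.h>

	struct ring {			/* hypothetical per-queue ring */
		void *desc;		/* descriptor area */
		int   node;		/* NUMA node backing this ring */
	};

	static int alloc_rings(struct ring **rings, int nr_queues,
			       size_t desc_bytes)
	{
		int q, node = first_online_node;

		for (q = 0; q < nr_queues; q++) {
			struct ring *r = kzalloc_node(sizeof(*r),
						      GFP_KERNEL, node);

			if (!r)		/* unwind omitted for brevity */
				return -ENOMEM;
			r->node = node;
			/* Keep the descriptor memory on the same node
			 * as the software ring structure.
			 */
			r->desc = vmalloc_node(desc_bytes, node);
			if (!r->desc) {
				kfree(r);
				return -ENOMEM;
			}
			rings[q] = r;
			/* Walk the online nodes round-robin. */
			node = next_online_node(node);
			if (node == MAX_NUMNODES)
				node = first_online_node;
		}
		return 0;
	}

The interrupt for queue q then gets hinted at rings[q]->node, so the
IRQ handler runs next to the memory it touches.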
I don't see how this is complex though. Driver loads, allocates across
the NUMA nodes for optimal throughput, then writes CPU masks for the
NUMA nodes each interrupt belongs to. irqbalance comes along and looks
at the new mask "hint," and then balances that interrupt within that
hinted mask.
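On the consumer side, the logic in irqbalance could be as simple as this
userspace sketch. The /proc/irq/<n>/node_affinity name follows the patch
subject, but treat the exact path and mask format as assumptions; for
brevity this handles only a single 32-bit mask word:

	#include <stdio.h>
	#include <stdlib.h>

	static unsigned int read_mask(const char *path)
	{
		unsigned int mask = 0;
		FILE *f = fopen(path, "r");

		if (f) {
			if (fscanf(f, "%x", &mask) != 1)
				mask = 0;
			fclose(f);
		}
		return mask;
	}

	int main(int argc, char **argv)
	{
		char path[64];
		unsigned int hint, chosen = 0x1; /* CPU irqbalance picked */
		int irq = argc > 1 ? atoi(argv[1]) : 0;
		FILE *f;

		snprintf(path, sizeof(path),
			 "/proc/irq/%d/node_affinity", irq);
		hint = read_mask(path);

		/* Balance only within the hinted mask: intersect, and
		 * if the chosen CPU falls outside the hint, fall back
		 * to the lowest CPU in the hint.
		 */
		if (hint) {
			chosen &= hint;
			if (!chosen)
				chosen = hint & -hint;
		}

		snprintf(path, sizeof(path),
			 "/proc/irq/%d/smp_affinity", irq);
		f = fopen(path, "w");
		if (f) {
			fprintf(f, "%x\n", chosen);
			fclose(f);
		}
		return 0;
	}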
Cheers,
-PJ