Message-ID: <4B0621FC.6060004@gmail.com>
Date: Fri, 20 Nov 2009 05:58:36 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: Changli Gao <xiaosuo@...il.com>
CC: "David S. Miller" <davem@...emloft.net>,
Tom Herbert <therbert@...gle.com>,
Linux Netdev List <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next-2.6] net: Xmit Packet Steering (XPS)
Changli Gao wrote:
> On Fri, Nov 20, 2009 at 7:46 AM, Eric Dumazet <eric.dumazet@...il.com> wrote:
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 9977288..9e134f6 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -2000,6 +2001,7 @@ gso:
>> */
>> rcu_read_lock_bh();
>>
>> + skb->sending_cpu = cpu = smp_processor_id();
>> txq = dev_pick_tx(dev, skb);
>> q = rcu_dereference(txq->qdisc);
>
> I think assigning cpu to skb->sending_cpu just before calling
> hard_start_xmit is better, because the CPU which dequeues the skb will
> be another one.
I want to record the application CPU, because I want the application CPU
to call sock_wfree(), not the CPU that happened to dequeue the skb to
transmit it in case of txq contention.
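
The completion side would then do something like this (a rough sketch only;
xps_defer_free() and the head[] parameter are illustrative names, only
skb->sending_cpu comes from the patch): the cpu taking the TX completion
looks at skb->sending_cpu and stages the skb for that cpu instead of
running the destructor locally.

/*
 * Sketch, not the patch itself: called where TX completion would
 * normally free the skb. head[] is this cpu's per-remote-cpu staging
 * area, drained later by xps_flush().
 */
static void xps_defer_free(struct sk_buff *skb, struct sk_buff_head *head)
{
	int cpu = skb->sending_cpu;

	if (cpu == smp_processor_id() || !cpu_online(cpu)) {
		dev_kfree_skb_any(skb);		/* destructor runs here */
		return;
	}
	/* sock_wfree() will instead run on the application cpu */
	__skb_queue_tail(&head[cpu], skb);
}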
>
>> @@ -2024,8 +2026,6 @@ gso:
>> Either shot noqueue qdisc, it is even simpler 8)
>> */
>> if (dev->flags & IFF_UP) {
>> - int cpu = smp_processor_id(); /* ok because BHs are off */
>> -
>> if (txq->xmit_lock_owner != cpu) {
>>
>> HARD_TX_LOCK(dev, txq, cpu);
>> @@ -2967,7 +2967,7 @@ static void net_rx_action(struct softirq_action *h)
>> }
>> out:
>> 	local_irq_enable();
>> -
>> + xps_flush();
>
> If there aren't any new skbs, the memory will be held forever. I know
> you want to eliminate unnecessary IPIs; how about sending an IPI only when
> the remote xps_pcpu_queues change from empty to non-empty?
I don't understand your remark, and don't see the problem yet.
I send an IPI only to cpus for which I know I have at least one skb queued.
For each cpu taking TX completion interrupts I have:
- one bitmask (xps_cpus) of the cpus I will eventually send an IPI to at the end of net_rx_action()
- one array of skb lists, one per remote cpu, allocated on the cpu's node memory thanks to __alloc_percpu() at boot time
(rough sketch of this layout below, after the flush code)
I say _eventually_ because the algo is:
+ if (cpu_online(cpu)) {
+ spin_lock(&q->list.lock);
+ prevlen = skb_queue_len(&q->list);
+ skb_queue_splice_init(&head[cpu], &q->list);
+ spin_unlock(&q->list.lock);
+ /*
+ * We hope the remote cpu will be fast enough to transfer
+ * this list to its completion queue before our
+ * next xps_flush() call
+ */
+ if (!prevlen)
+ __smp_call_function_single(cpu, &q->csd, 0);
+ continue;
So I send an IPI only if needed, once for the whole skb list.
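
To make the layout concrete, the per-cpu state roughly looks like this
(a sketch; only xps_cpus, head[], list and csd appear in the patch, the
struct names around them are illustrative):

/* One per cpu: remote cpus splice skbs into it and IPI us via csd */
struct xps_pcpu_queue {
	struct sk_buff_head	list;	/* skbs waiting to be freed on this cpu */
	struct call_single_data	csd;	/* IPI descriptor, sent only when list was empty */
};

/* One per cpu taking TX completion interrupts */
struct xps_tx_state {
	cpumask_t		xps_cpus;	/* remote cpus with skbs staged in head[] */
	struct sk_buff_head	*head;		/* nr_cpu_ids staging lists */
};

The per-cpu copies come from __alloc_percpu() at boot time, as said above.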
With my pktgen tests (no skb cloning setup), and
ethtool -C eth3 tx-usecs 1000 tx-frames 100
I really saw batches of 100 frames handed from CPU X (taking NIC interrupts) to CPU Y (the pktgen cpu).
What memory is held forever?
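
As for where the skbs end up: the target cpu drains its queue as soon as
the IPI arrives, along these lines (illustrative handler, not the patch
text):

/*
 * Runs on the application cpu (q->csd.func): free everything staged
 * for us, so skb destructors like sock_wfree() execute on the cpu
 * that originally sent the packets.
 */
static void xps_ipi_handler(void *data)
{
	struct xps_pcpu_queue *q = data;
	struct sk_buff_head list;
	struct sk_buff *skb;
	unsigned long flags;

	__skb_queue_head_init(&list);

	spin_lock_irqsave(&q->list.lock, flags);
	skb_queue_splice_init(&q->list, &list);
	spin_unlock_irqrestore(&q->list.lock, flags);

	while ((skb = __skb_dequeue(&list)) != NULL)
		dev_kfree_skb_any(skb);	/* in hard irq this defers to net_tx_action() */
}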