Message-ID: <4B0621FC.6060004@gmail.com>
Date: Fri, 20 Nov 2009 05:58:36 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: Changli Gao <xiaosuo@...il.com>
CC: "David S. Miller" <davem@...emloft.net>,
Tom Herbert <therbert@...gle.com>,
Linux Netdev List <netdev@...r.kernel.org>
Subject: Re: [PATCH net-next-2.6] net: Xmit Packet Steering (XPS)
Changli Gao wrote:
> On Fri, Nov 20, 2009 at 7:46 AM, Eric Dumazet <eric.dumazet@...il.com> wrote:
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 9977288..9e134f6 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -2000,6 +2001,7 @@ gso:
>> */
>> rcu_read_lock_bh();
>>
>> + skb->sending_cpu = cpu = smp_processor_id();
>> txq = dev_pick_tx(dev, skb);
>> q = rcu_dereference(txq->qdisc);
>
> I think assigning cpu to skb->sending_cpu just before calling
> hard_start_xmit is better, because the CPU which dequeues the skb will
> be another one.
I want to record the application CPU, because I want the application CPU
to call sock_wfree(), not the CPU that happened to dequeue the skb to
transmit it in case of txq contention.
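
The completion side would then do something like this (a rough sketch only;
xps_defer_free() and the head[] parameter are illustrative names, only
skb->sending_cpu comes from the patch): the cpu taking the TX completion
looks at skb->sending_cpu and stages the skb for that cpu instead of
running the destructor locally.

/*
 * Sketch, not the patch itself: called where TX completion would
 * normally free the skb. head[] is this cpu's per-remote-cpu staging
 * area, drained later by xps_flush().
 */
static void xps_defer_free(struct sk_buff *skb, struct sk_buff_head *head)
{
	int cpu = skb->sending_cpu;

	if (cpu == smp_processor_id() || !cpu_online(cpu)) {
		dev_kfree_skb_any(skb);		/* destructor runs here */
		return;
	}
	/* sock_wfree() will instead run on the application cpu */
	__skb_queue_tail(&head[cpu], skb);
}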
>
>> @@ -2024,8 +2026,6 @@ gso:
>> Either shot noqueue qdisc, it is even simpler 8)
>> */
>> if (dev->flags & IFF_UP) {
>> - int cpu = smp_processor_id(); /* ok because BHs are off */
>> -
>> if (txq->xmit_lock_owner != cpu) {
>>
>> HARD_TX_LOCK(dev, txq, cpu);
>> @@ -2967,7 +2967,7 @@ static void net_rx_action(struct softirq_action *h)
>> }
>> out:
>> 	local_irq_enable();
>> -
>> + xps_flush();
>
> If there aren't any new skbs, the memory will be held forever. I know
> you want to eliminate unnecessary IPIs; how about sending an IPI only when
> the remote xps_pcpu_queues change from empty to non-empty?
I don't understand your remark, and don't see the problem yet.
I send an IPI only to cpus for which I know I have at least one skb queued.
For each cpu taking TX completion interrupts I have:
- one bitmask (xps_cpus) of the cpus I will eventually send an IPI to at the end of net_rx_action()
- one array of skb lists, one per remote cpu, allocated on the cpu's node memory thanks to __alloc_percpu() at boot time
(rough sketch of this layout below, after the flush code)
I say _eventually_ because the algo is:
+ if (cpu_online(cpu)) {
+ spin_lock(&q->list.lock);
+ prevlen = skb_queue_len(&q->list);
+ skb_queue_splice_init(&head[cpu], &q->list);
+ spin_unlock(&q->list.lock);
+ /*
+ * We hope the remote cpu will be fast enough to transfer
+ * this list to its completion queue before our
+ * next xps_flush() call
+ */
+ if (!prevlen)
+ __smp_call_function_single(cpu, &q->csd, 0);
+ continue;
So I send an IPI only if needed, once for the whole skb list.
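
To make the layout concrete, the per-cpu state roughly looks like this
(a sketch; only xps_cpus, head[], list and csd appear in the patch, the
struct names around them are illustrative):

/* One per cpu: remote cpus splice skbs into it and IPI us via csd */
struct xps_pcpu_queue {
	struct sk_buff_head	list;	/* skbs waiting to be freed on this cpu */
	struct call_single_data	csd;	/* IPI descriptor, sent only when list was empty */
};

/* One per cpu taking TX completion interrupts */
struct xps_tx_state {
	cpumask_t		xps_cpus;	/* remote cpus with skbs staged in head[] */
	struct sk_buff_head	*head;		/* nr_cpu_ids staging lists */
};

The per-cpu copies come from __alloc_percpu() at boot time, as said above.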
With my pktgen tests (no skb cloning setup), and
ethtool -C eth3 tx-usecs 1000 tx-frames 100
I really saw batches of 100 frames handed from CPU X (taking NIC interrupts) to CPU Y (the pktgen cpu).
What memory is held forever?
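
As for where the skbs end up: the target cpu drains its queue as soon as
the IPI arrives, along these lines (illustrative handler, not the patch
text):

/*
 * Runs on the application cpu (q->csd.func): free everything staged
 * for us, so skb destructors like sock_wfree() execute on the cpu
 * that originally sent the packets.
 */
static void xps_ipi_handler(void *data)
{
	struct xps_pcpu_queue *q = data;
	struct sk_buff_head list;
	struct sk_buff *skb;
	unsigned long flags;

	__skb_queue_head_init(&list);

	spin_lock_irqsave(&q->list.lock, flags);
	skb_queue_splice_init(&q->list, &list);
	spin_unlock_irqrestore(&q->list.lock, flags);

	while ((skb = __skb_dequeue(&list)) != NULL)
		dev_kfree_skb_any(skb);	/* in hard irq this defers to net_tx_action() */
}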