Message-ID: <49B2266C.9050701@cosmosbay.com>
Date: Sat, 07 Mar 2009 08:46:52 +0100
From: Eric Dumazet <dada1@...mosbay.com>
To: kchang@...enacr.com
CC: David Miller <davem@...emloft.net>, netdev@...r.kernel.org,
cl@...ux-foundation.org, Brian Bloniarz <bmb@...enacr.com>
Subject: Re: Multicast packet loss
Eric Dumazet wrote:
> David Miller wrote:
>> From: Eric Dumazet <dada1@...mosbay.com>
>> Date: Sat, 28 Feb 2009 09:51:11 +0100
>>
>>> David, this is a preliminary work, not meant for inclusion as is,
>>> comments are welcome.
>>>
>>> [PATCH] net: sk_forward_alloc becomes an atomic_t
>>>
>>> Commit 95766fff6b9a78d11fc2d3812dd035381690b55d
>>> (UDP: Add memory accounting) introduced a regression for high rate UDP flows,
>>> because of extra lock_sock() in udp_recvmsg()
>>>
>>> In order to reduce need for lock_sock() in UDP receive path, we might need
>>> to declare sk_forward_alloc as an atomic_t.
>>>
>>> udp_recvmsg() can avoid a lock_sock()/release_sock() pair.
>>>
>>> Signed-off-by: Eric Dumazet <dada1@...mosbay.com>
>> This adds new overhead for TCP which has to hold the socket
>> lock for other reasons in these paths.
>>
>> I don't get how an atomic_t operation is cheaper than a
>> lock_sock/release_sock. Is it the case that in many
>> executions of these paths only atomic_read()'s are necessary?
>>
>> I actually think this scheme is racy. There is a reason we
>> have to hold the socket lock when doing memory scheduling.
>> Two threads can get in there and say "hey I have enough space
>> already" even though only enough space is allocated for one
>> of their requests.
>>
>> What did I miss? :)
>>
>
> I believe you are right, and in fact I was about to post a "don't look at this patch"
> note, since it doesn't help multicast reception at all. I redid the tests more carefully
> and got nothing but noise.
>
> We have a cache line ping-pong mess here, and it needs more thinking.
>
> I rewrote Kenny's program to use non-blocking sockets.
>
> Receivers are doing:
>
> int delay = 50;
>
> fcntl(s, F_SETFL, O_NDELAY);	/* non blocking mode */
> while (1) {
> 	struct sockaddr_in from;
> 	socklen_t fromlen = sizeof(from);
>
> 	res = recvfrom(s, buf, 1000, 0, (struct sockaddr *)&from, &fromlen);
> 	if (res == -1) {
> 		/* nothing queued: back off a little and retry */
> 		delay++;
> 		usleep(delay);
> 		continue;
> 	}
> 	/* got a packet: shrink the backoff, keeping a floor */
> 	if (delay > 40)
> 		delay--;
> 	++npackets;
> }
>
> With this little user space change and 8 receivers on my dual quad core, softirqd
> only takes 8% of one cpu and there are no drops at all (instead of 100% cpu and 30% drops).
>
> So this is definitely a problem mixing scheduler cache line ping-pongs with network
> stack cache line ping-pongs.
>
> We could reorder fields so that fewer cache lines are touched by the softirq processing;
> I tried this but still got packet drops.
>
>
>
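For reference, here is a self-contained version of that modified receiver. It is a sketch only: the multicast group, port and interface below are placeholders, not Kenny's actual settings, and error handling is minimal.

/*
 * Self-contained sketch of the non-blocking receiver quoted above.
 * Group and port are placeholders.
 */
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
	char buf[1000];
	unsigned long npackets = 0;
	int delay = 50;				/* usleep() backoff, in us */
	struct sockaddr_in addr;
	struct ip_mreq mreq;
	ssize_t res;
	int s = socket(AF_INET, SOCK_DGRAM, 0);

	if (s == -1) {
		perror("socket");
		return 1;
	}

	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_ANY);
	addr.sin_port = htons(12345);			/* placeholder port */
	if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
		perror("bind");
		return 1;
	}

	mreq.imr_multiaddr.s_addr = inet_addr("239.1.1.1");	/* placeholder group */
	mreq.imr_interface.s_addr = htonl(INADDR_ANY);
	if (setsockopt(s, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq)) == -1) {
		perror("IP_ADD_MEMBERSHIP");
		return 1;
	}

	fcntl(s, F_SETFL, O_NDELAY);			/* non blocking mode */

	while (1) {
		struct sockaddr_in from;
		socklen_t fromlen = sizeof(from);

		res = recvfrom(s, buf, sizeof(buf), 0,
			       (struct sockaddr *)&from, &fromlen);
		if (res == -1) {
			/* queue empty: back off a little and retry */
			delay++;
			usleep(delay);
			continue;
		}
		/* got a packet: shrink the backoff, keeping a floor */
		if (delay > 40)
			delay--;
		++npackets;
	}
	return 0;
}

The idea is that the receivers poll with a small usleep() backoff instead of sleeping on the socket wait queue, so the softirq never has to perform a wakeup (and touch a remote runqueue) for each packet.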
I have more questions:

What is the maximum latency you can afford on the delivery of the packet(s)?
Are the user apps using real-time scheduling?

I had an idea: keep the cpu handling NIC interrupts only delivering packets to
socket queues, not messing with the scheduler: fast queueing, then wake up
a workqueue (on another cpu) to perform the scheduler work. But that means
some extra latency (on the order of 2 or 3 us, I guess).

We could enter this mode automatically if the NIC rx handler *sees* more than
N packets waiting in the NIC queue: in the case of moderate or light traffic, no
extra latency would be added. This would mean some changes in the NIC driver.

Hmm, then again, if the NIC rx handler is run from ksoftirqd, we already know
we are in a stress situation, so maybe no driver changes are necessary:
just test whether we are running in ksoftirqd...
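To make the workqueue idea above a bit more concrete, a very rough sketch could look like the following. It is not against any real driver; the names rx_deliver_deferred, pending_wakeups and WAKEUP_CPU are made up for illustration, and socket memory accounting and error paths are ignored.

/*
 * Rough sketch only: the rx cpu queues the packet on the socket and defers
 * the sk_data_ready() wakeup (the part touching the scheduler) to a work
 * item running on another cpu.
 */
#include <linux/list.h>
#include <linux/skbuff.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/workqueue.h>
#include <net/sock.h>

#define WAKEUP_CPU	1		/* cpu doing the scheduler work */

struct pending_sk {
	struct list_head list;
	struct sock *sk;
};

static LIST_HEAD(pending_wakeups);	/* sockets with deferred wakeups */
static DEFINE_SPINLOCK(pending_lock);

static void wakeup_work_fn(struct work_struct *work);
static DECLARE_WORK(wakeup_work, wakeup_work_fn);

/* Called from the NIC rx / softirq path: fast queueing, no wakeup here */
static void rx_deliver_deferred(struct sock *sk, struct sk_buff *skb)
{
	struct pending_sk *p = kmalloc(sizeof(*p), GFP_ATOMIC);

	skb_queue_tail(&sk->sk_receive_queue, skb);
	if (p) {
		sock_hold(sk);
		p->sk = sk;
		spin_lock(&pending_lock);
		list_add_tail(&p->list, &pending_wakeups);
		spin_unlock(&pending_lock);
	}
	schedule_work_on(WAKEUP_CPU, &wakeup_work);
}

/* Runs on WAKEUP_CPU: perform the wakeups on behalf of the rx cpu */
static void wakeup_work_fn(struct work_struct *work)
{
	struct pending_sk *p, *tmp;
	LIST_HEAD(todo);

	spin_lock_bh(&pending_lock);
	list_splice_init(&pending_wakeups, &todo);
	spin_unlock_bh(&pending_lock);

	list_for_each_entry_safe(p, tmp, &todo, list) {
		p->sk->sk_data_ready(p->sk, 0);
		sock_put(p->sk);
		kfree(p);
	}
}

The rx cpu then never touches the remote runqueues itself; the price is the extra couple of microseconds before the readers are woken, as mentioned above.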
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html