Message-ID: <49B2266C.9050701@cosmosbay.com>
Date:	Sat, 07 Mar 2009 08:46:52 +0100
From:	Eric Dumazet <dada1@...mosbay.com>
To:	kchang@...enacr.com
CC:	David Miller <davem@...emloft.net>, netdev@...r.kernel.org,
	cl@...ux-foundation.org, Brian Bloniarz <bmb@...enacr.com>
Subject: Re: Multicast packet loss

Eric Dumazet wrote:
> David Miller wrote:
>> From: Eric Dumazet <dada1@...mosbay.com>
>> Date: Sat, 28 Feb 2009 09:51:11 +0100
>>
>>> David, this is preliminary work, not meant for inclusion as is;
>>> comments are welcome.
>>>
>>> [PATCH] net: sk_forward_alloc becomes an atomic_t
>>>
>>> Commit 95766fff6b9a78d11fc2d3812dd035381690b55d
>>> (UDP: Add memory accounting) introduced a regression for high-rate UDP flows,
>>> because of the extra lock_sock() in udp_recvmsg().
>>>
>>> In order to reduce the need for lock_sock() in the UDP receive path, we might
>>> need to declare sk_forward_alloc as an atomic_t.
>>>
>>> udp_recvmsg() can avoid a lock_sock()/release_sock() pair.
>>>
>>> Signed-off-by: Eric Dumazet <dada1@...mosbay.com>
>> This adds new overhead for TCP which has to hold the socket
>> lock for other reasons in these paths.
>>
>> I don't get how an atomic_t operation is cheaper than a
>> lock_sock/release_sock.  Is it the case that in many
>> executions of these paths only atomic_read()'s are necessary?
>>
>> I actually think this scheme is racy.  There is a reason we
>> have to hold the socket lock when doing memory scheduling.
>> Two threads can get in there and say "hey I have enough space
>> already" even though only enough space is allocated for one
>> of their requests.
>>
>> What did I miss? :)
>>
> 
> I believe you are right, and in fact I was about to post a "don't look at this patch",
> since it doesn't help the multicast reception at all; I redid the tests more carefully
> and got nothing but noise.
> 
> We have a cache-line ping-pong mess here, and it needs more thinking.
> 
> I rewrote Kenny's program to use non-blocking sockets.
> 
> Receivers are doing:
> 
>         char buf[1000];
>         ssize_t res;
>         unsigned long npackets = 0;
>         int delay = 50;
> 
>         /* non-blocking reads, with an adaptive sleep when the queue is empty */
>         fcntl(s, F_SETFL, fcntl(s, F_GETFL) | O_NDELAY);
>         while (1) {
>             struct sockaddr_in from;
>             socklen_t fromlen = sizeof(from);
> 
>             res = recvfrom(s, buf, sizeof(buf), 0, (struct sockaddr *)&from, &fromlen);
>             if (res == -1) {
>                 delay++;            /* nothing queued: back off a bit more */
>                 usleep(delay);
>                 continue;
>             }
>             if (delay > 40)
>                 delay--;            /* packets flowing: shrink the sleep again */
>             ++npackets;
>         }
> 
> With this little user-space change and 8 receivers on my dual quad core, ksoftirqd
> only takes 8% of one cpu and there are no drops at all (instead of 100% cpu and 30% drops).
> 
> So this is definitely a problem of scheduler cache-line ping-pong mixing with
> network-stack cache-line ping-pong.
> 
> We could reorder struct sock fields so that fewer cache lines are touched by the
> softirq processing (a rough sketch of the idea follows this quote); I tried this
> but still got packet drops.
> 
> 
> 
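
To make the field-reordering idea quoted above a bit more concrete, this is the
kind of regrouping I have in mind. Illustration only: the struct and the field
names are made up for the example, this is not the real struct sock layout.

/* Group the fields the rx softirq dirties into as few cache lines as
 * possible, and keep them away from the fields user space and the send
 * path dirty, so the two sides stop bouncing the same lines between cpus.
 */
#include <linux/cache.h>
#include <linux/skbuff.h>
#include <net/sock.h>

struct example_sock {
	/* fields written from the receive softirq path */
	struct sk_buff_head	receive_queue;
	int			forward_alloc;
	atomic_t		rmem_alloc;
	void			(*data_ready)(struct sock *sk, int bytes);

	/* user-space / send-path fields start on their own cache line */
	int			sndbuf ____cacheline_aligned_in_smp;
	atomic_t		wmem_alloc;
	struct sk_buff_head	write_queue;
};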

I have more questions:

What is the maximum latency you can afford on the delivery of the packet(s)?

Are the user apps using real-time scheduling?

I had an idea: keep the CPU handling NIC interrupts only delivering packets to
socket queues, and not messing with the scheduler. Fast queueing, then waking up
a workqueue (on another CPU) to perform the scheduler work. But that means some
extra latency (on the order of 2 or 3 us, I guess).
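
A very rough sketch of what I mean, just to show the shape of the idea. Nothing
here is wired into a real driver, and all the names are mine:

/* The cpu taking NIC interrupts only queues packets and records which
 * sockets need a wakeup; the wakeups (and hence the scheduler work) run
 * from a work item pinned to another cpu.
 */
#include <linux/workqueue.h>
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/slab.h>
#include <net/sock.h>

struct deferred_wakeup {
	struct list_head	list;
	struct sock		*sk;
};

static LIST_HEAD(pending_wakeups);
static DEFINE_SPINLOCK(pending_lock);
static struct work_struct wakeup_work;
static int wakeup_cpu = 1;	/* any cpu but the one taking NIC interrupts */

/* runs in process context on wakeup_cpu */
static void run_pending_wakeups(struct work_struct *work)
{
	LIST_HEAD(todo);
	struct deferred_wakeup *dw, *tmp;

	spin_lock_bh(&pending_lock);
	list_splice_init(&pending_wakeups, &todo);
	spin_unlock_bh(&pending_lock);

	list_for_each_entry_safe(dw, tmp, &todo, list) {
		dw->sk->sk_data_ready(dw->sk, 0);	/* scheduler work happens here */
		sock_put(dw->sk);
		kfree(dw);
	}
}

/* called from the rx softirq instead of sk->sk_data_ready(sk, len) */
static void defer_data_ready(struct sock *sk)
{
	struct deferred_wakeup *dw = kmalloc(sizeof(*dw), GFP_ATOMIC);

	if (!dw) {
		sk->sk_data_ready(sk, 0);	/* fall back to the normal path */
		return;
	}
	sock_hold(sk);
	dw->sk = sk;

	spin_lock(&pending_lock);
	list_add_tail(&dw->list, &pending_wakeups);
	spin_unlock(&pending_lock);

	schedule_work_on(wakeup_cpu, &wakeup_work);
}

/* call once before the rx path starts using defer_data_ready() */
static void deferred_wakeup_init(void)
{
	INIT_WORK(&wakeup_work, run_pending_wakeups);
}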

We could enter this mode automatically if the NIC rx handler *sees* more than
N packets waiting in the NIC queue: with moderate or light traffic, no extra
latency would be added. This would mean some changes in NIC drivers.

Hmm, then again, if the NIC rx handler is being run from ksoftirqd, we already
know we are in a stress situation, so maybe no driver changes are necessary:
just test whether we are running from ksoftirqd...
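
In driver terms the threshold test could look like this. Again a sketch only
(mydrv_poll()/mydrv_rx() are made-up names), and since I am not aware of an
exported helper to ask "am I running from ksoftirqd", it uses the full-budget
heuristic as the stress signal:

/* Hypothetical NAPI poll routine: if we keep consuming the full budget, the
 * rx queue is backing up (and by then we are almost certainly running from
 * ksoftirqd), so flip to deferred wakeups; drop back to immediate wakeups
 * once the backlog clears.
 */
static bool rx_stressed;	/* would be per-device in real code */

static int mydrv_poll(struct napi_struct *napi, int budget)
{
	/* mydrv_rx() is made up: pull up to 'budget' frames and either call
	 * sk_data_ready() directly or defer_data_ready() from the sketch above.
	 */
	int done = mydrv_rx(napi, budget, rx_stressed);

	rx_stressed = (done == budget);	/* full budget => backlog building up */

	if (done < budget) {
		napi_complete(napi);
		/* re-enable rx interrupts here */
	}
	return done;
}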


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
