Date:	Wed, 04 Mar 2009 09:36:57 +0100
From:	Eric Dumazet <dada1@...mosbay.com>
To:	David Miller <davem@...emloft.net>
CC:	kchang@...enacr.com, netdev@...r.kernel.org,
	cl@...ux-foundation.org
Subject: Re: Multicast packet loss

David Miller wrote:
> From: Eric Dumazet <dada1@...mosbay.com>
> Date: Sat, 28 Feb 2009 09:51:11 +0100
> 
>> David, this is a preliminary work, not meant for inclusion as is,
>> comments are welcome.
>>
>> [PATCH] net: sk_forward_alloc becomes an atomic_t
>>
>> Commit 95766fff6b9a78d11fc2d3812dd035381690b55d
>> (UDP: Add memory accounting) introduced a regression for high rate UDP flows,
>> because of extra lock_sock() in udp_recvmsg()
>>
>> In order to reduce need for lock_sock() in UDP receive path, we might need
>> to declare sk_forward_alloc as an atomic_t.
>>
>> udp_recvmsg() can avoid a lock_sock()/release_sock() pair.
>>
>> Signed-off-by: Eric Dumazet <dada1@...mosbay.com>
> 
> This adds new overhead for TCP which has to hold the socket
> lock for other reasons in these paths.
> 
> I don't get how an atomic_t operation is cheaper than a
> lock_sock/release_sock.  Is it the case that in many
> executions of these paths only atomic_read()'s are necessary?
> 
> I actually think this scheme is racy.  There is a reason we
> have to hold the socket lock when doing memory scheduling.
> Two threads can get in there and say "hey I have enough space
> already" even though only enough space is allocated for one
> of their requests.
> 
> What did I miss? :)
> 

I believe you are right, and in fact I was about to post a "don't look at this patch"
followup, since it doesn't help multicast reception at all: I redid the tests more
carefully and got nothing but noise.

We have a cache line ping pong mess here, and need more thinking.

I rewrote Kenny's program to use non-blocking sockets.

Receivers are doing:

        int delay = 50;

        fcntl(s, F_SETFL, O_NDELAY);    /* non-blocking reads */
        while (1) {
                struct sockaddr_in from;
                socklen_t fromlen = sizeof(from);

                res = recvfrom(s, buf, 1000, 0, (struct sockaddr *)&from, &fromlen);
                if (res == -1) {
                        /* nothing queued: sleep a little longer next time */
                        delay++;
                        usleep(delay);
                        continue;
                }
                /* got a packet: poll faster again, but never below 40 us */
                if (delay > 40)
                        delay--;
                ++npackets;

With this little user-space change and 8 receivers on my dual quad core, softirqd
only takes 8% of one cpu, with no drops at all (instead of 100% cpu and 30% drops).

So this is definitely a problem of scheduler cache line ping-pong mixing with
network stack cache line ping-pong.

We could reorder fields so that fewer cache lines are touched by the softirq
processing; I tried this, but still got packet drops.

