Date: Wed, 04 Mar 2009 09:36:57 +0100
From: Eric Dumazet <dada1@...mosbay.com>
To: David Miller <davem@...emloft.net>
CC: kchang@...enacr.com, netdev@...r.kernel.org, cl@...ux-foundation.org
Subject: Re: Multicast packet loss

David Miller wrote:
> From: Eric Dumazet <dada1@...mosbay.com>
> Date: Sat, 28 Feb 2009 09:51:11 +0100
>
>> David, this is preliminary work, not meant for inclusion as is;
>> comments are welcome.
>>
>> [PATCH] net: sk_forward_alloc becomes an atomic_t
>>
>> Commit 95766fff6b9a78d11fc2d3812dd035381690b55d
>> (UDP: Add memory accounting) introduced a regression for high-rate
>> UDP flows, because of the extra lock_sock() in udp_recvmsg().
>>
>> In order to reduce the need for lock_sock() in the UDP receive path,
>> we might need to declare sk_forward_alloc as an atomic_t.
>>
>> udp_recvmsg() can then avoid a lock_sock()/release_sock() pair.
>>
>> Signed-off-by: Eric Dumazet <dada1@...mosbay.com>
>
> This adds new overhead for TCP, which has to hold the socket
> lock for other reasons in these paths.
>
> I don't get how an atomic_t operation is cheaper than a
> lock_sock/release_sock. Is it the case that in many
> executions of these paths only atomic_read()s are necessary?
>
> I actually think this scheme is racy. There is a reason we
> have to hold the socket lock when doing memory scheduling.
> Two threads can get in there and say "hey, I have enough space
> already" even though only enough space is allocated for one
> of their requests.
>
> What did I miss? :)

I believe you are right, and in fact I was about to post a "don't look
at this patch" follow-up, since it doesn't help multicast reception at
all. I redid the tests more carefully and got nothing but noise.

We have a cache-line ping-pong mess here and need more thinking.

I rewrote Kenny's program to use non-blocking sockets. The receivers do:

	int delay = 50;

	/* s, buf, res and npackets are set up by the surrounding program */
	fcntl(s, F_SETFL, O_NDELAY);	/* switch the socket to non-blocking */
	while (1) {
		struct sockaddr_in from;
		socklen_t fromlen = sizeof(from);

		res = recvfrom(s, buf, 1000, 0,
			       (struct sockaddr *)&from, &fromlen);
		if (res == -1) {
			/* queue empty: back off a little more */
			delay++;
			usleep(delay);
			continue;
		}
		/* got a packet: shrink the sleep again, floor at 40 us */
		if (delay > 40)
			delay--;
		++npackets;
	}

With this little user-space change and 8 receivers on my dual quad
core, ksoftirqd takes only 8% of one CPU and there are no drops at all
(instead of 100% CPU and 30% drops).

So this is definitely a problem of scheduler cache-line ping-pong mixing
with network-stack cache-line ping-pong.

We could reorder struct sock fields so that fewer cache lines are
touched by the softirq processing; I tried this but still got packet
drops.
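As a rough illustration of that reordering idea, here is a user-space
sketch (the struct and field names are hypothetical, and 64 bytes is an
assumed cache-line size): fields dirtied by the softirq/producer path
are pushed onto a different cache line from fields dirtied by the
consumer path, so each path invalidates fewer of the other's lines:

	#include <stdalign.h>

	/* Hypothetical layout: group fields by which context writes them. */
	struct demo_sock {
		/* dirtied by process context (the recvmsg side) */
		unsigned long bytes_copied;
		unsigned long wakeups_seen;

		/* dirtied by softirq (the packet-receive side); alignas
		 * starts this group on its own 64-byte line, so softirq
		 * stores stop invalidating the consumer's hot line */
		alignas(64) unsigned long packets_queued;
		unsigned long rmem_charged;
	};

In the kernel, the equivalent tool for this kind of layout is the
____cacheline_aligned_in_smp annotation.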
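And to make the race pointed out in the quoted review concrete, a
minimal user-space sketch with C11 atomics (forward_alloc and
racy_charge are hypothetical stand-ins for the sk_forward_alloc
accounting, not code from the patch): the check and the charge are two
separate atomic operations, so two threads can both pass the check even
though only one request actually fits:

	#include <pthread.h>
	#include <stdatomic.h>
	#include <stdio.h>

	/* Hypothetical stand-in for sk->sk_forward_alloc: bytes of
	 * pre-charged memory available to one socket. */
	static atomic_int forward_alloc = 1000;

	static void *racy_charge(void *unused)
	{
		int size = 800;

		(void)unused;
		if (atomic_load(&forward_alloc) >= size)	/* "I have enough space" */
			atomic_fetch_sub(&forward_alloc, size);	/* ...so did the other thread */
		return NULL;
	}

	int main(void)
	{
		pthread_t t1, t2;

		pthread_create(&t1, NULL, racy_charge, NULL);
		pthread_create(&t2, NULL, racy_charge, NULL);
		pthread_join(t1, NULL);
		pthread_join(t2, NULL);
		/* can print -600: both threads charged 800 against 1000 */
		printf("forward_alloc = %d\n", atomic_load(&forward_alloc));
		return 0;
	}

Built with -pthread. Holding the socket lock makes the check and the
charge one critical section, which is what lock_sock() provides in the
real memory-scheduling code.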