Date: Fri, 05 Oct 2012 12:35:43 -0700
From: Rick Jones <rick.jones2@...com>
To: Eric Dumazet <eric.dumazet@...il.com>
CC: Herbert Xu <herbert@...dor.apana.org.au>,
    David Miller <davem@...emloft.net>,
    netdev <netdev@...r.kernel.org>,
    Jesse Gross <jesse@...ira.com>
Subject: Re: [RFC] GRO scalability

On 10/05/2012 12:00 PM, Eric Dumazet wrote:
> On Fri, 2012-10-05 at 11:16 -0700, Rick Jones wrote:
>
> Some remarks :
>
> 1) I use some 40Gbe links, thats probably why I try to improve things ;)

Path length before workarounds :)

> 2) benefit of GRO can be huge, and not only for the ACK avoidance
> (other tricks could be done for ACK avoidance in the stack)

Just how much code path is there between NAPI and the socket?  (And I
guess just how much combining are you hoping for?)

> 3) High speeds probably need multiqueue device, and each queue has its
> own GRO unit.
>
> For example on a 40Gbe, 8 queues -> 5Gbps per queue (about 400k
> packets/sec)
>
> Lets say we allow no more than 1ms of delay in GRO,

OK.  That means we can ignore HPC and FSI because they wouldn't
tolerate that kind of added delay anyway.  I'm not sure if that also
eliminates the networked storage types.

> this means we could have about 400 packets in the GRO queue (assuming
> 1500 bytes packets)

How many flows are you going to have entering via that queue?  And just
how well "shuffled" will the segments of those flows be?  That is what
it all comes down to, right?  How many (active) flows there are, and how
well shuffled they are.

If the flows aren't well shuffled, you can get away with a smallish
coalescing context.  If they are perfectly shuffled and greater in
number than your delay allowance, you get right back to square one,
with all the overhead of the GRO attempts and none of the benefit.

If the flow count is < 400, to allow a decent shot at a non-zero
combining rate on well-shuffled flows with the 400-packet limit, then
each flow is >= 12.5 Mbit/s on average at 5 Gbit/s aggregated.  And I
think you then get two segments per flow aggregated at a time.  Is that
consistent with what you expect to be the characteristics of the flows
entering via that queue?

rick jones
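(The back-of-envelope numbers traded in this exchange can be checked mechanically. The sketch below is illustrative only; it just reproduces the arithmetic quoted above: a 40 GbE link split across 8 queues, 1500-byte packets, a 1 ms GRO delay budget, and the 400-flow break-even point.)

```python
# Back-of-envelope check of the figures in the thread above.
# All inputs are the values quoted in the emails, not measured data.
link_gbps = 40        # 40Gbe link
queues = 8            # multiqueue device, one GRO unit per queue
mtu_bytes = 1500      # "assuming 1500 bytes packets"
delay_s = 1e-3        # "no more than 1ms of delay in GRO"

per_queue_bps = link_gbps * 1e9 / queues       # 5 Gbit/s per queue
pps = per_queue_bps / (mtu_bytes * 8)          # ~417k packets/sec ("about 400k")
pkts_in_window = pps * delay_s                 # ~417 packets per 1 ms ("about 400")

flows = 400                                    # break-even flow count
per_flow_bps = per_queue_bps / flows           # 12.5 Mbit/s per flow

print(round(pps), round(pkts_in_window), per_flow_bps / 1e6)
```

With fewer than 400 perfectly shuffled flows, each flow contributes more than one packet per 400-packet window, which is where the "two segments per flow aggregated at a time" estimate comes from.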