netdev - Re: [RFC] GRO scalability

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <50730217.6020206@hp.com>
Date:	Mon, 08 Oct 2012 09:40:55 -0700
From:	Rick Jones <rick.jones2@...com>
To:	Eric Dumazet <eric.dumazet@...il.com>
CC:	Herbert Xu <herbert@...dor.apana.org.au>,
	David Miller <davem@...emloft.net>,
	netdev <netdev@...r.kernel.org>, Jesse Gross <jesse@...ira.com>
Subject: Re: [RFC] GRO scalability

On 10/05/2012 01:06 PM, Eric Dumazet wrote:
> On Fri, 2012-10-05 at 12:35 -0700, Rick Jones wrote:
>
>> Just how much code path is there between NAPI and the socket?? (And I
>> guess just how much combining are you hoping for?)
>>
>
> When GRO correctly works, you can save about 30% of cpu cycles, it
> depends...
>
> Doubling MAX_SKB_FRAGS (allowing 32+1 MSS per GRO skb instead of 16+1)
> gives an improvement as well...

OK, but how much of that 30% come from where?  Each coalesced segment is 
saving the cycles between NAPI and the socket.  Each avoided ACK is 
saving the cycles from TCP to the bottom of the driver and a (share of) 
transmit completion.


> I took this 1ms delay, but I never said it was a fixed value ;)
>
> Also remember one thing, this is the _max_ delay in case your napi
> handler is flooded. This almost never happen (tm)

We can still ignore the FSI types and probably the HPC types because 
they will insist on never happens (tm) :)


>
> Not sure what you mean by shuffle. We use a hash table to locate a flow,
> but we also have a LRU list to get the packets ordered by their entry in
> the 'GRO unit'.

Whe I say shuffle I mean something along the lines of interleave.  So, 
if we have four flows, 1-4, a perfect shuffle of their segments would be 
something like:

1 2 3 4 1 2 3 4 1 2 3 4

but not well shuffled might look like

1 1 3 2 3 2 4 4 4 1 3 2

rick

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html