Date:	Wed, 06 Nov 2013 17:03:28 -0800
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Herbert Xu <herbert@...dor.apana.org.au>
Cc:	Ben Hutchings <bhutchings@...arflare.com>,
	David Miller <davem@...emloft.net>,
	christoph.paasch@...ouvain.be, netdev@...r.kernel.org,
	hkchu@...gle.com, mwdalton@...gle.com, mst@...hat.com,
	Jason Wang <jasowang@...hat.com>
Subject: Re: gso: Attempt to handle mega-GRO packets

On Thu, 2013-11-07 at 08:36 +0800, Herbert Xu wrote:
> On Wed, Nov 06, 2013 at 07:01:10AM -0800, Eric Dumazet wrote:
> > Have you thought about arches having PAGE_SIZE=65536, and how bad it is
> > to use a full page per network frame ? It is lazy and x86 centered.
> 
> So instead if we were sending a full 64K packet on such an arch to
> another guest, we'd now chop it up into 1.5K chunks and reassemble them.
> 

Yep, and speed is now better than before the patches.

I understand you do not believe it. But this is the truth.

And now your guest can receive a bunch of small UDP frames without
having to drop them because the sk->rcvbuf limit is hit.
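
To make that concrete, a purely illustrative back-of-the-envelope (the
numbers below are assumptions of mine, not measurements from this
thread): if each small frame charges a whole 64KB page to skb->truesize,
the receive buffer fills after only a handful of frames, while small
coalesced frags allow far more to be queued.

	/* Illustrative only: how many queued frames fit under sk_rcvbuf when
	 * each skb's truesize charges a whole 64KB page vs. a small frag.
	 * The rcvbuf value is an assumed default, not a measured one.
	 */
	#include <stdio.h>

	int main(void)
	{
		unsigned int rcvbuf  = 212992; /* assumed default sk_rcvbuf */
		unsigned int page64k = 65536;  /* truesize when a frame pins a 64KB page */
		unsigned int frag2k  = 2048;   /* truesize with a small coalesced frag */

		printf("frames before drop, full-page truesize: %u\n", rcvbuf / page64k);
		printf("frames before drop, 2KB frag truesize:  %u\n", rcvbuf / frag2k);
		return 0;
	}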

> > So after our patches, we now have an optimal situation, even on these
> > arches.
> 
> Optimal only for physical incoming packets with no jumbo frames.

Have you actually tested this?

> 
> What's worse, I now realise that the coalesce thing isn't even
> guaranteed to work.  It probably works in your benchmarks because
> you're working with freshly allocated pages.
> 

Oh well.

> But once the system has been running for a while, I see nothing
> in the virtio_net code that tries to prevent fragmentation.  Once
> fragmentation sets in, you'll be back in the terrible situation
> that we were in prior to the coalesce patch.
> 

There is no fragmentation, since we allocate 32KB pages.
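
(A minimal user-space sketch of the page_frag idea, with made-up names
and no refcounting; the real allocator keeps the old block alive while
buffers are still in flight:)

	/* Carve small receive buffers out of one 32KB block so each frame
	 * does not pin a whole page.  Purely illustrative. */
	#include <stdlib.h>
	#include <stddef.h>

	struct frag_cache {
		char   *block;   /* 32KB backing allocation */
		size_t  offset;  /* next free byte */
		size_t  size;    /* total block size */
	};

	void *frag_alloc(struct frag_cache *c, size_t len)
	{
		if (!c->block || c->offset + len > c->size) {
			/* The kernel refcounts the old block; this sketch just
			 * forgets it for brevity. */
			c->size   = 32768;
			c->block  = malloc(c->size);
			c->offset = 0;
			if (!c->block)
				return NULL;
		}
		void *buf = c->block + c->offset;
		c->offset += len;
		return buf;
	}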

Michael Dalton has been working on a patch that adds an EWMA to
auto-size the receive buffers and a private page_frag per virtio queue,
instead of using the per-cpu one.
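
(Rough sketch of the EWMA-driven sizing idea; the weight and the bounds
here are illustrative assumptions, not the values in Michael's patch:)

	#include <stddef.h>

	struct rxq_stats {
		unsigned long avg_pkt_len; /* running average of received lengths */
	};

	#define EWMA_WEIGHT 64 /* each new sample contributes 1/64 (assumed) */

	void ewma_add_sample(struct rxq_stats *s, unsigned long len)
	{
		if (!s->avg_pkt_len)
			s->avg_pkt_len = len;
		else
			s->avg_pkt_len += ((long)len - (long)s->avg_pkt_len) / EWMA_WEIGHT;
	}

	/* Pick the next buffer size: follow the average, clamped to sane bounds. */
	size_t next_buf_len(const struct rxq_stats *s)
	{
		size_t len = s->avg_pkt_len;

		if (len < 1536)   /* at least one MTU frame plus headers */
			len = 1536;
		if (len > 32768)  /* never more than the 32KB block */
			len = 32768;
		return len;
	}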

On x86:

- All offloads enabled (average packet size should be >> MTU-size)

net-next trunk w/ virtio_net prior to 2613af0ed (PAGE_SIZE bufs): 14179.17Gb/s
net-next trunk (MTU-size bufs): 13390.69Gb/s
net-next trunk + auto-tune: 14358.41Gb/s

- guest_tso4/guest_csum disabled (forces MTU-sized packets on receiver)

net-next trunk w/ virtio_net prior to 2613af0ed: 4059.49Gb/s
net-next trunk (MTU 1500: packet takes two bufs due to sizing bug): 4174.30Gb/s
net-next trunk (MTU 1480: packet fits in one buf): 6672.16Gb/s
net-next trunk + auto-tune (MTU 1500 fixed, packet uses one buf): 6791.28Gb/s




