Message-ID: <20121030165309.GA30483@phenom.dumpdata.com>
Date: Tue, 30 Oct 2012 12:53:10 -0400
From: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Ian Campbell <Ian.Campbell@...rix.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Eric Dumazet <edumazet@...gle.com>,
"xen-devel@...ts.xen.org" <xen-devel@...ts.xen.org>
Subject: Re: [PATCH] net: allow configuration of the size of page in
__netdev_alloc_frag
On Wed, Oct 24, 2012 at 06:43:20PM +0200, Eric Dumazet wrote:
> On Wed, 2012-10-24 at 17:22 +0100, Ian Campbell wrote:
> > On Wed, 2012-10-24 at 16:21 +0100, Eric Dumazet wrote:
>
> > > If you really have such problems, why doesn't locally generated TCP
> > > traffic also have it?
> >
> > I think it does. The reason I noticed the original problem was that ssh
> > to the machine was virtually (no pun intended) unusable.
> >
> > > Your patch doesn't touch sk_page_frag_refill(), does it?
> >
> > That's right. It doesn't. When is (sk->sk_allocation & __GFP_WAIT) true?
> > Is it possible I'm just not hitting that case?
> >
>
> I hope not. GFP_KERNEL has __GFP_WAIT.
>
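For reference, the refill path in question tries a high-order page first and
only falls back to order-0 on failure - roughly the pattern below (a
from-memory sketch of the v3.7 logic with my own names, not verbatim
net/core/sock.c):

#include <linux/gfp.h>
#include <linux/types.h>

/* Sketch of the order-fallback in sk_page_frag_refill() (v3.7-ish,
 * reconstructed from memory, not verbatim source).  With GFP_KERNEL
 * (which includes __GFP_WAIT) the order-3 attempt may reclaim and
 * will usually succeed. */
#define FRAG_PAGE_ORDER	3	/* 32KB, like SKB_FRAG_PAGE_ORDER */

static bool frag_refill_sketch(gfp_t sk_allocation, struct page **pagep,
			       unsigned int *sizep)
{
	int order = FRAG_PAGE_ORDER;

	do {
		gfp_t gfp = sk_allocation;

		if (order)
			gfp |= __GFP_COMP | __GFP_NOWARN;
		*pagep = alloc_pages(gfp, order);
		if (*pagep) {
			*sizep = PAGE_SIZE << order;
			return true;
		}
	} while (--order >= 0);

	return false;	/* the real code then enters memory pressure handling */
}
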
> > Is it possible that this only affects certain traffic patterns (I only
> > really tried ssh/scp and ping)? Or perhaps it's just that the swiotlb is
> > only broken in one corner case and not the other.
>
> Could you try a netperf -t TCP_STREAM?
For fun I ran a couple of tests - I set up two machines (one r8169, the other
e1000e) and ran netperf/netserver between them. Both are running a bare-metal
kernel, and one of them has 'iommu=soft swiotlb=force' to simulate the worst
case. This is using v3.7-rc3.
The r8169 machine is booted without any extra arguments; the e1000e one uses
'iommu=soft swiotlb=force'.
So r8169 -> e1000e, I get ~940 Mbit/s (this is odd - I expected that the
e1000e on the receive side would be using the bounce buffer, but then I
realized it sets up a 'dma' pool using pci_alloc_coherent).
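To illustrate why the receive side dodges the bounce buffer: coherent
allocations come from memory the device can address directly, so later RX DMA
never needs to bounce. A minimal made-up sketch of that kind of allocation -
the names are mine, not the actual e1000e code:

#include <linux/pci.h>

/* Hypothetical driver snippet: allocate a descriptor/buffer ring from
 * coherent DMA memory.  swiotlb hands back memory below the device's
 * DMA mask here, so no per-packet bouncing is needed afterwards. */
static int example_alloc_ring(struct pci_dev *pdev, size_t size,
			      void **ring, dma_addr_t *ring_dma)
{
	*ring = pci_alloc_coherent(pdev, size, ring_dma);
	if (!*ring)
		return -ENOMEM;
	/* the NIC would then be pointed at *ring_dma via its registers */
	return 0;
}
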
The other way, e1000e -> r8169, got me around 128 Mbit/s. So it is the
sending side that ends up using the bounce buffer, and it slows down
considerably.
I also swapped the machine that had the e1000e for a tg3 - and got around
the same numbers.
So all of this points to the swiotlb. To make sure that nothing was amiss,
I wrote a little driver that allocates a compound page, sets up a DMA
mapping, does some writes, syncs, and unmaps the DMA page. It works
correctly - so the swiotlb (and the Xen variant) works just right.
Attached for your fun.
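In outline it does something like the below (a rough sketch of the same
steps with my own names, not the attached dma_test.c verbatim):

#include <linux/dma-mapping.h>
#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/string.h>

/* Map a compound page for DMA, dirty it from the CPU, sync it towards
 * the device, and unmap - the sequence the test module exercises.
 * 'dev' and 'order' stand in for whatever the real module uses. */
static int dma_roundtrip_sketch(struct device *dev, unsigned int order)
{
	size_t size = PAGE_SIZE << order;
	struct page *page;
	dma_addr_t dma;

	page = alloc_pages(GFP_KERNEL | __GFP_COMP, order);	/* compound page */
	if (!page)
		return -ENOMEM;

	dma = dma_map_page(dev, page, 0, size, DMA_TO_DEVICE);
	if (dma_mapping_error(dev, dma)) {
		__free_pages(page, order);
		return -EIO;
	}

	/* hand the buffer back to the CPU, scribble on it, then sync it
	 * to the device (for swiotlb this copies to the bounce buffer) */
	dma_sync_single_for_cpu(dev, dma, size, DMA_TO_DEVICE);
	memset(page_address(page), 0xaa, size);
	dma_sync_single_for_device(dev, dma, size, DMA_TO_DEVICE);

	dma_unmap_page(dev, dma, size, DMA_TO_DEVICE);
	__free_pages(page, order);
	return 0;
}
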
Then I decided to try v3.6.3 with the same exact parameters... and the
problem went away. The e1000e -> r8169 direction, which got around 128 Mbit/s
on v3.7-rc3, now gets ~940 Mbit/s! Still using the swiotlb bounce buffer.
>
> Because ssh uses small packets, and small TCP packets don't use frags but
> skb->head.
>
> You mentioned a 70% drop in performance, but what test have you used
> exactly?
Note: I did not provide any arguments to netperf, but it picked the test
you wanted:
> netperf -H tst019
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to tst019.dumpdata.com (192.168.101.39) port 0 AF_INET
[Attachment: dma_test.c (text/plain, 5699 bytes)]