Message-ID: <49F85BF1.1020501@cosmosbay.com>
Date: Wed, 29 Apr 2009 15:53:53 +0200
From: Eric Dumazet <dada1@...mosbay.com>
To: Andrew Gallatin <gallatin@...i.com>
CC: Herbert Xu <herbert@...dor.apana.org.au>,
David Miller <davem@...emloft.net>, brice@...i.com,
sgruszka@...hat.com, netdev@...r.kernel.org
Subject: Re: [PATCH] myr10ge: again fix lro_gen_skb() alignment
Andrew Gallatin wrote:
> Andrew Gallatin wrote:
>> For variety, I grabbed a different "slow" receiver. This is another
>> 2 CPU machine, but a dual-socket single-core opteron (Tyan S2895)
>>
>> processor : 0
>> vendor_id : AuthenticAMD
>> cpu family : 15
>> model : 37
>> model name : AMD Opteron(tm) Processor 252
> <...>
>> The sender was an identical machine running an ancient RHEL4 kernel
>> (2.6.9-42.ELsmp) and our downloadable (backported) driver.
>> (http://www.myri.com/ftp/pub/Myri10GE/myri10ge-linux.1.4.4.tgz)
>> I disabled LRO on the sender.
>>
>> Binding the IRQ to CPU0, and the netserver to CPU1 I see 8.1Gb/s with
>> LRO and 8.0Gb/s with GRO.
>
> With the recent patch to fix idle CPU time accounting from LKML applied,
> it is again possible to trust netperf's service demand (based on %CPU).
> So here is raw netperf output for LRO and GRO, bound as above.
>
> TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
> hail1-m.sw.myri.com (10.0.130.167) port 0 AF_INET : cpu bind
> Recv   Send   Send                         Utilization      Service Demand
> Socket Socket Message Elapsed              Send    Recv     Send    Recv
> Size   Size   Size    Time    Throughput   local   remote   local   remote
> bytes  bytes  bytes   secs.   10^6bits/s   % S     % S      us/KB   us/KB
>
> LRO:
> 87380  65536  65536   60.00   8279.36      8.10    77.55    0.160   1.535
> GRO:
> 87380  65536  65536   60.00   8053.19      7.86    85.47    0.160   1.739
>
> The difference is bigger if you disable TCP timestamps (and thus shrink
> the packet headers so they require fewer cachelines):
> LRO:
> 87380  65536  65536   60.02   7753.55      8.01    74.06    0.169   1.565
> GRO:
> 87380  65536  65536   60.02   7535.12      7.27    84.57    0.158   1.839
>
>
> As you can see, even though the raw bandwidth is very close, the
> service demand makes it clear that GRO is more expensive
> than LRO. I just wish I understood why.
>
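For reference, the receive-side service demand figures quoted above are consistent with CPU time per second divided by KB transferred per second. The sketch below recomputes them under a few assumptions that are not stated in the netperf output itself: the receiver has 2 CPUs (as described for this machine), 1 KB is 1024 bytes, and utilization is averaged over all CPUs. The function name is only illustrative.

```python
# Sketch: recompute netperf's receive-side service demand (us/KB) from the
# throughput and remote CPU utilization quoted in this thread.
# Assumptions (not stated in the netperf output): 2 CPUs on the receiver,
# 1 KB = 1024 bytes, utilization averaged over all CPUs.

NUM_CPUS = 2  # assumption: dual-socket, single-core Opteron receiver


def service_demand_us_per_kb(throughput_mbit, cpu_percent, num_cpus=NUM_CPUS):
    """CPU microseconds spent per KB transferred."""
    kb_per_sec = throughput_mbit * 1e6 / 8 / 1024          # 10^6 bits/s -> KB/s
    cpu_us_per_sec = cpu_percent / 100.0 * num_cpus * 1e6  # busy CPU us per second
    return cpu_us_per_sec / kb_per_sec


# (throughput in 10^6 bits/s, remote CPU %) copied from the quoted results.
runs = {
    "timestamps on": {"LRO": (8279.36, 77.55), "GRO": (8053.19, 85.47)},
    "timestamps off": {"LRO": (7753.55, 74.06), "GRO": (7535.12, 84.57)},
}

for label, pair in runs.items():
    lro = service_demand_us_per_kb(*pair["LRO"])
    gro = service_demand_us_per_kb(*pair["GRO"])
    print(f"{label}: LRO {lro:.3f} us/KB, GRO {gro:.3f} us/KB, "
          f"GRO/LRO = {gro / lro:.3f}")
```

Under those assumptions the recomputed values match the Recv service demand column, and GRO costs roughly 13% more CPU per KB than LRO with timestamps and about 17.5% more without them.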
What are the "vmstat 1" outputs for both tests? Any difference in, say, context switches?
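
One way to compare the two runs is to capture "vmstat 1" to a file during each test and average the "cs" (context switches per second) column. The script below is a minimal sketch of that; the function name and the file names in the usage line are hypothetical, and it assumes the usual vmstat layout in which a header line names the columns.

```python
import sys


def mean_column(vmstat_text, column="cs"):
    """Average one named column over all sample lines of `vmstat 1` output."""
    index = None
    values = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The column-name header (r b swpd ... in cs us sy ...) may repeat.
        if column in fields and not fields[0].isdigit():
            index = fields.index(column)
            continue
        # Sample lines start with a number (the "r" column).
        if index is not None and len(fields) > index and fields[0].isdigit():
            values.append(int(fields[index]))
    if not values:
        raise ValueError(f"no samples found for column {column!r}")
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical usage: python3 vmstat_mean.py lro.txt gro.txt
    for path in sys.argv[1:]:
        with open(path) as f:
            print(path, mean_column(f.read()))
```

Note that the first sample line vmstat prints reports averages since boot, so it may be worth dropping it from the capture before averaging.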