netdev - Re: [PATCH] myr10ge: again fix lro_gen

Message-ID: <49F85BF1.1020501@cosmosbay.com>
Date:	Wed, 29 Apr 2009 15:53:53 +0200
From:	Eric Dumazet <dada1@...mosbay.com>
To:	Andrew Gallatin <gallatin@...i.com>
CC:	Herbert Xu <herbert@...dor.apana.org.au>,
	David Miller <davem@...emloft.net>, brice@...i.com,
	sgruszka@...hat.com, netdev@...r.kernel.org
Subject: Re: [PATCH] myr10ge: again fix lro_gen_skb() alignment

Andrew Gallatin wrote:
> Andrew Gallatin wrote:
>> For variety, I grabbed a different "slow" receiver.  This is another
>> 2 CPU machine, but a dual-socket single-core opteron (Tyan S2895)
>>
>> processor       : 0
>> vendor_id       : AuthenticAMD
>> cpu family      : 15
>> model           : 37
>> model name      : AMD Opteron(tm) Processor 252
> <...>
>> The sender was an identical machine running an ancient RHEL4 kernel
>> (2.6.9-42.ELsmp) and our downloadable (backported) driver.
>> (http://www.myri.com/ftp/pub/Myri10GE/myri10ge-linux.1.4.4.tgz)
>> I disabled LRO on the sender.
>>
>> Binding the IRQ to CPU0, and the netserver to CPU1 I see 8.1Gb/s with
>> LRO and 8.0Gb/s with GRO.
> 
> With the recent patch to fix idle CPU time accounting from LKML applied,
> it is again possible to trust netperf's service demand (based on %CPU).
> So here is raw netperf output for LRO and GRO, bound as above.
> 
> TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
> hail1-m.sw.myri.com (10.0.130.167) port 0 AF_INET : cpu bind
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
> 
> LRO:
>  87380  65536  65536    60.00      8279.36   8.10     77.55    0.160 1.535
> GRO:
>  87380  65536  65536    60.00      8053.19   7.86     85.47    0.160 1.739
> 
> The difference is bigger if you disable TCP timestamps (and thus shrink
> the packet headers down so they require fewer cachelines):
> LRO:
>  87380  65536  65536    60.02      7753.55   8.01     74.06    0.169 1.565
> GRO:
>  87380  65536  65536    60.02      7535.12   7.27     84.57    0.158 1.839
> 
> 
> As you can see, even though the raw bandwidth is very close, the
> service demand makes it clear that GRO is more expensive
> than LRO.  I just wish I understood why.
> 
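For anyone wanting to sanity-check the service demand column quoted above, here is a rough sketch of how the figures appear to relate to throughput and %CPU. It assumes 1 KB = 1024 bytes and that the utilization is an average over both CPUs of the receiver; the function name and the num_cpus parameter are made up for illustration and are not netperf's own code. Under those assumptions the remote numbers from the first run come out at about 1.535 and 1.739 us/KB, i.e. GRO costing roughly 13% more CPU per KB.

```python
def service_demand_us_per_kb(throughput_mbps, cpu_util_pct, num_cpus=2):
    """CPU microseconds consumed per KB transferred.

    Assumptions (not taken from netperf's source): 1 KB = 1024 bytes,
    and cpu_util_pct is averaged across num_cpus CPUs.
    """
    kb_per_sec = throughput_mbps * 1e6 / 8 / 1024
    cpu_us_per_sec = cpu_util_pct / 100.0 * num_cpus * 1e6
    return cpu_us_per_sec / kb_per_sec


# Receiver-side ("remote") figures from the timestamps-enabled run above.
lro = service_demand_us_per_kb(8279.36, 77.55)
gro = service_demand_us_per_kb(8053.19, 85.47)
print(f"LRO {lro:.3f} us/KB, GRO {gro:.3f} us/KB, ratio {gro / lro:.3f}")
```
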

What does "vmstat 1" show during both tests? Is there any difference in, say, context switches?
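If it helps with the comparison, a small sketch like the following could average the "cs" column of a captured "vmstat 1" log for each run. The helper name is hypothetical and the sample text is synthetic, made up only to exercise the parser; it is not output from the machines in this thread. The first data row is skipped because vmstat's first report is an average since boot.

```python
def mean_context_switches(vmstat_text):
    """Average the 'cs' column of `vmstat 1` output, skipping the first sample."""
    cs_index = None
    samples = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        if "cs" in fields:
            # Column header row (vmstat repeats it periodically).
            cs_index = fields.index("cs")
            continue
        if cs_index is None or len(fields) <= cs_index:
            continue
        if not fields[cs_index].isdigit():
            continue
        samples.append(int(fields[cs_index]))
    samples = samples[1:]  # drop the since-boot average
    return sum(samples) / len(samples) if samples else 0.0


# Synthetic sample, for illustration only.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 100000  20000 300000    0    0     5     3  100  200  1  1 98  0  0
 1  0      0 100000  20000 300000    0    0     0     0 9000 1500  0 40 60  0  0
 1  0      0 100000  20000 300000    0    0     0     0 9100 2500  0 42 58  0  0
"""
print(mean_context_switches(SAMPLE))
```

Running it once on the LRO log and once on the GRO log would give two averages to compare directly.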

