netdev - Re: [PATCH v4 0/10] bql: Byte Queue Limits

Message-ID: <4ED4885F.8060309@intel.com>
Date:	Mon, 28 Nov 2011 23:23:11 -0800
From:	John Fastabend <john.r.fastabend@...el.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
CC:	Dave Taht <dave.taht@...il.com>, Tom Herbert <therbert@...gle.com>,
	"davem@...emloft.net" <davem@...emloft.net>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH v4 0/10] bql: Byte Queue Limits

On 11/28/2011 11:02 PM, Eric Dumazet wrote:
> On Tuesday, 29 November 2011 at 05:23 +0100, Dave Taht wrote:
>>> In this test 100 netperf TCP_STREAMs were started to saturate the link.
>>> A single instance of a netperf TCP_RR was run with high priority set.
>>> Queuing discipline in pfifo_fast, NIC is e1000 with TX ring size set to
>>> 1024.  tps for the high priority RR is listed.
>>>
>>> No BQL, tso on: 3000-3200K bytes in queue, 36 tps
>>> BQL, tso on: 156-194K bytes in queue, 535 tps
>>
>>> No BQL, tso off: 453-454K bytes in queue, 234 tps
>>> BQL, tso off: 66K bytes in queue, 914 tps
>>
>>
>> Jeeze. Under what circumstances is tso a win? I've always
>> had great trouble with it, as some e1000 cards do it rather badly.
>>
>> I assume these are while running at GigE speeds?
>>
>> What of 100Mbit? 10GigE? (I will duplicate your tests
>> at 100Mbit, but as for 10gigE...)
>>
> 
> TSO on means a low priority 65 Kbyte packet can be in the TX ring right
> before the high priority packet. If you can't afford the delay, you lose.
> 
> There is no mystery here.
> 
> If you want low latencies:
> - TSO must be disabled so that packets are at most one ethernet frame.
> - You adjust the BQL limit to a small value.
> - You can even lower the MTU to get even better latencies.
> 
> If you want good throughput from your [10]GigE and low cpu cost, TSO
> should be enabled.
> 
> If you want to be smart, you could have a dynamic behavior:
> 
> Leave TSO on as long as no high priority, low latency producer is running
> (if low latency packets are locally generated).
> 
> 
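
A rough back-of-the-envelope check of the numbers quoted above, as a minimal sketch. It assumes a 1 Gbit/s link (the thread only says the NIC is an e1000 and leaves the speed question open) and reads "K" as 1000 bytes; the function name `drain_time_ms` is made up for illustration. It estimates how long the link needs to drain the bytes already queued ahead of a high priority packet, and how long a single 64 KiB TSO packet occupies the wire compared with a 1500 byte frame.

```python
# Assumption: 1 Gbit/s link; the thread does not state the speed explicitly.
LINK_BPS = 1_000_000_000


def drain_time_ms(queued_bytes, link_bps=LINK_BPS):
    """Milliseconds needed to transmit queued_bytes at link_bps."""
    return queued_bytes * 8 / link_bps * 1000


# Lower bound of each "bytes in queue" figure from the thread,
# taking "K" as 1000 bytes (an assumption).
cases = {
    "No BQL, tso on": 3_000_000,
    "BQL, tso on": 156_000,
    "No BQL, tso off": 453_000,
    "BQL, tso off": 66_000,
}

for name, queued in cases.items():
    print(f"{name}: ~{drain_time_ms(queued):.2f} ms of queue ahead")

# One maximum-size TSO packet versus one standard ethernet payload.
print(f"64 KiB TSO packet: ~{drain_time_ms(65_536):.3f} ms on the wire")
print(f"1500 byte frame:   ~{drain_time_ms(1_500):.3f} ms on the wire")
```

Under these assumptions the 3000K byte queue alone accounts for about 24 ms per round trip, which is in the same range as the 36 tps reported for that case.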

I wonder if we should consider enabling TSO/GSO per queue or per traffic
class on devices that support this. At least on devices that support
multiple traffic classes, it seems to be a common use case to put bulk
storage traffic (iSCSI) on one traffic class and low latency traffic
(VoIP, for example) on a separate traffic class.
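
To make the two policies discussed here concrete, a minimal sketch follows. It is purely illustrative: `TrafficClass`, `per_class_tso` and `dynamic_tso` are hypothetical names and do not correspond to any kernel or driver interface. The first function models the per traffic class idea (bulk classes keep TSO, low latency classes do not); the second models the dynamic behavior suggested earlier in the thread for a single shared queue.

```python
from dataclasses import dataclass


@dataclass
class TrafficClass:
    name: str
    low_latency: bool  # hypothetical flag: True for e.g. VoIP, False for bulk


def per_class_tso(classes):
    """Per-class policy: TSO stays on for bulk classes only."""
    return {tc.name: not tc.low_latency for tc in classes}


def dynamic_tso(active_classes):
    """Shared-queue policy: TSO on only while no low latency producer runs."""
    return not any(tc.low_latency for tc in active_classes)


iscsi = TrafficClass("iscsi", low_latency=False)
voip = TrafficClass("voip", low_latency=True)

print(per_class_tso([iscsi, voip]))  # TSO decision for each class
print(dynamic_tso([iscsi]))          # only bulk traffic is active
print(dynamic_tso([iscsi, voip]))    # a low latency producer is active
```

The per-class variant only helps if each class maps to its own hardware queue, since a large segment in a shared TX ring would still sit ahead of the latency sensitive packet.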

John.