Message-ID: <4ED51656.3030802@hp.com>
Date:	Tue, 29 Nov 2011 09:28:54 -0800
From:	Rick Jones <rick.jones2@...com>
To:	Eric Dumazet <eric.dumazet@...il.com>
CC:	Dave Taht <dave.taht@...il.com>, Tom Herbert <therbert@...gle.com>,
	davem@...emloft.net, netdev@...r.kernel.org
Subject: Re: [PATCH v4 0/10] bql: Byte Queue Limits

On 11/28/2011 11:02 PM, Eric Dumazet wrote:
> On Tuesday, 29 November 2011 at 05:23 +0100, Dave Taht wrote:
>>> In this test 100 netperf TCP_STREAMs were started to saturate the link.
>>> A single instance of a netperf TCP_RR was run with high priority set.
>>> Queuing discipline is pfifo_fast, NIC is e1000 with TX ring size set to
>>> 1024.  tps for the high priority RR is listed.
>>>
>>> No BQL, tso on: 3000-3200K bytes in queue, 36 tps
>>> BQL, tso on: 156-194K bytes in queue, 535 tps
>>
>>> No BQL, tso off: 453-454K bytes in queue, 234 tps
>>> BQL, tso off: 66K bytes in queue, 914 tps
>>
>>
>> Jeeze. Under what circumstances is tso a win? I've always
>> had great trouble with it, as some e1000 cards do it rather badly.

It is a win when one is sending bulk(ish) data and wishes to avoid the 
trips up and down the protocol stack, to save CPU cycles.

TSO is sometimes called "poor man's Jumbo Frames"  as it seeks to 
achieve the same goal - fewer trips down the protocol stack per KB of 
data transferred.
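
Back of the envelope (my arithmetic, not from Tom's numbers): with a 
1500-byte MTU and TCP timestamps the MSS is about 1448 bytes, so a 
single 64 KB TSO send replaces roughly 65536 / 1448, call it 45, 
separate trips down the stack with one.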

>> I assume these are while running at GigE speeds?
>>
>> What of 100Mbit? 10GigE? (I will duplicate your tests
>> at 100Mbit, but as for 10gigE...)
>>
>
> TSO on means a low priority 65 Kbyte packet can be in the TX ring right
> before the high priority packet. If you can't afford the delay, you lose.
>
> There is no mystery here.
>
> If you want low latencies:
> - TSO must be disabled so that packets are at most one Ethernet frame.
> - You adjust the BQL limit to a small value.
> - You can even lower the MTU to get even better latencies.
>
> If you want good throughput from your [10]GigE and low CPU cost, TSO
> should be enabled.

Outbound throughput. If you want good inbound throughput, you want GRO/LRO.
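
Eric's 65 Kbyte figure translates directly into wire time: 65535 bytes 
is roughly 524 kbits, so about 0.5 ms sitting in front of the 
high-priority packet at GigE, ~5 ms at 100 Mbit, and ~52 usec at 
10 GigE.

For anyone who wants to try the low-latency recipe, here is a rough 
userspace sketch. The interface name, the 3000-byte BQL cap and the 
576-byte MTU are arbitrary values for illustration, and error handling 
is omitted:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>

int main(void)
{
	int fd = socket(AF_INET, SOCK_DGRAM, 0);
	struct ifreq ifr;
	struct ethtool_value ev;
	FILE *f;

	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, "eth0", IFNAMSIZ - 1);

	/* 1. Turn TSO off so nothing larger than one frame lands in the ring. */
	ev.cmd = ETHTOOL_STSO;
	ev.data = 0;
	ifr.ifr_data = (char *)&ev;
	ioctl(fd, SIOCETHTOOL, &ifr);

	/* 2. Pin the BQL limit to a small byte count (one sysfs node per TX queue). */
	f = fopen("/sys/class/net/eth0/queues/tx-0/byte_queue_limits/limit_max", "w");
	if (f) {
		fprintf(f, "3000\n");
		fclose(f);
	}

	/* 3. Optionally shrink the MTU for still smaller per-frame delay. */
	ifr.ifr_mtu = 576;
	ioctl(fd, SIOCSIFMTU, &ifr);

	close(fd);
	return 0;
}

From a shell the same three steps are just "ethtool -K eth0 tso off", a 
write to that byte_queue_limits/limit_max sysfs node, and 
"ip link set eth0 mtu 576".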

> If you want to be smart, you could have a dynamic behavior:
>
> Leave TSO on as long as no high-priority, low-latency producer is running
> (if low latency packets are locally generated)

I'd probably leave that to the administrator rather than try to clutter 
things with additional logic.

*If* I were to add additional logic, I might have an interface 
communicate its "maximum TSO size" up the stack in a manner not too 
dissimilar from MTU.  That way one can control just how much time a 
TSO'd segment would consume.
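
A hypothetical sketch of what that could look like on the driver side 
(purely illustrative, not part of Tom's series; it leans on the 
existing gso_max_size field rather than inventing a new knob, and the 
16 KB cap is an arbitrary figure, roughly 130 usec of wire time at 
GigE):

#include <linux/netdevice.h>

#define FOO_TSO_MAX_BYTES	(16 * 1024)	/* ~130 usec on the wire at GigE */

/* Hypothetical driver hook: ask the stack not to hand this device
 * GSO/TSO packets larger than FOO_TSO_MAX_BYTES, bounding how long a
 * single segment can occupy the wire ahead of whatever is queued
 * behind it.
 */
static void foo_cap_tso(struct net_device *dev)
{
	netif_set_gso_max_size(dev, FOO_TSO_MAX_BYTES);
}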

rick jones