Date:	Sun, 1 Jun 2014 00:55:55 +0000
From:	"" <>
To:	Eric Dumazet <>
CC:	"" <>,
	"" <>,
	"" <>,
	"" <>,
	"" <>,
	"" <>
Subject: RE: [PATCH v1 6/6] net: fec: Add software TSO support

From: Eric Dumazet <> Date: Saturday, May 31, 2014 12:22 AM
>To: Duan Fugang-B38611
>Cc: Li Frank-B20596;; ezequiel.garcia@...e-
>Subject: RE: [PATCH v1 6/6] net: fec: Add software TSO support
>On Fri, 2014-05-30 at 07:16 +0000, wrote:
>> Yes, testing found it bounces all TX frames.
>> Using 2 descriptors to transfer one part brings more complexity
>> to the driver. Of course, performance must be better.
>How cpu handles misaligned 32bit accesses ?

Do you mean using an extra descriptor for the misaligned bytes, or attaching the misaligned bytes to the header descriptor?
I haven't implemented or tested that.
Since the CPU load is light, I want to keep the current method (one descriptor for the headers, one descriptor per MSS).
>> Digression information:
>> Imx6dl FEC HW have bandwidth issue limit to 400 ~ 700Mbps. Current
>performance with TSO is 506Mbps, cpu loading is about 40%.
>> Later chips with FEC IP support byte alignment, such as imx6sx. On
>> imx6sx FEC, no SW TSO, tx bandwidth is 840~870Mbps, cpu loading is 100%,
>After the software TSO, tx bandwidth is 840Mbps, cpu loading is 48%.
>Since you have some cpu cycles, have you tried to always use the bounce
>thing, using one descriptor per MSS, instead of two ?
>(headers + payload)
>This might help to get better bandwidth, by lowering overhead on DMA and
>on NIC.
The i.MX6DL itself has a HW bandwidth limitation, so reaching 506 Mbps is already quite impressive.

The i.MX6SX has no HW bandwidth limitation. The result above was measured against an FPGA board acting as the iperf server, whose RX speed may be the bottleneck.
So I connected to an Apple MacBook and tested again (with the patches applied to our internal 3.10.31 kernel):
Highmem disabled: TX bandwidth 942 Mbps, CPU load 65%.
Highmem enabled: TX bandwidth 930 Mbps, CPU load 100%.
=> I don't know why enabling the kernel highmem config causes such a large efficiency drop.

Regarding your suggestion above, "using one descriptor per MSS, instead of two":
Yes, that is exactly what we do for the i.MX6DL SoC. The i.MX6SX FEC supports byte alignment, so it also uses one descriptor per MSS.

Thanks for your suggestions and responses. Do you know why highmem causes such a performance drop with SW TSO?

