Message-ID: <b5c16b8cd7874a8a979c3965984f768c@BLUPR03MB373.namprd03.prod.outlook.com>
Date: Fri, 30 May 2014 07:16:12 +0000
From: "fugang.duan@...escale.com" <fugang.duan@...escale.com>
To: Eric Dumazet <eric.dumazet@...il.com>
CC: "Frank.Li@...escale.com" <Frank.Li@...escale.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"ezequiel.garcia@...e-electrons.com"
<ezequiel.garcia@...e-electrons.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"shawn.guo@...aro.org" <shawn.guo@...aro.org>,
"bhutchings@...arflare.com" <bhutchings@...arflare.com>,
"stephen@...workplumber.org" <stephen@...workplumber.org>
Subject: RE: [PATCH v1 6/6] net: fec: Add software TSO support
From: Eric Dumazet <eric.dumazet@...il.com> Date: Friday, May 30, 2014 2:30 PM
>To: Duan Fugang-B38611
>Cc: Li Frank-B20596; davem@...emloft.net; ezequiel.garcia@...e-
>electrons.com; netdev@...r.kernel.org; shawn.guo@...aro.org;
>bhutchings@...arflare.com; stephen@...workplumber.org
>Subject: Re: [PATCH v1 6/6] net: fec: Add software TSO support
>
>On Fri, 2014-05-30 at 10:05 +0800, Fugang Duan wrote:
>
>> + if (((unsigned long) data) & FEC_ALIGNMENT) {
>> + memcpy(fep->tx_bounce[index], data, size);
>> + data = fep->tx_bounce[index];
>> + }
>
>Now you have SG support, maybe you could avoid copying the whole part, and
>only copy the beginning to reach the required alignment.
>
>Not sure its a win, as it requires 2 descriptors instead of one, and
>tso_count_descs() would have to be changed as well.
>
>Do you have an idea of how often this bouncing happens for normal
>workloads (ie not synthetic benchmarks) ?
>
>Even for non TSO frames, we have an 32bit aligned IP header, so the
>Ethernet header is not aligned to a 4 bytes boundary. I suspect this
>driver had to bounce all TX frames ?
>
Yes, testing showed that it bounces all TX frames.
Using two descriptors to transfer one buffer would make the driver more complicated. Of course,
performance should be better.
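To illustrate why every frame ends up bounced, and what the partial copy would involve, here is a minimal user-space sketch. It assumes a 16-byte alignment requirement (mask 0xf); the actual FEC_ALIGNMENT value depends on the platform, and the EXAMPLE_* names and helper functions are hypothetical, not taken from the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: buffers must start on a 16-byte boundary (mask 0xf). */
#define EXAMPLE_ALIGN_MASK 0xfUL
/* Length of an Ethernet header without VLAN tag. */
#define EXAMPLE_ETH_HLEN 14

/* Address of the Ethernet header that precedes a given IP header. */
static uintptr_t eth_hdr_addr(uintptr_t ip_hdr_addr)
{
	return ip_hdr_addr - EXAMPLE_ETH_HLEN;
}

/* Nonzero when the buffer start violates the alignment requirement. */
static int needs_bounce(uintptr_t addr)
{
	return (addr & EXAMPLE_ALIGN_MASK) != 0;
}

/*
 * Number of leading bytes that would have to be copied so that the
 * rest of the buffer starts on an aligned address. Capped at size,
 * because a short buffer may end before the next boundary.
 */
static size_t head_bytes_to_align(uintptr_t addr, size_t size)
{
	size_t head = (size_t)((EXAMPLE_ALIGN_MASK + 1 -
				(addr & EXAMPLE_ALIGN_MASK)) &
			       EXAMPLE_ALIGN_MASK);

	return head < size ? head : size;
}
```

With a 4-byte aligned IP header, the Ethernet header starts 14 bytes earlier, so the frame address is always 2 modulo 4 and needs_bounce() is true for every frame. In the two-descriptor scheme, only head_bytes_to_align() bytes would go through the bounce buffer and the second descriptor would point at the aligned remainder.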
As a digression:
The imx6dl FEC HW has a bandwidth issue that limits it to 400 ~ 700Mbps. Current performance with TSO is 506Mbps, with CPU loading at about 40%.
Later chips, such as imx6sx, have a FEC IP that supports byte alignment. On the imx6sx FEC without SW TSO, tx bandwidth is 840~870Mbps and CPU loading is 100%.
With software TSO, tx bandwidth is 840Mbps and CPU loading is 48%.
>I am wondering if most part of the TSO gain you have comes from this
>alignment problem you had before this patch and the SG one.
>
>It looks like you could tweak tcp_sendmsg() to make sure a fragment always
>start at a 16 bytes boundary or something...
>
>It should not really matter with iperf because it naturally generates
>aligned fragments (A new page starts with offset=0 and iperf uses 128KB
>writes...)
>
>diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
>index eb1dde37e678..be99af2d54e6 100644
>--- a/net/ipv4/tcp.c
>+++ b/net/ipv4/tcp.c
>@@ -1220,6 +1220,7 @@ new_segment:
> merge = false;
> }
>
>+ pfrag->offset = ALIGN(pfrag->offset, 16);
> copy = min_t(int, copy, pfrag->size - pfrag->offset);
>
> if (!sk_wmem_schedule(sk, copy))
>
>
>
The solution that tweaks tcp_sendmsg() is better, since it doesn't have any negative impact.
I don't know whether everybody would agree to this change?
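For reference, the effect of the proposed one-line change can be sketched in user space as below. EXAMPLE_ALIGN is a stand-in for the kernel's ALIGN() macro, and struct example_frag and aligned_copy_len() are hypothetical simplifications for illustration, not the kernel's actual page fragment type or code path.

```c
#include <stddef.h>

/* User-space stand-in for the kernel's ALIGN(); 'a' must be a power of two. */
#define EXAMPLE_ALIGN(x, a) \
	(((x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/* Hypothetical, simplified stand-in for the page fragment state. */
struct example_frag {
	size_t offset;	/* next free byte in the fragment */
	size_t size;	/* total size of the fragment */
};

/*
 * Round the fragment offset up to a 16-byte boundary, then return how
 * many of the 'want' bytes fit in the space that is left.
 */
static size_t aligned_copy_len(struct example_frag *frag, size_t want)
{
	size_t avail;

	frag->offset = EXAMPLE_ALIGN(frag->offset, 16);
	avail = frag->offset < frag->size ? frag->size - frag->offset : 0;

	return want < avail ? want : avail;
}
```

Rounding up skips at most 15 bytes of the fragment per call, so each new piece of payload starts on a 16-byte boundary within the page.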
Thanks,
Andy