Message-ID: <3bacf19a-ae05-0410-f276-2b928b826af7@gmail.com>
Date: Mon, 26 Mar 2018 15:29:08 -0700
From: Florian Fainelli <f.fainelli@...il.com>
To: Tal Gilboa <talgi@...lanox.com>, netdev@...r.kernel.org
Cc: davem@...emloft.net, jaedon.shin@...il.com, pgynther@...gle.com,
opendmb@...il.com, michael.chan@...adcom.com, gospo@...adcom.com,
saeedm@...lanox.com
Subject: Re: [PATCH net-next 0/2] net: broadcom: Adaptive interrupt coalescing
On 03/26/2018 03:04 PM, Florian Fainelli wrote:
> On 03/26/2018 02:16 PM, Tal Gilboa wrote:
>> On 3/23/2018 4:19 AM, Florian Fainelli wrote:
>>> Hi all,
>>>
>>> This patch series adds adaptive interrupt coalescing for the Gigabit
>>> Ethernet drivers SYSTEMPORT and GENET.
>>>
>>> This really helps lower the interrupt count and system load, as
>>> measured by vmstat for a Gigabit TCP RX session:
>>
>> I don't see an improvement in system load, rather the opposite: 42% vs. 100%
>> for SYSTEMPORT and 85% vs. 100% for GENET, both with the same bandwidth.
>
> Looks like I did not extract the correct data; the load could spike in
> both cases (with and without net_dim) up to 100, but averaged over the
> transmission I see the following:
>
> GENET without:
> 1 0 0 1169568 0 25556 0 0 0 0 130079 62795 2 86 13 0 0
>
> GENET with:
> 1 0 0 1169536 0 25556 0 0 0 0 10566 10869 1 21 78 0 0
>
>> Am I missing something? Talking about bandwidth, I would expect 941Mb/s
>> (assuming this is TCP over IPv4). Do you know why the reduced interrupt
>> rate doesn't improve bandwidth?
>
> I am assuming that this comes down to latency; I am still capturing some
> pcap files to analyze the TCP session with wireshark and see if that is
> indeed what is going on. The test machine is actually not that great.
>
>> Also, any effect on the client side (you
>> mentioned enabling TX moderation for SYSTEMPORT)?
>
> Yes, on SYSTEMPORT, which is the TCP IPv4 client here, I have the following:
>
> SYSTEMPORT without:
> 2 0 0 191428 0 25748 0 0 0 0 86254 264 0 41 59 0 0
>
> SYSTEMPORT with:
> 3 0 0 190176 0 25748 0 0 0 0 45485 31332 0 100 0 0 0
>
> I can't get top to agree with these load results, though; it looks like
> we just have the CPU spinning more, which does not look like a win.
The problem appears to be the timeout selection on TX: ignoring it
completely allows us to keep the load average down while maintaining the
bandwidth. It looks like NAPI on TX already does a good job of batching
completions, so interrupt mitigation on TX is not such a great idea after
all...
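
For reference, here is a rough sketch of the RX-only hookup I mean, i.e.
feeding net_dim from the RX NAPI poll and leaving TX alone. This assumes
the net_dim_sample()/net_dim() helpers from include/linux/net_dim.h; the
structure and function names (ring_priv, process_rx(), the counters) are
made up for illustration and are not the actual SYSTEMPORT/GENET code:

#include <linux/netdevice.h>
#include <linux/net_dim.h>

struct ring_priv {
	struct napi_struct napi;
	struct net_dim dim;
	u16 interrupts;		/* incremented in the RX interrupt handler */
	u64 rx_packets;
	u64 rx_bytes;
};

static unsigned int process_rx(struct ring_priv *priv, int budget);
static void ring_enable_rx_irq(struct ring_priv *priv);

static int example_rx_poll(struct napi_struct *napi, int budget)
{
	struct ring_priv *priv = container_of(napi, struct ring_priv, napi);
	struct net_dim_sample dim_sample;
	unsigned int work_done;

	work_done = process_rx(priv, budget);

	if (work_done < budget && napi_complete_done(napi, work_done)) {
		/* Let net_dim pick an RX coalescing profile from the
		 * observed interrupt/packet/byte rates; it schedules its
		 * work item when the profile should change. TX is left
		 * alone on purpose.
		 */
		net_dim_sample(priv->interrupts, priv->rx_packets,
			       priv->rx_bytes, &dim_sample);
		net_dim(&priv->dim, dim_sample);
		ring_enable_rx_irq(priv);
	}

	return work_done;
}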
Also, doing UDP TX tests shows that we can lower the interrupt count by
setting an appropriate tx-frames (as expected), but we won't be lowering
the CPU load, since that is inherently CPU-intensive work. Past
tx-frames=64 the bandwidth drops off completely, because that is half of
the ring size.
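
To make the ring-size point concrete, a driver-side sanity check along
these lines would catch it (purely illustrative; TX_RING_SIZE=128 is only
inferred from "64 is half of the ring size" above, and neither driver
necessarily rejects the setting this way):

#include <linux/ethtool.h>
#include <linux/netdevice.h>

#define TX_RING_SIZE	128	/* assumed: tx-frames=64 is half the ring */

/* If the hardware only raises a TX completion interrupt every tx-frames
 * packets, letting that threshold approach the ring size means the ring
 * can fill up with unreclaimed descriptors before the interrupt ever
 * fires, and throughput collapses.
 */
static int example_set_coalesce(struct net_device *dev,
				struct ethtool_coalesce *ec)
{
	if (ec->tx_max_coalesced_frames >= TX_RING_SIZE / 2)
		return -EINVAL;

	/* ... program ec->tx_max_coalesced_frames into the MAC ... */
	return 0;
}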
--
Florian