Message-ID: <732f3c01-a36f-4c9b-8273-a55aba9094d8@nbd.name>
Date: Wed, 23 Aug 2023 22:18:33 +0200
From: Felix Fietkau <nbd@....name>
To: Vincent Whitchurch <vincent.whitchurch@...s.com>, peppe.cavallaro@...com,
alexandre.torgue@...com, joabreu@...opsys.com, davem@...emloft.net,
kuba@...nel.org
Cc: kernel@...s.com, netdev@...r.kernel.org
Subject: Re: [PATCH net] net: stmmac: Use hrtimer for TX coalescing
On 20.11.20 16:02, Vincent Whitchurch wrote:
> This driver uses a normal timer for TX coalescing, which means that
> with the default tx-usecs of 1000 microseconds the cleanups actually
> happen 10 ms or more later with HZ=100. This leads to very low
> throughput with TCP when bridged to a slow link such as a 4G modem. Fix
> this by using an hrtimer instead.
>
> On my ARM platform with HZ=100 and the default TX coalescing settings
> (tx-frames 25 tx-usecs 1000), with "tc qdisc add dev eth0 root netem
> delay 60ms 40ms rate 50Mbit" run on the server, netperf's TCP_STREAM
> improves from ~5.5 Mbps to ~100 Mbps.
>
> Signed-off-by: Vincent Whitchurch <vincent.whitchurch@...s.com>
Based on tests by OpenWrt users, this patch seems to cause a significant
performance regression: lots of CPU cycles are wasted re-arming the
hrtimer on every single packet. More info:
https://github.com/openwrt/openwrt/issues/11676#issuecomment-1690492666
My suggestion for fixing this properly would be:
- keep a separate timestamp for the last tx packet
- do not modify the timer if it's already scheduled
- in the timer function, check the last tx timestamp and re-arm the
  timer if necessary.
This should significantly reduce the number of wasted CPU cycles, even
when accounting for the additional overhead of hrtimer vs regular timer.
- Felix
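The three steps suggested above can be sketched as a small user-space
model. This is only an illustration of the idea, not code from the stmmac
driver or from any posted patch: the struct tx_coal, its fields, and the
tx_coal_* functions are hypothetical names, time is passed in as plain
nanosecond values, and the "armed" flag stands in for whatever timer state
a real driver would query. The arm_count field counts how often the timer
is actually programmed, which is the cost the suggestion tries to reduce.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one TX queue's coalescing timer state. */
struct tx_coal {
	uint64_t coal_ns;      /* coalescing delay (tx-usecs expressed in ns) */
	uint64_t last_tx_ns;   /* timestamp of the most recent TX packet */
	uint64_t expires_ns;   /* expiry time, meaningful only while armed */
	bool armed;            /* true while the timer is scheduled */
	unsigned int arm_count;   /* how many times the timer was programmed */
	unsigned int clean_count; /* how many TX cleanups were run */
};

/* Program the (modelled) timer; this is the expensive operation. */
static void tx_coal_arm(struct tx_coal *c, uint64_t expires_ns)
{
	c->expires_ns = expires_ns;
	c->armed = true;
	c->arm_count++;
}

/*
 * Transmit path: always record the timestamp, but leave the timer
 * alone if it is already scheduled.
 */
static void tx_coal_on_xmit(struct tx_coal *c, uint64_t now_ns)
{
	c->last_tx_ns = now_ns;
	if (!c->armed)
		tx_coal_arm(c, now_ns + c->coal_ns);
}

/*
 * Timer callback: if a packet was sent less than coal_ns ago, re-arm
 * for the remaining time; otherwise run the cleanup.
 * Returns true if the timer was re-armed, false if cleanup ran.
 */
static bool tx_coal_on_timer(struct tx_coal *c, uint64_t now_ns)
{
	uint64_t deadline = c->last_tx_ns + c->coal_ns;

	assert(c->armed && now_ns >= c->expires_ns);
	c->armed = false;

	if (deadline > now_ns) {
		tx_coal_arm(c, deadline);
		return true;
	}

	c->clean_count++; /* stands in for the real TX cleanup work */
	return false;
}
```

In this model, a burst of packets programs the timer once when the burst
starts and then at most once per coalescing period, instead of once per
packet. The transmit path only pays for a timestamp store and a flag check.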