[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a583c9fae69a4b2db8ddd70ed2c086c11456871a.camel@axis.com>
Date: Fri, 1 Sep 2023 11:31:59 +0000
From: Vincent Whitchurch <Vincent.Whitchurch@...s.com>
To: "nbd@....name" <nbd@....name>, "joabreu@...opsys.com"
<joabreu@...opsys.com>, Vincent Whitchurch <Vincent.Whitchurch@...s.com>,
"kuba@...nel.org" <kuba@...nel.org>, "davem@...emloft.net"
<davem@...emloft.net>, "alexandre.torgue@...com" <alexandre.torgue@...com>,
"peppe.cavallaro@...com" <peppe.cavallaro@...com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>, kernel
<kernel@...s.com>
Subject: Re: [PATCH net] net: stmmac: Use hrtimer for TX coalescing
On Wed, 2023-08-30 at 23:06 +0200, Felix Fietkau wrote:
> On 30.08.23 16:55, Vincent Whitchurch wrote:
> > I looked at it some more and the continuous postponing behaviour strikes
> > me as quite odd. For example, if you set tx-frames coalescing to 0 then
> > cleanups could happen much later than the specified tx-usecs period, in
> > the absence of RX traffic. Also, if we'd have to have a shared
> > timestamp between the callers of stmmac_tx_timer_arm() and the hrtimer
> > to preserve this continuous postponing behaviour, then we'd need to
> > introduce some locking between the timer expiry and those functions, to
> > avoid race conditions.
>
> I just spent some time digging through the history of the timer code,
> figuring out the intention behind the continuous postponing behavior.
>
> It's an interrupt mitigation scheme where DMA descriptors are configured
> to only generate a completion event every 25 packets, and the only
> purpose of the timer is to avoid keeping packets in the queue for too
> long after tx activity has stopped.
> Based on that design, I believe that the continuous postponing actually
> makes sense and the patches that eliminate it are misguided. When there
> is constant activity, there will be tx completion interrupts that
> trigger cleanup.
The tx-frames value (25) can be controlled from user space. If it is
set to zero then the driver should still coalesce interrupts based on
the interval specified in the tx-usecs setting, but the driver fails to
do so because it keeps postponing the cleanup and increasing the latency
of the cleanups far beyond the programmed period.
> That said, I did even more digging and I found out that the timer code
> was added at a time when the driver didn't even disable tx and rx
> interrupts individually, which means that it could not take advantage of
> interrupt mitigation via NAPI scheduling + IRQ disable/enable.
>
> I have a hunch that given the changes made to the driver over time, the
> timer based interrupt mitigation strategy might just be completely
> useless and actively harmful now. It certainly messes with things like
> TSQ and BQL in a nasty way.
>
> I suspect that the best and easiest way to deal with this issue is to
> simply rip out all that timer nonsense, rely on tx IRQs entirely and
> just let NAPI do its thing.
If you want an interrupt for every packet, you can turn off coalescing
by setting tx-frames to 1 and tx-usecs 0. Currently, the driver does
not turn off the timer even if tx-usecs is set to zero, but that is
trivially fixed. With such a fix and that setting, the result is a 30x
increase in the number of interrupts in a tx-only test.
And tx-frames 25 tx-usecs 0 is of course also bad for throughput since
cleanups will not happen when fewer than 25 frames are transmitted.
Powered by blists - more mailing lists