[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87h7jyxrgv.ffs@nanos.tec.linutronix.de>
Date: Thu, 22 Apr 2021 17:35:12 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Lorenzo Colitti <lorenzo@...gle.com>
Cc: Greg KH <gregkh@...uxfoundation.org>,
Maciej Żenczykowski
<zenczykowski@...il.com>, Ingo Molnar <mingo@...nel.org>,
Anna-Maria Behnsen <anna-maria@...utronix.de>,
lkml <linux-kernel@...r.kernel.org>,
mikael.beckius@...driver.com,
Maciej Żenczykowski <maze@...gle.com>,
Will Deacon <will@...nel.org>
Subject: Re: [PATCH] hrtimer: Update softirq_expires_next correctly after __hrtimer_get_next_event()
On Thu, Apr 22 2021 at 23:20, Lorenzo Colitti wrote:
> On Thu, Apr 22, 2021 at 9:08 AM Thomas Gleixner <tglx@...utronix.de> wrote:
>> Just for comparison. In a VM I'm experimenting with right now the
>> reprogramming time is ~500ns which is still a lot of cycles, but
>> compared to 5us faster by an order of magnitude. And on the sane
>> machine bare metal its way faster and therefore less noticeable.
>
> FWIW, on this hardware, frtrace says that arming the arm64 architected
> timer takes 0.7us. Definitely better than 2-3us, but still not free.
> This is not a high-end desktop or server, but it's also not super
> slow, low-power hardware.
>
>> * The transmit should only be run if no skb data has been sent for a
>> * certain duration.
>>
>> which is useless word salad.
>
> You're the one who wrote that comment - see b1a31a5f5f27. You'll
> forgive me for being amused. :-)
Rightfully so! I still call it word salat :)
> Thanks for the history/analysis/suggestions. I think it's a fact that
> this is a regression in performance: this particular code has
> performed well for a couple of years now. The fact that the good
> performance only existed due to a correctness bug in the hrtimer code
> definitely does make it harder to argue that the regression should be
> reverted.
We tend to disagree about the naming conventions here, but we seem at
least to agree that reverting a fix for a correctness bug (which has way
worse implications than slowing down a gruesome driver) is not going to
happen.
> That said: if you have a fix for the double reprogram, then that fix
> should probably be applied? 0.5us is not free, and even if hrtimers
> aren't designed for frequent updates, touching the hardware twice as
> often does seem like a bad idea, since, as you point out, there's a
> *lot* of hardware that is slow.
That's an obvious improvement, but not a fix. And I checked quite some
hrtimer users and there are only a few which ever rearm an queued timer
and that happens infrequently.
> Separately, we're also going to look at making ncm better. (In defense
> of the original author, in 2014 I don't think anyone would even have
> dreamed of USB being fast enough for this to be a problem.) The first
> thing we're going to try to do is set the timer once per NTB instead
> of once per packet (so, 10x less). My initial attempt to do that
> causes the link to die after a while and I need to figure out why
> before I can send a patch up. I'm suspicious of the threading, which
> uses non-atomic variables (timer_force_tx, ncm->skb_tx_data) to
> synchronize control flow between the timer and the transmit function,
> which can presumably run on different CPUs. That seems wrong since
> either core could observe stale variables. But perhaps there are
> memory barriers that I'm not aware of.
Not that I see any.
> The idea of getting rid of the timer by doing aggregation based on
> transmit queue lengths seems like a much larger effort, but probably
> one that is required to actually improve performance substantially
> beyond what it is now.
I don't think it's a huge effort. netdev_xmit_more() should tell you
what you need to know.
Thanks,
tglx
Powered by blists - more mailing lists