Message-ID: <87h7jyxrgv.ffs@nanos.tec.linutronix.de>
Date:   Thu, 22 Apr 2021 17:35:12 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Lorenzo Colitti <lorenzo@...gle.com>
Cc:     Greg KH <gregkh@...uxfoundation.org>,
        Maciej Żenczykowski 
        <zenczykowski@...il.com>, Ingo Molnar <mingo@...nel.org>,
        Anna-Maria Behnsen <anna-maria@...utronix.de>,
        lkml <linux-kernel@...r.kernel.org>,
        mikael.beckius@...driver.com,
        Maciej Żenczykowski <maze@...gle.com>,
        Will Deacon <will@...nel.org>
Subject: Re: [PATCH] hrtimer: Update softirq_expires_next correctly after __hrtimer_get_next_event()

On Thu, Apr 22 2021 at 23:20, Lorenzo Colitti wrote:

> On Thu, Apr 22, 2021 at 9:08 AM Thomas Gleixner <tglx@...utronix.de> wrote:
>> Just for comparison. In a VM I'm experimenting with right now the
>> reprogramming time is ~500ns which is still a lot of cycles, but
>> compared to 5us it is faster by an order of magnitude. And on the sane
>> machine, bare metal, it's way faster and therefore less noticeable.
>
> FWIW, on this hardware, ftrace says that arming the arm64 architected
> timer takes 0.7us. Definitely better than 2-3us, but still not free.
> This is not a high-end desktop or server, but it's also not super
> slow, low-power hardware.
>
>>  * The transmit should only be run if no skb data has been sent for a
>>  * certain duration.
>>
>> which is useless word salad.
>
> You're the one who wrote that comment - see b1a31a5f5f27. You'll
> forgive me for being amused. :-)

Rightfully so! I still call it word salad :)

> Thanks for the history/analysis/suggestions. I think it's a fact that
> this is a regression in performance: this particular code has
> performed well for a couple of years now. The fact that the good
> performance only existed due to a correctness bug in the hrtimer code
> definitely does make it harder to argue that the regression should be
> reverted.

We may disagree about the naming here, but at least we seem to agree
that reverting a fix for a correctness bug (which has far worse
implications than slowing down a gruesome driver) is not going to
happen.

> That said: if you have a fix for the double reprogram, then that fix
> should probably be applied? 0.5us is not free, and even if hrtimers
> aren't designed for frequent updates, touching the hardware twice as
> often does seem like a bad idea, since, as you point out, there's a
> *lot* of hardware that is slow.

That's an obvious improvement, but not a fix. I checked quite a few
hrtimer users, and only a handful of them ever rearm an already queued
timer, and that happens infrequently.

> Separately, we're also going to look at making ncm better. (In defense
> of the original author, in 2014 I don't think anyone would even have
> dreamed of USB being fast enough for this to be a problem.) The first
> thing we're going to try to do is set the timer once per NTB instead
> of once per packet (so, 10x less). My initial attempt to do that
> causes the link to die after a while and I need to figure out why
> before I can send a patch up. I'm suspicious of the threading, which
> uses non-atomic variables (timer_force_tx, ncm->skb_tx_data) to
> synchronize control flow between the timer and the transmit function,
> which can presumably run on different CPUs. That seems wrong since
> either core could observe stale variables. But perhaps there are
> memory barriers that I'm not aware of.

Not that I see any.
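
If those really are plain variables, one conventional way to make the
cross-CPU handoff explicit would be an atomic bitop, along the lines
of the sketch below. Illustration only: the names are hypothetical
stand-ins, not the actual driver fields.

#include <linux/bitops.h>
#include <linux/types.h>

#define NCM_FORCE_TX	0			/* hypothetical flag bit */

static unsigned long ncm_tx_flags;		/* stand-in for timer_force_tx */

/* hrtimer callback, possibly running on another CPU */
static void tx_timeout_set_flag(void)
{
	set_bit(NCM_FORCE_TX, &ncm_tx_flags);	/* atomic set */
}

/* transmit path: consume the request exactly once */
static bool tx_must_flush(void)
{
	/*
	 * test_and_clear_bit() is an atomic RMW and is fully ordered when
	 * it actually clears the bit, so this side cannot act on a stale
	 * value the way a plain bool load could.
	 */
	return test_and_clear_bit(NCM_FORCE_TX, &ncm_tx_flags);
}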

> The idea of getting rid of the timer by doing aggregation based on
> transmit queue lengths seems like a much larger effort, but probably
> one that is required to actually improve performance substantially
> beyond what it is now.

I don't think it's a huge effort. netdev_xmit_more() should tell you
what you need to know.
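
Roughly, a sketch of what that could look like in the gadget's
ndo_start_xmit; the ncm_* helpers below are hypothetical stand-ins for
the NTB aggregation and USB submission paths, not existing functions.

#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* hypothetical helpers standing in for the real NTB handling */
static void ncm_queue_to_ntb(struct net_device *dev, struct sk_buff *skb);
static bool ncm_ntb_full(struct net_device *dev);
static void ncm_kick_ntb(struct net_device *dev);

static netdev_tx_t ncm_sketch_start_xmit(struct sk_buff *skb,
					 struct net_device *dev)
{
	ncm_queue_to_ntb(dev, skb);	/* aggregate into the current NTB */

	/*
	 * netdev_xmit_more() is true while the core still has packets
	 * queued for this device in the same xmit batch, so the NTB can
	 * keep filling without arming a timer; flush when the batch ends
	 * or the NTB is full.
	 */
	if (!netdev_xmit_more() || ncm_ntb_full(dev))
		ncm_kick_ntb(dev);

	return NETDEV_TX_OK;
}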

Thanks,

        tglx
