Message-ID: <051a4ae9-cae9-9ff1-d732-62cf8751dafd@joelfernandes.org>
Date: Thu, 15 Sep 2022 23:40:08 -0400
From: Joel Fernandes <joel@...lfernandes.org>
To: Peter Zijlstra <peterz@...radead.org>,
"Paul E. McKenney" <paulmck@...nel.org>
Cc: Frederic Weisbecker <fweisbec@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
linux-kernel@...r.kernel.org, Boqun Feng <boqun.feng@...il.com>,
"Rafael J. Wysocki" <rjw@...ysocki.net>
Subject: Re: RCU vs NOHZ
On 9/15/2022 6:30 PM, Peter Zijlstra wrote:
>>>>> These deep idle states are only feasible during NOHZ idle, and the NOHZ
>>>>> path is already relatively expensive (which is offset by then mostly
>>>>> staying idle for a long while).
>>>>>
>>>>> Specifically my thinking was that when a CPU goes NOHZ it can splice
>>>>> its callback list onto a global list (cmpxchg), and then the
>>>>> jiffy-updater CPU can look at and consume this global list (xchg).
>>>>>
>>>>> Before you say... but globals suck (they do), NOHZ already has a fair
>>>>> amount of global state, and as said before, it's offset by the CPU then
>>>>> staying idle for a fair while. If there is heavy contention on the NOHZ
>>>>> data, the idle governor is doing a bad job by selecting deep idle states
>>>>> whilst we're not actually idle for long.
>>>>>
>>>>> The above would remove the reason for RCU to inhibit NOHZ.
>>>>>
>>>>>
>>>>> Additionally; when the very last CPU goes idle (I think we know this
>>>>> somewhere, but I can't readily remember where) we can insta-advance the
>>>>> QS machinery and run the callbacks before going (NOHZ) idle.
>>>>>
>>>>>
>>>>> Is there a reason this couldn't work? To me this seems like a much
>>>>> simpler solution than the whole rcu-cb thing.
>>>> To restate Joel's reply a bit...
>>>>
>>>> Maybe.
>>>>
>>>> Except that we need rcu_nocbs anyway for low latency and HPC applications.
>>>> Given that we have it, and given that it totally eliminates RCU-induced
>>>> idle ticks, how would it help to add cmpxchg-based global offloading?
>>> Because that nocb stuff isn't default enabled?
>> Last I checked, both RHEL and Fedora were built with CONFIG_RCU_NOCB_CPU=y.
>> And I checked Fedora just now.
>>
>> Or am I missing your point?
> I might be missing the point; but why did Joel have a talk if it's all
> default on?
It was not default-on until recently for Intel ChromeOS devices. Also, my talk
is about much more than just idle ticks/NOCB. I am talking about delaying grace
periods for long amounts of time using various techniques, with data showing
the improvements, and a new tool, rcutop, to show the problems :-) The
discussion of ticks disturbing idle was more background information for the
audience (sorry if I was not clear about that).
thanks,
- Joel
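
[Editor's note: a rough userspace sketch of the splice/consume scheme Peter
describes above, for illustration only. The struct and function names are
hypothetical, not actual kernel code; in the kernel this pattern corresponds
to llist-style batch operations (llist_add_batch()/llist_del_all()), which
are built on cmpxchg()/xchg() exactly as described.]

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical callback node; illustrative only. */
struct cb {
	struct cb *next;
	int id;
};

/* Global list that CPUs going NOHZ-idle splice their callbacks onto. */
static _Atomic(struct cb *) global_cbs = NULL;

/*
 * A CPU entering NOHZ idle splices its whole local list (head..tail)
 * onto the global list in one cmpxchg loop: point the local tail at
 * the current global head, then swing the global head to the local head.
 */
static void splice_to_global(struct cb *head, struct cb *tail)
{
	struct cb *old = atomic_load(&global_cbs);

	do {
		tail->next = old;
	} while (!atomic_compare_exchange_weak(&global_cbs, &old, head));
}

/*
 * The jiffy-updater CPU consumes the entire global list at once with a
 * single xchg, leaving an empty list behind for the next splicers.
 */
static struct cb *consume_global(void)
{
	return atomic_exchange(&global_cbs, NULL);
}
```

One detail worth noting: because each splice pushes in front of the previous
one, the consumer sees batches in reverse splice order, which is fine for RCU
callbacks since ordering across CPUs is not required.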