linux-kernel - Re: [RESEND RFC PATCH V3] sched: Improve scalability of select_idle

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:   Fri, 2 Feb 2018 13:34:58 -0500
From:   Steven Sistare <steven.sistare@...cle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     subhra mazumdar <subhra.mazumdar@...cle.com>,
        linux-kernel@...r.kernel.org, mingo@...hat.com,
        dhaval.giani@...cle.com
Subject: Re: [RESEND RFC PATCH V3] sched: Improve scalability of
 select_idle_sibling using SMT balance

On 2/2/2018 12:39 PM, Steven Sistare wrote:
> On 2/2/2018 12:21 PM, Peter Zijlstra wrote:
>> On Fri, Feb 02, 2018 at 11:53:40AM -0500, Steven Sistare wrote:
>>> It might be interesting to add a tunable for the number of random choices to
>>> make, and clamp it at the max nr computed from avg_cost in select_idle_cpu.
>>
>> This needs a fairly complicated PRNG for it would need to visit each
>> possible CPU once before looping. A LFSR does that, but requires 2^n-1
>> elements and we have topology masks that don't match that.. The trivial
>> example is something with 6 cores.
> 
> Or keep it simple and accept the possibility of choosing the same candidate
> more than once.
> 
>>> Or, choose a random starting point and then search for nr sequential 
>>> candidates; possibly limited by a tunable.
>>
>> And this is basically what we already do. Except with the task-cpu
>> instead of a per-cpu rotor.
> 
> Righto.  Disregard this suggestion.

Actually, I take back my take back.  I suspect the primary benefit
of random selection is that it breaks up resonance states where
CPUs that are busy tend to stay busy, and CPUs that are idle tend
to stay idle, which is reinforced by starting the search at target =
last cpu ran.

Or, a quantitative argument: if sampling a single random CPU
gives better results (and the data says it does), then sampling
a random cpu and searching nr from it should give better results,
since it has nr - 1 more chances to find an idle CPU.

- Steve