linux-kernel - Re: [SchedulerWakeupLatency] Skipping Idle Cores and CPU Search


Message-ID: <c1b24dd5-dce9-61ed-baba-a70f08276bf5@arm.com>
Date:   Mon, 20 Jul 2020 10:47:42 +0200
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     chris hyser <chris.hyser@...cle.com>,
        Parth Shah <parth@...ux.ibm.com>,
        Patrick Bellasi <patrick.bellasi@...bug.net>,
        LKML <linux-kernel@...r.kernel.org>
Cc:     Ingo Molnar <mingo@...hat.com>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Paul Turner <pjt@...gle.com>, Ben Segall <bsegall@...gle.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Jonathan Corbet <corbet@....net>,
        Dhaval Giani <dhaval.giani@...cle.com>,
        Josef Bacik <jbacik@...com>
Subject: Re: [SchedulerWakeupLatency] Skipping Idle Cores and CPU Search

On 10/07/2020 01:08, chris hyser wrote:

[...]

>> D) Desired behavior:
> 
> Reduce the maximum wake-up latency of designated CFS tasks by skipping
> some or all of the idle CPU and core searches by setting a maximum idle
> CPU search value (maximum loop iterations).
> 
> Searching 'ALL' as the maximum would be the default and implies the
> current code path which may or may not search up to ALL. Searching 0
> would result in the least latency (shown with experimental results to be
> included if/when patchset goes up). One of the considerations is that
> the maximum length of the search is a function of the size of the LLC
> scheduling domain and this is platform dependent. Whether 'some', i.e. a
> numerical value limiting the search can be used to "normalize" this
> latency across differing scheduling domain sizes is under investigation.
> Clearly differing hardware will have many other significant differences,
> but in different sized and dynamically sized VMs running on fleets of
> common HW this may be interesting.

I assume that this task-specific feature could coexist in
select_idle_core() and select_idle_cpu() with the already existing
runtime heuristics (test_idle_cores() and the two sched features
mentioned under E/F) to reduce the idle CPU search space on a busy system.
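
To make the semantics proposed in D) concrete, here is a minimal user-space
sketch of a search loop capped at a maximum number of iterations. It is
only an illustration, not kernel code: bounded_idle_search(), SEARCH_ALL
and the cpu_idle[] array are hypothetical names, and the real
select_idle_cpu() walks a cpumask of the LLC domain rather than a plain
array.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical default: 'ALL', i.e. no task-specific limit on the search. */
#define SEARCH_ALL INT_MAX

/*
 * Hypothetical model of a bounded idle CPU search.
 *
 * Scans at most max_search CPUs, starting at 'start' and wrapping around.
 * Returns the first idle CPU found, or -1 if the budget runs out or no
 * CPU in the domain is idle. max_search == 0 skips the search entirely.
 */
static int bounded_idle_search(const bool *cpu_idle, int nr_cpus,
                               int start, int max_search)
{
        int checked = 0;

        for (int i = 0; i < nr_cpus; i++) {
                int cpu;

                if (checked >= max_search)
                        break;          /* per-task search budget exhausted */

                cpu = (start + i) % nr_cpus;
                checked++;

                if (cpu_idle[cpu])
                        return cpu;
        }

        return -1;                      /* caller falls back to its target CPU */
}
```

With SEARCH_ALL the loop is bounded only by the size of the domain, which
mirrors the default behaviour described in D). A runtime heuristic such as
SIS_PROP could still lower the effective budget further on a busy system,
e.g. by taking the minimum of the two limits; that combination is an
assumption on my side and not something the proposal spells out.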

>> E/F) Existing knobs (and limitations):
> 
> There are existing sched_feat: SIS_AVG_CPU, SIS_PROP that attempt to
> short circuit the idle cpu search path in select_idle_cpu() based on
> estimations of the current costs of searching. Neither provides a means

[...]

>> H) Range Analysis:
> 
> The knob is a positive integer representing "max number of CPUs to
> search". The default would be 'ALL' which could be translated as
> INT_MAX. '0 searches' translates to 0. Other values represent a max
> limit on the search, in this case iterations of a for loop.

IMHO the opposite use case for this feature (favour high throughput over
short wakeup latency (Facebook)) is already addressed by the changes
introduced by commit 10e2f1acd010 ("sched/core: Rewrite and improve
select_idle_siblings()"), i.e. by the current implementation of sis().

It seems that they don't need an additional per-task feature on top of
the default system-wide runtime heuristics.

[...]