Date:   Fri, 30 Aug 2019 10:49:35 -0700
From:   subhra mazumdar <subhra.mazumdar@...cle.com>
To:     linux-kernel@...r.kernel.org
Cc:     peterz@...radead.org, mingo@...hat.com, tglx@...utronix.de,
        steven.sistare@...cle.com, dhaval.giani@...cle.com,
        daniel.lezcano@...aro.org, vincent.guittot@...aro.org,
        viresh.kumar@...aro.org, tim.c.chen@...ux.intel.com,
        mgorman@...hsingularity.net, parth@...ux.ibm.com,
        patrick.bellasi@....com
Subject: [RFC PATCH 0/9] Task latency-nice

Introduce a new per-task property, latency-nice, for controlling the
scalability of the scheduler's idle CPU search path. Valid latency-nice
values range from 1 to 100, indicating that 1% to 100% of the LLC domain
is searched in select_idle_cpu. A new CPU cgroup file, cpu.latency-nice,
is added as the interface to set and get it; all tasks in the same cgroup
share the same latency-nice value. A lower latency-nice value can help
latency-intolerant tasks, e.g. very short-running OLTP threads, where the
cost of a full LLC search can be significant compared to the run time of
the threads. The default latency-nice value is 5.
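
As a rough illustration of the mapping above (a standalone model under
assumed rounding, not the kernel code from this series), the snippet
below shows how a latency-nice value of 1..100 could bound the number of
LLC CPUs scanned; cpus_to_scan() and the floor of one CPU are
illustrative assumptions.

#include <stdio.h>

/*
 * Standalone model, not kernel code: map latency-nice (1..100) to the
 * number of LLC CPUs scanned, i.e. latency-nice percent of the LLC
 * domain, with a floor of one CPU.
 */
static int cpus_to_scan(int latency_nice, int llc_weight)
{
	int nr = (llc_weight * latency_nice) / 100;

	return nr < 1 ? 1 : nr;
}

int main(void)
{
	/* Assume 44 hardware threads per LLC, one socket of the test box. */
	int llc_weight = 44;
	int nice_vals[] = { 1, 5, 50, 100 };

	for (int i = 0; i < 4; i++)
		printf("latency-nice=%-3d -> scan %d of %d CPUs\n",
		       nice_vals[i], cpus_to_scan(nice_vals[i], llc_weight),
		       llc_weight);
	return 0;
}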

In addition to latency-nice, the series adds a new sched feature,
SIS_CORE, to allow disabling the idle core search altogether; that search
is costly and hurts more than it helps for short-running workloads.
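
To picture the intent of SIS_CORE (again only a sketch; the helper names
below are made up, not the kernel's), this model gates the expensive
whole-core scan behind a boolean and falls through to the bounded idle
CPU scan when it is off:

#include <stdbool.h>
#include <stdio.h>

static bool sis_core_enabled;	/* SIS_CORE vs NO_SIS_CORE */

/* Placeholder for the costly "find a fully idle core" scan. */
static int scan_for_idle_core(void)
{
	printf("scanning for a fully idle core\n");
	return -1;		/* pretend none was found */
}

/* Placeholder for the bounded idle CPU scan in the LLC. */
static int scan_for_idle_cpu(void)
{
	printf("scanning a limited window of CPUs\n");
	return 0;
}

static int pick_idle_target(void)
{
	int cpu = -1;

	if (sis_core_enabled)
		cpu = scan_for_idle_core();
	if (cpu < 0)
		cpu = scan_for_idle_cpu();
	return cpu;
}

int main(void)
{
	sis_core_enabled = false;	/* the NO_SIS_CORE case */
	printf("picked CPU %d\n", pick_idle_target());
	return 0;
}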

Finally, it introduces a new per-cpu variable, next_cpu, to record where
the previous search ended so that each new search starts from that point.
This rotating search window over the CPUs in the LLC domain ensures that
idle CPUs are eventually found under high load.
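
The rotating window can be modelled with the small standalone program
below (an illustrative sketch; the LLC size and scan limit are arbitrary,
and the per-cpu aspect is collapsed into a single variable): each bounded
search resumes from where the previous one stopped, so repeated searches
eventually visit every CPU in the LLC.

#include <stdio.h>

#define LLC_CPUS	8	/* illustrative LLC size */
#define SCAN_LIMIT	3	/* CPUs per search, e.g. from latency-nice */

/* Remember where the last search ended and start the next one there,
 * so successive bounded searches rotate over the LLC. */
static int next_cpu;

static void bounded_search(void)
{
	int start = next_cpu;

	for (int i = 0; i < SCAN_LIMIT; i++) {
		int cpu = (start + i) % LLC_CPUS;

		printf("  check cpu %d\n", cpu);
		next_cpu = (cpu + 1) % LLC_CPUS;
	}
}

int main(void)
{
	for (int round = 0; round < 3; round++) {
		printf("search %d:\n", round);
		bounded_search();
	}
	return 0;
}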

Uperf pingpong on a 2-socket, 44-core, 88-thread Intel x86 machine with
message size = 8k (higher is better):
threads baseline   latency-nice=5,SIS_CORE     latency-nice=5,NO_SIS_CORE 
8       64.66      64.38 (-0.43%)              64.79 (0.2%)
16      123.34     122.88 (-0.37%)             125.87 (2.05%)
32      215.18     215.55 (0.17%)              247.77 (15.15%)
48      278.56     321.6 (15.45%)              321.2 (15.3%)
64      259.99     319.45 (22.87%)             333.95 (28.44%)
128     431.1      437.69 (1.53%)              431.09 (0%)

subhra mazumdar (9):
  sched,cgroup: Add interface for latency-nice
  sched: add search limit as per latency-nice
  sched: add sched feature to disable idle core search
  sched: SIS_CORE to disable idle core search
  sched: Define macro for number of CPUs in core
  x86/smpboot: Optimize cpumask_weight_sibling macro for x86
  sched: search SMT before LLC domain
  sched: introduce per-cpu var next_cpu to track search limit
  sched: rotate the cpu search window for better spread

 arch/x86/include/asm/smp.h      |  1 +
 arch/x86/include/asm/topology.h |  1 +
 arch/x86/kernel/smpboot.c       | 17 ++++++++++++++++-
 include/linux/sched.h           |  1 +
 include/linux/topology.h        |  4 ++++
 kernel/sched/core.c             | 42 +++++++++++++++++++++++++++++++++++++++++
 kernel/sched/fair.c             | 34 +++++++++++++++++++++------------
 kernel/sched/features.h         |  1 +
 kernel/sched/sched.h            |  9 +++++++++
 9 files changed, 97 insertions(+), 13 deletions(-)

-- 
2.9.3
