Message-Id: <cef0717a-e1da-c4a3-9fd0-ddb0914e3850@linux.ibm.com>
Date:   Thu, 5 Sep 2019 11:25:38 +0530
From:   Parth Shah <parth@...ux.ibm.com>
To:     subhra mazumdar <subhra.mazumdar@...cle.com>,
        linux-kernel@...r.kernel.org
Cc:     peterz@...radead.org, mingo@...hat.com, tglx@...utronix.de,
        steven.sistare@...cle.com, dhaval.giani@...cle.com,
        daniel.lezcano@...aro.org, vincent.guittot@...aro.org,
        viresh.kumar@...aro.org, tim.c.chen@...ux.intel.com,
        mgorman@...hsingularity.net, patrick.bellasi@....com
Subject: Re: [RFC PATCH 0/9] Task latency-nice

Hi Subhra,

On 8/30/19 11:19 PM, subhra mazumdar wrote:
> Introduce new per task property latency-nice for controlling scalability
> in scheduler idle CPU search path. Valid latency-nice values are from 1 to
> 100 indicating 1% to 100% search of the LLC domain in select_idle_cpu. New
> CPU cgroup file cpu.latency-nice is added as an interface to set and get.
> All tasks in the same cgroup share the same latency-nice value. Using a
> lower latency-nice value can help latency intolerant tasks e.g very short
> running OLTP threads where full LLC search cost can be significant compared
> to run time of the threads. The default latency-nice value is 5.
> 
> In addition to latency-nice, it also adds a new sched feature SIS_CORE to
> be able to disable idle core search altogether which is costly and hurts
> more than it helps in short running workloads.
> 
> Finally it also introduces a new per-cpu variable next_cpu to track
> the limit of search so that every time search starts from where it ended.
> This rotating search window over cpus in LLC domain ensures that idle
> cpus are eventually found in case of high load.
> 
> Uperf pingpong on 2 socket, 44 core and 88 threads Intel x86 machine with
> message size = 8k (higher is better):
> threads baseline   latency-nice=5,SIS_CORE     latency-nice=5,NO_SIS_CORE 
> 8       64.66      64.38 (-0.43%)              64.79 (0.2%)
> 16      123.34     122.88 (-0.37%)             125.87 (2.05%)
> 32      215.18     215.55 (0.17%)              247.77 (15.15%)
> 48      278.56     321.6 (15.45%)              321.2 (15.3%)
> 64      259.99     319.45 (22.87%)             333.95 (28.44%)
> 128     431.1      437.69 (1.53%)              431.09 (0%)
> 
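
To illustrate the quoted description, here is a rough user-space sketch in C of a percentage-limited search with a rotating start point. It is not the code from the patch series: LLC_SIZE, cpu_idle[], search_limit() and pick_idle_cpu() are hypothetical names chosen for illustration, next_cpu is a single variable here rather than per-cpu, and rounding the limit up to at least one CPU is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

#define LLC_SIZE 44               /* hypothetical number of CPUs in the LLC domain */

static bool cpu_idle[LLC_SIZE];   /* stand-in for the scheduler's idle check */
static int next_cpu;              /* where the next search starts (per-cpu in the proposal) */

/* Number of CPUs to scan for a latency-nice value of 1..100 (percent). */
static int search_limit(int latency_nice)
{
    assert(latency_nice >= 1 && latency_nice <= 100);

    int limit = LLC_SIZE * latency_nice / 100;

    return limit > 0 ? limit : 1; /* assumption: always scan at least one CPU */
}

/*
 * Scan at most search_limit() CPUs, starting where the previous search
 * ended and wrapping around the LLC domain. Returns an idle CPU or -1.
 */
static int pick_idle_cpu(int latency_nice)
{
    int limit = search_limit(latency_nice);
    int cpu = next_cpu;
    int found = -1;

    for (int i = 0; i < limit && found < 0; i++) {
        if (cpu_idle[cpu])
            found = cpu;
        cpu = (cpu + 1) % LLC_SIZE;
    }

    next_cpu = cpu;               /* the next search resumes from here */
    return found;
}
```

With latency-nice=5 and 44 CPUs, each call scans only 2 CPUs, so if only CPU 5 is idle the first two calls return -1 and the third call returns 5. This is the sense in which the rotating window eventually finds idle CPUs under high load.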

The results look appealing for your experimental setup.
BTW, do you have any plans to base load balancing on the latency niceness of
the tasks as well? It seems to be a more interesting case when we pack the less
latency-sensitive tasks onto fewer CPUs.

Also, do you see any workload results showing performance regression with NO_SIS_CORE?


Thanks,
Parth
