Date:   Thu, 14 May 2020 20:51:24 +0000
From:   "Gruza, Agata" <agata.gruza@...el.com>
To:     "vpillai@...italocean.com" <vpillai@...italocean.com>,
        "naravamudan@...italocean.com" <naravamudan@...italocean.com>,
        "jdesfossez@...italocean.com" <jdesfossez@...italocean.com>,
        "peterz@...radead.org" <peterz@...radead.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        "mingo@...nel.org" <mingo@...nel.org>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "pjt@...gle.com" <pjt@...gle.com>,
        "torvalds@...ux-foundation.org" <torvalds@...ux-foundation.org>
CC:     "vpillai@...italocean.com" <vpillai@...italocean.com>,
        "fweisbec@...il.com" <fweisbec@...il.com>,
        "keescook@...omium.org" <keescook@...omium.org>,
        "kerrnel@...gle.com" <kerrnel@...gle.com>,
        "pauld@...hat.com" <pauld@...hat.com>,
        "aaron.lwe@...il.com" <aaron.lwe@...il.com>,
        "aubrey.intel@...il.com" <aubrey.intel@...il.com>,
        "Li, Aubrey" <aubrey.li@...ux.intel.com>,
        "valentin.schneider@....com" <valentin.schneider@....com>,
        "mgorman@...hsingularity.net" <mgorman@...hsingularity.net>,
        "pawan.kumar.gupta@...ux.intel.com" 
        <pawan.kumar.gupta@...ux.intel.com>,
        "pbonzini@...hat.com" <pbonzini@...hat.com>,
        "joelaf@...gle.com" <joelaf@...gle.com>,
        "joel@...lfernandes.org" <joel@...lfernandes.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: FW: [RFC PATCH 00/13] Core scheduling v5



-----Original Message-----
From: linux-kernel-owner@...r.kernel.org <linux-kernel-owner@...r.kernel.org> On Behalf Of Ning, Hongyu
Sent: Friday, May 8, 2020 8:40 PM
To: vpillai@...italocean.com; naravamudan@...italocean.com; jdesfossez@...italocean.com; peterz@...radead.org; Tim Chen <tim.c.chen@...ux.intel.com>; mingo@...nel.org; tglx@...utronix.de; pjt@...gle.com; torvalds@...ux-foundation.org
Cc: vpillai@...italocean.com; fweisbec@...il.com; keescook@...omium.org; kerrnel@...gle.com; pauld@...hat.com; aaron.lwe@...il.com; aubrey.intel@...il.com; Li, Aubrey <aubrey.li@...ux.intel.com>; valentin.schneider@....com; mgorman@...hsingularity.net; pawan.kumar.gupta@...ux.intel.com; pbonzini@...hat.com; joelaf@...gle.com; joel@...lfernandes.org; linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 00/13] Core scheduling v5


- Test environment:
Intel Xeon Server platform
CPU(s):              192
On-line CPU(s) list: 0-191
Thread(s) per core:  2
Core(s) per socket:  48
Socket(s):           2
NUMA node(s):        4

- Kernel under test: 
Core scheduling v5 base
https://github.com/digitalocean/linux-coresched/tree/coresched/v5-v5.5.y

- Test set based on sysbench 1.1.0-bd4b418:
A: sysbench cpu in cgroup cpu 1 + sysbench mysql in cgroup mysql 1 (192 workload tasks for each cgroup)
B: sysbench cpu in cgroup cpu 1 + sysbench cpu in cgroup cpu 2 + sysbench mysql in cgroup mysql 1 + sysbench mysql in cgroup mysql 2 (192 workload tasks for each cgroup)

- Test results briefing:
1 Good results:
1.1 For test set A, coresched could achieve same or better performance compared to smt_off, for both cpu workload and mysql workload
1.2 For test set B, cpu workload, coresched could achieve better performance compared to smt_off

2 Bad results:
2.1 For test set B, mysql workload, coresched performance is lower than smt_off; potential fairness issue between cpu workloads and mysql workloads
2.2 For test set B, cpu workload, potential fairness issue between the cpu workloads of the 2 cgroups

- Test results:
Note: results in the following tables are throughput (Tput) normalized to the default baseline

-- Test set A Tput normalized results:
+--------------------+--------+-----------+-------------+-----------+-------+-------------+---------------+-------------+
|                    | ****   | default   | coresched   | smt_off   | ***   | default     | coresched     | smt_off     |
+====================+========+===========+=============+===========+=======+=============+===============+=============+
| cgroups            | ****   | cg cpu 1  | cg cpu 1    | cg cpu 1  | ***   | cg mysql 1  | cg mysql 1    | cg mysql 1  |
+--------------------+--------+-----------+-------------+-----------+-------+-------------+---------------+-------------+
| sysbench workload  | ****   | cpu       | cpu         | cpu       | ***   | mysql       | mysql         | mysql       |
+--------------------+--------+-----------+-------------+-----------+-------+-------------+---------------+-------------+
| 192 tasks / cgroup | ****   | 1         | 0.95        | 0.54      | ***   | 1           | 0.92          | 0.97        |
+--------------------+--------+-----------+-------------+-----------+-------+-------------+---------------+-------------+

-- Test set B Tput normalized results:
+--------------------+--------+-----------+-------------+-----------+-------+-------------+---------------+-------------+------+-------------+---------------+-------------+-----+-------------+---------------+-------------+
|                    | ****   | default   | coresched   | smt_off   | ***   | default     | coresched     | smt_off     | **   | default     | coresched     | smt_off     | *   | default     | coresched     | smt_off     |
+====================+========+===========+=============+===========+=======+=============+===============+=============+======+=============+===============+=============+=====+=============+===============+=============+
| cgroups            | ****   | cg cpu 1  | cg cpu 1    | cg cpu 1  | ***   | cg cpu 2    | cg cpu 2      | cg cpu 2    | **   | cg mysql 1  | cg mysql 1    | cg mysql 1  | *   | cg mysql 2  | cg mysql 2    | cg mysql 2  |
+--------------------+--------+-----------+-------------+-----------+-------+-------------+---------------+-------------+------+-------------+---------------+-------------+-----+-------------+---------------+-------------+
| sysbench workload  | ****   | cpu       | cpu         | cpu       | ***   | cpu         | cpu           | cpu         | **   | mysql       | mysql         | mysql       | *   | mysql       | mysql         | mysql       |
+--------------------+--------+-----------+-------------+-----------+-------+-------------+---------------+-------------+------+-------------+---------------+-------------+-----+-------------+---------------+-------------+
| 192 tasks / cgroup | ****   | 1         | 0.9         | 0.47      | ***   | 1           | 1.32          | 0.66        | **   | 1           | 0.42          | 0.89        | *   | 1           | 0.42          | 0.89        |
+--------------------+--------+-----------+-------------+-----------+-------+-------------+---------------+-------------+------+-------------+---------------+-------------+-----+-------------+---------------+-------------+
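
As a rough illustration of the fairness concern in 2.1 and 2.2, the coresched column of the test set B table can be compared across cgroups. The Python sketch below is not part of the original test setup; the dictionary and function names are made up for illustration, and the only inputs are the normalized values from the table above.

```python
# Normalized coresched throughput for test set B, copied from the table above.
# The dictionary layout and names are illustrative, not part of the test setup.
CORESCHED_B = {
    "cg cpu 1": 0.90,
    "cg cpu 2": 1.32,
    "cg mysql 1": 0.42,
    "cg mysql 2": 0.42,
}


def imbalance(a: float, b: float) -> float:
    """Return the larger value divided by the smaller (1.0 means even)."""
    hi, lo = max(a, b), min(a, b)
    return hi / lo


# Compare the two cgroups that run the same workload type.
cpu_imbalance = imbalance(CORESCHED_B["cg cpu 1"], CORESCHED_B["cg cpu 2"])
mysql_imbalance = imbalance(CORESCHED_B["cg mysql 1"], CORESCHED_B["cg mysql 2"])

print(f"cpu cgroups:   {cpu_imbalance:.2f}x")
print(f"mysql cgroups: {mysql_imbalance:.2f}x")
```

A ratio well above 1 between two cgroups running the same workload is one simple way to flag the kind of unevenness described in 2.2; it says nothing about the cause.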


> On Date: Wed,  4 Mar 2020 16:59:50 +0000, vpillai <vpillai@...italocean.com> wrote:
> To: Nishanth Aravamudan <naravamudan@...italocean.com>, Julien 
> Desfossez <jdesfossez@...italocean.com>, Peter Zijlstra 
> <peterz@...radead.org>, Tim Chen <tim.c.chen@...ux.intel.com>, 
> mingo@...nel.org, tglx@...utronix.de, pjt@...gle.com, 
> torvalds@...ux-foundation.org
> CC: vpillai <vpillai@...italocean.com>, linux-kernel@...r.kernel.org, 
> fweisbec@...il.com, keescook@...omium.org, kerrnel@...gle.com, Phil 
> Auld <pauld@...hat.com>, Aaron Lu <aaron.lwe@...il.com>, Aubrey Li 
> <aubrey.intel@...il.com>, aubrey.li@...ux.intel.com, Valentin 
> Schneider <valentin.schneider@....com>, Mel Gorman 
> <mgorman@...hsingularity.net>, Pawan Gupta 
> <pawan.kumar.gupta@...ux.intel.com>, Paolo Bonzini 
> <pbonzini@...hat.com>, Joel Fernandes <joelaf@...gle.com>, 
> joel@...lfernandes.org
> 
> 
> Fifth iteration of the Core-Scheduling feature.
> 
> Core scheduling is a feature that only allows trusted tasks to run 
> concurrently on cpus sharing compute resources (e.g. hyperthreads on a 
> core). The goal is to mitigate core-level side-channel attacks 
> without having to disable SMT (which has a significant impact on 
> performance in some situations). So far, the feature mitigates 
> user-space to user-space attacks but not user-space to kernel attacks, 
> when one of the hardware threads enters the kernel (syscall, interrupt etc).
> 
> By default, the feature doesn't change any of the current scheduler 
> behavior. The user decides which tasks can run simultaneously on the 
> same core (for now by having them in the same tagged cgroup). When a 
> tag is enabled in a cgroup and a task from that cgroup is running on a 
> hardware thread, the scheduler ensures that only idle or trusted tasks 
> run on the other sibling(s). Besides security concerns, this feature 
> can also be beneficial for RT and performance applications where we 
> want to control how tasks make use of SMT dynamically.
> 
> This version focuses on performance and stability. A couple of 
> crashes related to task tagging and the cpu hotplug path were fixed.
> This version also improves performance considerably by making task 
> migration and load balancing coresched aware.
> 
> In terms of performance, the major difference since the last iteration 
> is that now even IO-heavy and mixed-resources workloads are less 
> impacted by core-scheduling than by disabling SMT. Both host-level and 
> VM-level benchmarks were performed. Details in:
> https://lkml.org/lkml/2020/2/12/1194
> https://lkml.org/lkml/2019/11/1/269
> 
> v5 is rebased on top of 5.5.5 (449718782a46):
> https://github.com/digitalocean/linux-coresched/tree/coresched/v5-v5.5.y
> 


----------------------------------------------------------------------
ABOUT:
----------------------------------------------------------------------
Hello,

Core scheduling is required to protect against leakage of sensitive 
data allocated on a sibling thread. Our goal is to measure the 
performance impact of core scheduling across different workloads and 
show how it has evolved over time. Below you will find data based on 
core-sched (v5). The attached PDF describes the system configuration 
and explains the findings in more detail.

----------------------------------------------------------------------
BENCHMARKS:
----------------------------------------------------------------------
- hammerdb      : database benchmarking application
- sysbench-cpu  : multi-threaded cpu benchmark
- sysbench-mysql: multi-threaded benchmark that tests an open source DBMS (MySQL)
- kernel-build  : benchmark that builds the Linux kernel
 

----------------------------------------------------------------------      
PERFORMANCE IMPACT:
----------------------------------------------------------------------

+--------------------+--------+--------------+-------------+-------------------+--------------------+----------------------+
| benchmark          | ****   | # of cgroups | overcommit  | baseline + smt_on | coresched + smt_on | baseline + smt_off   |
+====================+========+==============+=============+===================+====================+======================+
| hammerdb           | ****   | 2cgroups     | 2x          | 1                 | 0.96               | 0.87                 |
+--------------------+--------+--------------+-------------+-------------------+--------------------+----------------------+
| sysbench-cpu       | ****   | 2cgroups     | 2x          | 1                 | 0.95               | 0.54                 |
| sysbench-mysql     | ****   |              |             | 1                 | 0.90               | 0.47                 |
+--------------------+--------+--------------+-------------+-------------------+--------------------+----------------------+
| sysbench-cpu       | ****   | 4cgroups     | 4x          | 1                 | 0.90               | 0.47                 |
| sysbench-cpu       | ****   |              |             | 1                 | 1.32               | 0.66                 |
| sysbench-mysql     | ****   |              |             | 1                 | 0.42               | 0.89                 |
| sysbench-mysql     | ****   |              |             | 1                 | 0.42               | 0.89                 |
+--------------------+--------+--------------+-------------+-------------------+--------------------+----------------------+
| kernel-build       | ****   | 2cgroups     | 0.5x        | 1                 | 1                  | 0.93                 |
|                    | ****   |              | 1x          | 1                 | 0.99               | 0.92                 |
|                    | ****   |              | 2x          | 1                 | 0.98               | 0.91                 |
+--------------------+--------+--------------+-------------+-------------------+--------------------+----------------------+
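
To make the coresched versus smt_off comparison in the table above easier to read, the two normalized columns can be divided row by row. The following Python sketch only restates the table; the tuple layout and the helper name `gain_over_smt_off` are illustrative, and blank cgroup and overcommit cells are assumed to carry over the value from the row above.

```python
# Rows copied from the table above:
# (benchmark, cgroups, overcommit, coresched + smt_on, baseline + smt_off).
# Both value columns are already normalized to baseline + smt_on = 1.
# Assumption: blank cgroup/overcommit cells repeat the value of the row above.
ROWS = [
    ("hammerdb", "2cgroups", "2x", 0.96, 0.87),
    ("sysbench-cpu", "2cgroups", "2x", 0.95, 0.54),
    ("sysbench-mysql", "2cgroups", "2x", 0.90, 0.47),
    ("sysbench-cpu", "4cgroups", "4x", 0.90, 0.47),
    ("sysbench-cpu", "4cgroups", "4x", 1.32, 0.66),
    ("sysbench-mysql", "4cgroups", "4x", 0.42, 0.89),
    ("sysbench-mysql", "4cgroups", "4x", 0.42, 0.89),
    ("kernel-build", "2cgroups", "0.5x", 1.00, 0.93),
    ("kernel-build", "2cgroups", "1x", 0.99, 0.92),
    ("kernel-build", "2cgroups", "2x", 0.98, 0.91),
]


def gain_over_smt_off(coresched: float, smt_off: float) -> float:
    """Relative throughput gain of coresched over smt_off (0.10 means +10%)."""
    return coresched / smt_off - 1.0


# Rows where coresched delivers less throughput than disabling SMT.
worse = [row for row in ROWS if row[3] < row[4]]

for name, cgroups, overcommit, coresched, smt_off in ROWS:
    gain = gain_over_smt_off(coresched, smt_off)
    print(f"{name:<15} {cgroups:<9} {overcommit:<5} {gain:+.1%}")
```

A positive percentage means coresched with SMT on delivers more throughput than the baseline with SMT off for that row; the `worse` list collects the rows where it does not.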


----------------------------------------------------------------------
TAKEAWAYS:
----------------------------------------------------------------------
1. Core scheduling performs better than turning off HT in most of the 
configurations in the table above; the exception there is 
sysbench-mysql with 4 cgroups (0.42 vs. 0.89). 
2. The impact of core scheduling depends on the workload and on thread 
scheduling intensity. 
3. Core scheduling requires cgroups. Only tasks from the same tagged 
cgroup are allowed to run concurrently on the same core. 
4. In certain situations, core scheduling introduces an uneven load 
distribution between multiple workload types. In such cases, a bias 
towards the cpu-intensive workload is expected. 
5. Load balancing is not perfect and needs more work.
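
For readers who want to try the cgroup requirement from point 3, the Python sketch below shows one way a cgroup might be tagged. It rests on an assumption that should be checked against the kernel under test: that the tag is exposed as a file named `cpu.tag` in the cgroup's cpu controller directory. The function name and the example path are hypothetical.

```python
from pathlib import Path

# Assumption: the coresched tag is exposed as "cpu.tag" inside the cgroup's
# cpu controller directory. Adjust this name if your kernel differs.
TAG_FILE = "cpu.tag"


def set_core_tag(cgroup_dir: str, enabled: bool = True) -> Path:
    """Write 1 (tagged) or 0 (untagged) to the cgroup's tag file.

    Returns the path of the file that was written. Writing to a real
    cgroup directory normally requires root privileges.
    """
    tag_path = Path(cgroup_dir) / TAG_FILE
    tag_path.write_text("1" if enabled else "0")
    return tag_path
```

With a hypothetical cgroup at `/sys/fs/cgroup/cpu/mysql1`, calling `set_core_tag("/sys/fs/cgroup/cpu/mysql1")` would mark that group as tagged, so that only its own tasks (or idle) share a core with it.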

Many thanks,

--Agata



Download attachment "LKML_core_sched_v5.5.y.pdf" of type "application/pdf" (360252 bytes)
