Message-ID: <CAKfTPtCds3SKL+EkM=TB83qm64uuFHQ1MXn70pcmMq6u9cvSGw@mail.gmail.com>
Date:	Tue, 9 Jun 2015 17:51:53 +0200
From:	Vincent Guittot <vincent.guittot@...aro.org>
To:	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Rik van Riel <riel@...hat.com>,
	Preeti U Murthy <preeti@...ux.vnet.ibm.com>,
	Benjamin Segall <bsegall@...gle.com>,
	Morten Rasmussen <Morten.Rasmussen@....com>,
	Yuyang Du <yuyang.du@...el.com>
Cc:	Linaro Kernel Mailman List <linaro-kernel@...ts.linaro.org>
Subject: sched: some perf bench around per entity load tracking

Hi,

There are ongoing patchsets on the mailing list that modify the
scheduler load tracking area:
- Yuyang has rewritten the per entity load tracking:
https://lkml.org/lkml/2015/6/2/124
- Morten has also made some modifications to the load tracking:
https://lkml.org/lkml/2015/5/13/448. Patches 01-12 modify the load
tracking area; I haven't considered the end of that patchset, which
implements the energy awareness, as it is out of the scope of the
tests I wanted to run.

In order to have a better idea of the impact of each patchset on the
performance of the scheduler, I have run some benchmarks on a quad-core
ARM Cortex-A15 platform.

The list of benchmarks I have run (see the invocation sketch after the
list):
-perf bench sched pipe -l 1000000
-hackbench --loops 400 --datasize 4096
-memcpy
-sysbench test=threads
-sysbench test=cpu
-ebizzy
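
For reference, here is roughly how these can be launched from a shell.
The sysbench thread/lock parameters, the ebizzy thread count and the use
of 'perf bench mem memcpy' for the memcpy test are my assumptions, not
necessarily the exact invocations behind the figures below:

  # pipe bench: 1,000,000 message round trips between two tasks
  perf bench sched pipe -l 1000000

  # hackbench: 400 loops with 4096-byte messages
  hackbench --loops 400 --datasize 4096

  # memcpy throughput (assumption: the perf memcpy bench)
  perf bench mem memcpy

  # sysbench synchronization and cpu tests; the thread/lock counts
  # follow the rows of the tables below
  sysbench --test=threads --num-threads=2 --thread-locks=1 run
  sysbench --test=cpu run

  # ebizzy; -t sets the thread count (1..8 in the table below)
  ebizzy -t 1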

Here are the results. The main+die column gives the raw figure for each
bench; the other columns are expressed as a percentage of that main+die
result:

main: mainline kernel  based on v4.1-rc6
http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/ sha1
9ef7adfa7c0b548665ef3248228d548586e693ca
pelt: main + Yuyang's patches
inv: main + Morten's patches
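
(For anyone who wants to reproduce the setup, something like the
following should give the main baseline; the pelt and inv kernels are
obtained by applying the series linked above on top of it:)

  git clone http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git
  cd tip
  git checkout 9ef7adfa7c0b548665ef3248228d548586e693ca   # "main" baseline
  # "pelt": apply Yuyang's series on top
  # "inv":  apply patches 01-12 of Morten's series on top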

I have also run the benchmarks with and without CONFIG_SCHED_MC. The only
impact of this config on my platform is the setting of an LLC sched
domain pointer when the config is set.

die: CONFIG_SCHED_MC is not set. There is one sched_domain level (the
DIE level), with sched_domain flags: 0x102f
mc: CONFIG_SCHED_MC is set. There is one sched_domain level (the MC
level), with sched_domain flags: 0x22f
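
The domain hierarchy and its flags can be read back at runtime when
CONFIG_SCHED_DEBUG is enabled; a quick sketch (paths as I remember them
on a v4.1 kernel, so double-check on your tree):

  # one domainN directory per sched_domain level above each CPU
  ls /proc/sys/kernel/sched_domain/cpu0/
  cat /proc/sys/kernel/sched_domain/cpu0/domain0/name    # MC or DIE here
  # flags are printed in decimal (0x102f == 4143, 0x22f == 559)
  cat /proc/sys/kernel/sched_domain/cpu0/domain0/flags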

                        main+die   main+mc  pelt+die   pelt+mc   inv+die    inv+mc
sched pipe  ops/sec     45091.40    44.30%    83.40%    43.78%    83.35%    40.42%
            +/-            0.33%     3.79%     0.30%     2.63%     0.30%     0.60%
hackbench   duration        7.84    98.33%    99.27%    95.51%    99.03%    97.54%
            +/-            0.37%     0.88%     1.08%     0.61%     1.18%     1.13%
memcpy      MB/s         4950.47   102.76%   100.76%    99.03%    99.44%   102.59%
            +/-            4.09%     6.13%     4.98%     2.19%     5.30%     7.26%

sysbench test=threads
                        main+die   main+mc  pelt+die   pelt+mc   inv+die    inv+mc
2 thrds/1 lock  events   5891.50    91.81%    94.81%    88.99%    99.18%    91.15%
                +/-         0.39%     0.63%     0.92%     0.69%     0.43%     1.10%
3 thrds/1 lock  events   4061.83    86.10%    90.44%    82.56%   100.59%    86.45%
                +/-         1.28%     2.08%     3.76%     1.11%     0.44%     1.87%
4 thrds/2 locks events   6203.83    86.19%    89.09%    83.05%    99.61%    86.26%
                +/-         1.69%     1.41%     2.78%     1.64%     0.88%     0.92%
5 thrds/2 locks events   4062.00   137.43%   130.77%   132.53%    93.67%   136.80%
                +/-         0.59%     0.89%     2.56%     2.06%     1.05%     1.29%
6 thrds/3 locks events   5531.00   159.52%   109.85%   151.88%    96.11%   159.00%
                +/-         1.59%     0.78%     1.76%     1.37%     2.72%     1.04%

ebizzy
                        main+die   main+mc  pelt+die   pelt+mc   inv+die    inv+mc
1 thread   records/s     6040.50   100.68%    99.60%   101.05%    98.64%    97.42%
           +/-              1.97%     1.50%     1.75%     0.90%     1.66%     0.95%
2 threads  records/s     9278.50   100.59%   101.21%   100.64%   100.71%    99.05%
           +/-              2.82%     0.86%     0.59%     0.63%     0.88%     1.50%
3 threads  records/s    11205.33    99.75%   101.41%   100.98%   100.16%    97.64%
           +/-              2.76%     2.13%     2.30%     1.51%     3.26%     2.58%
4 threads  records/s    10970.00   102.78%    99.59%   102.00%   107.24%   106.10%
           +/-              3.39%     4.68%     3.63%     5.75%     4.07%     4.41%
5 threads  records/s    11716.50    95.57%    93.81%    96.36%    98.51%    96.81%
           +/-              3.52%     4.95%     4.50%     5.27%     4.28%     5.51%
6 threads  records/s    11209.33    99.42%   100.33%    97.86%    99.38%    95.75%
           +/-              3.57%     2.95%     5.16%     6.84%     3.70%     3.57%
7 threads  records/s    11204.50    99.55%    99.31%    95.73%    99.02%    96.55%
           +/-              4.54%     4.22%     5.39%     3.71%     5.36%     3.69%
8 threads  records/s    17210.83    99.57%   100.65%    99.80%   100.16%   100.37%
           +/-              2.01%     1.88%     1.22%     2.25%     2.86%     1.69%

I have skipped the results of sysbench test=cpu as they are "exactly"
the same for all kernels.

The first noticeable point is the impact of the LLC sched domain on
sched pipe and, to a lesser extent, on hackbench.

Beyond that, the results don't show any clear performance advantage for
any of the 3 kernels.

I have just seen that Yuyang has sent some performance figures for his
patchset and, AFAICT, they don't show a clear perf advantage for one
version of the kernel either.

Has anyone else also run some benchmarks on these patchsets?

Regards,
Vincent
