linux-kernel - Re: [RFC PATCH v2 00/17] Core scheduling v2

Message-ID: <20190428093304.GA7393@gmail.com>
Date:   Sun, 28 Apr 2019 11:33:04 +0200
From:   Ingo Molnar <mingo@...nel.org>
To:     Aubrey Li <aubrey.intel@...il.com>
Cc:     Julien Desfossez <jdesfossez@...italocean.com>,
        Vineeth Remanan Pillai <vpillai@...italocean.com>,
        Nishanth Aravamudan <naravamudan@...italocean.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Paul Turner <pjt@...gle.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        Subhra Mazumdar <subhra.mazumdar@...cle.com>,
        Frédéric Weisbecker <fweisbec@...il.com>,
        Kees Cook <keescook@...omium.org>,
        Greg Kerr <kerrnel@...gle.com>, Phil Auld <pauld@...hat.com>,
        Aaron Lu <aaron.lwe@...il.com>,
        Valentin Schneider <valentin.schneider@....com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [RFC PATCH v2 00/17] Core scheduling v2


* Aubrey Li <aubrey.intel@...il.com> wrote:

> > But what we are really interested in are throughput numbers under 
> > these three kernel variants, right?
> 
> These are sysbench events per second number, higher is better.
> 
> NA/AVX  baseline(std%)  coresched(std%) +/-     nosmt(std%) +/-
> 1/1       508.5( 0.2%)    504.7( 1.1%) -0.8%     509.0( 0.2%)  0.1%
> NA/AVX  baseline(std%)  coresched(std%) +/-     nosmt(std%) +/-
> 2/2      1000.2( 1.4%)   1004.1( 1.6%)  0.4%     997.6( 1.2%) -0.3%
> NA/AVX  baseline(std%)  coresched(std%) +/-     nosmt(std%) +/-
> 4/4      1912.1( 1.0%)   1904.2( 1.1%) -0.4%    1914.9( 1.3%)  0.1%
> NA/AVX  baseline(std%)  coresched(std%) +/-     nosmt(std%) +/-
> 8/8      3753.5( 0.3%)   3748.2( 0.3%) -0.1%    3751.3( 0.4%) -0.1%
> NA/AVX  baseline(std%)  coresched(std%) +/-     nosmt(std%) +/-
> 16/16    7139.3( 2.4%)   7137.9( 1.8%) -0.0%    7049.2( 2.4%) -1.3%
> NA/AVX  baseline(std%)  coresched(std%) +/-     nosmt(std%) +/-
> 32/32   10899.0( 4.2%)  10780.3( 4.4%) -1.1%    10339.2( 9.6%) -5.1%
> NA/AVX  baseline(std%)  coresched(std%) +/-     nosmt(std%) +/-
> 64/64   15086.1(11.5%)  14262.0( 8.2%) -5.5%    11168.7(22.2%) -26.0%
> NA/AVX  baseline(std%)  coresched(std%) +/-     nosmt(std%) +/-
> 128/128 15371.9(22.0%)  14675.8(14.4%) -4.5%    10963.9(18.5%) -28.7%
> NA/AVX  baseline(std%)  coresched(std%) +/-     nosmt(std%) +/-
> 256/256 15990.8(22.0%)  12227.9(10.3%) -23.5%   10469.9(19.6%) -34.5%

So because I'm a big fan of presenting data in a readable fashion, here 
are your results, tabulated:

 #
 # Sysbench throughput comparison of 3 different kernels at different 
 # load levels, higher numbers are better:
 #

 .--------------------------------------|----------------------------------------------------------------.
 |  NA/AVX     vanilla-SMT    [stddev%] |coresched-SMT   [stddev%]   +/-  |   no-SMT    [stddev%]   +/-  |
 |--------------------------------------|----------------------------------------------------------------|
 |   1/1             508.5    [  0.2% ] |        504.7   [  1.1% ]  -0.8% |    509.0    [  0.2% ]   0.1% |
 |   2/2            1000.2    [  1.4% ] |       1004.1   [  1.6% ]   0.4% |    997.6    [  1.2% ]  -0.3% |
 |   4/4            1912.1    [  1.0% ] |       1904.2   [  1.1% ]  -0.4% |   1914.9    [  1.3% ]   0.1% |
 |   8/8            3753.5    [  0.3% ] |       3748.2   [  0.3% ]  -0.1% |   3751.3    [  0.4% ]  -0.1% |
 |  16/16           7139.3    [  2.4% ] |       7137.9   [  1.8% ]  -0.0% |   7049.2    [  2.4% ]  -1.3% |
 |  32/32          10899.0    [  4.2% ] |      10780.3   [  4.4% ]  -1.1% |  10339.2    [  9.6% ]  -5.1% |
 |  64/64          15086.1    [ 11.5% ] |      14262.0   [  8.2% ]  -5.5% |  11168.7    [ 22.2% ] -26.0% |
 | 128/128         15371.9    [ 22.0% ] |      14675.8   [ 14.4% ]  -4.5% |  10963.9    [ 18.5% ] -28.7% |
 | 256/256         15990.8    [ 22.0% ] |      12227.9   [ 10.3% ] -23.5% |  10469.9    [ 19.6% ] -34.5% |
 '--------------------------------------|----------------------------------------------------------------'

One major thing that sticks out is that if we compare the stddev numbers 
to the +/- comparisons then it's pretty clear that the benchmarks are 
very noisy: for coresched-SMT, in all but the last row stddev is actually 
higher than the measured effect, and for no-SMT the same holds up to and 
including the 32/32 row.
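
As a rough cross-check of the noise argument, here is a minimal Python sketch that recomputes each +/- value from the throughput numbers in the table and compares it to the reported stddev. The numbers are copied from the table; the function names and the criterion itself (a delta counts as "above the noise" only if its magnitude exceeds the larger of the two stddev percentages involved) are assumptions made for illustration, not something the benchmark or the original posters define.

```python
# Rows copied from the table above:
# (load, vanilla, vanilla_std%, coresched, coresched_std%, nosmt, nosmt_std%)
ROWS = [
    ("1/1",       508.5,  0.2,   504.7,  1.1,   509.0,  0.2),
    ("2/2",      1000.2,  1.4,  1004.1,  1.6,   997.6,  1.2),
    ("4/4",      1912.1,  1.0,  1904.2,  1.1,  1914.9,  1.3),
    ("8/8",      3753.5,  0.3,  3748.2,  0.3,  3751.3,  0.4),
    ("16/16",    7139.3,  2.4,  7137.9,  1.8,  7049.2,  2.4),
    ("32/32",   10899.0,  4.2, 10780.3,  4.4, 10339.2,  9.6),
    ("64/64",   15086.1, 11.5, 14262.0,  8.2, 11168.7, 22.2),
    ("128/128", 15371.9, 22.0, 14675.8, 14.4, 10963.9, 18.5),
    ("256/256", 15990.8, 22.0, 12227.9, 10.3, 10469.9, 19.6),
]


def delta_pct(baseline, variant):
    """Relative change of variant versus baseline, in percent."""
    return (variant - baseline) / baseline * 100.0


def exceeds_noise(delta, std_a, std_b):
    """Assumed criterion: |delta| is larger than the bigger stddev%."""
    return abs(delta) > max(std_a, std_b)


def analyse(rows=ROWS):
    """Return (load, coresched_delta, above_noise, nosmt_delta, above_noise)."""
    result = []
    for load, base, base_sd, cs, cs_sd, ns, ns_sd in rows:
        cs_delta = delta_pct(base, cs)
        ns_delta = delta_pct(base, ns)
        result.append((
            load,
            cs_delta, exceeds_noise(cs_delta, base_sd, cs_sd),
            ns_delta, exceeds_noise(ns_delta, base_sd, ns_sd),
        ))
    return result


if __name__ == "__main__":
    for load, cs_d, cs_sig, ns_d, ns_sig in analyse():
        print(f"{load:>8}  coresched {cs_d:+6.1f}% above-noise={cs_sig}"
              f"  nosmt {ns_d:+6.1f}% above-noise={ns_sig}")
```

Under that assumed criterion the coresched-SMT delta clears the noise only at 256/256, while the no-SMT delta does so from 64/64 upward. Because the deltas are recomputed from the rounded throughput figures, they can differ from the quoted +/- column in the last digit.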

So what does 'stddev' mean here, exactly? The stddev of multiple runs, 
i.e. measured run-to-run variance? Or is it some internal metric of the 
benchmark?

Thanks,

	Ingo