Message-ID: <20150424121559.321677ce@notabene.brown>
Date:	Fri, 24 Apr 2015 12:15:59 +1000
From:	NeilBrown <neilb@...e.de>
To:	Huang Ying <ying.huang@...el.com>
Cc:	"shli@...nel.org" <shli@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>, LKP ML <lkp@...org>
Subject: Re: [LKP] [RAID5] 878ee679279: -1.8% vmstat.io.bo, +40.5%
 perf-stat.LLC-load-misses

On Thu, 23 Apr 2015 14:55:59 +0800 Huang Ying <ying.huang@...el.com> wrote:

> FYI, we noticed the below changes on
> 
> git://neil.brown.name/md for-next
> commit 878ee6792799e2f88bdcac329845efadb205252f ("RAID5: batch adjacent full stripe write")

Hi,
 is there any chance that you could explain what some of this means?
There is lots of data and some very pretty graphs, but no explanation.

Which numbers are "good", which are "bad"?  Which is "worst"?
What do the graphs really show, and what would we like to see in them?

I think it is really great that you are doing this testing and reporting the
results.  It's just so sad that I completely fail to understand them.

Thanks,
NeilBrown

> 
> 
> testbox/testcase/testparams: lkp-st02/dd-write/300-5m-11HDD-RAID5-cfq-xfs-1dd
> 
> a87d7f782b47e030  878ee6792799e2f88bdcac3298  
> ----------------  --------------------------  
>          %stddev     %change         %stddev
>              \          |                \  
>      59035 ±  0%     +18.4%      69913 ±  1%  softirqs.SCHED
>       1330 ± 10%     +17.4%       1561 ±  4%  slabinfo.kmalloc-512.num_objs
>       1330 ± 10%     +17.4%       1561 ±  4%  slabinfo.kmalloc-512.active_objs
>     305908 ±  0%      -1.8%     300427 ±  0%  vmstat.io.bo
>          1 ±  0%    +100.0%          2 ±  0%  vmstat.procs.r
>       8266 ±  1%     -15.7%       6968 ±  0%  vmstat.system.cs
>      14819 ±  0%      -2.1%      14503 ±  0%  vmstat.system.in
>      18.20 ±  6%     +10.2%      20.05 ±  4%  perf-profile.cpu-cycles.raid_run_ops.handle_stripe.handle_active_stripes.raid5d.md_thread
>       1.94 ±  9%     +90.6%       3.70 ±  9%  perf-profile.cpu-cycles.async_xor.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
>       0.00 ±  0%      +Inf%      25.18 ±  3%  perf-profile.cpu-cycles.handle_active_stripes.isra.45.raid5d.md_thread.kthread.ret_from_fork
>       0.00 ±  0%      +Inf%      14.14 ±  4%  perf-profile.cpu-cycles.async_copy_data.isra.42.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
>       1.79 ±  7%    +102.9%       3.64 ±  9%  perf-profile.cpu-cycles.xor_blocks.async_xor.raid_run_ops.handle_stripe.handle_active_stripes
>       3.09 ±  4%     -10.8%       2.76 ±  4%  perf-profile.cpu-cycles.get_active_stripe.make_request.md_make_request.generic_make_request.submit_bio
>       0.80 ± 14%     +28.1%       1.02 ± 10%  perf-profile.cpu-cycles.mutex_lock.xfs_file_buffered_aio_write.xfs_file_write_iter.new_sync_write.vfs_write
>      14.78 ±  6%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.async_copy_data.isra.38.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
>      25.68 ±  4%    -100.0%       0.00 ±  0%  perf-profile.cpu-cycles.handle_active_stripes.isra.41.raid5d.md_thread.kthread.ret_from_fork
>       1.23 ±  5%    +140.0%       2.96 ±  7%  perf-profile.cpu-cycles.xor_sse_5_pf64.xor_blocks.async_xor.raid_run_ops.handle_stripe
>       2.62 ±  6%     -95.6%       0.12 ± 33%  perf-profile.cpu-cycles.analyse_stripe.handle_stripe.handle_active_stripes.raid5d.md_thread
>       0.96 ±  9%     +17.5%       1.12 ±  2%  perf-profile.cpu-cycles.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.new_sync_write.vfs_write
>  1.461e+10 ±  0%      -5.3%  1.384e+10 ±  1%  perf-stat.L1-dcache-load-misses
>  3.688e+11 ±  0%      -2.7%   3.59e+11 ±  0%  perf-stat.L1-dcache-loads
>  1.124e+09 ±  0%     -27.7%  8.125e+08 ±  0%  perf-stat.L1-dcache-prefetches
>  2.767e+10 ±  0%      -1.8%  2.717e+10 ±  0%  perf-stat.L1-dcache-store-misses
>  2.352e+11 ±  0%      -2.8%  2.287e+11 ±  0%  perf-stat.L1-dcache-stores
>  6.774e+09 ±  0%      -2.3%   6.62e+09 ±  0%  perf-stat.L1-icache-load-misses
>  5.571e+08 ±  0%     +40.5%  7.826e+08 ±  1%  perf-stat.LLC-load-misses
>  6.263e+09 ±  0%     -13.7%  5.407e+09 ±  1%  perf-stat.LLC-loads
>  1.914e+11 ±  0%      -4.2%  1.833e+11 ±  0%  perf-stat.branch-instructions
>  1.145e+09 ±  2%      -5.6%  1.081e+09 ±  0%  perf-stat.branch-load-misses
>  1.911e+11 ±  0%      -4.3%  1.829e+11 ±  0%  perf-stat.branch-loads
>  1.142e+09 ±  2%      -5.1%  1.083e+09 ±  0%  perf-stat.branch-misses
>  1.218e+09 ±  0%     +19.8%   1.46e+09 ±  0%  perf-stat.cache-misses
>  2.118e+10 ±  0%      -5.2%  2.007e+10 ±  0%  perf-stat.cache-references
>    2510308 ±  1%     -15.7%    2115410 ±  0%  perf-stat.context-switches
>      39623 ±  0%     +22.1%      48370 ±  1%  perf-stat.cpu-migrations
>  4.179e+08 ± 40%    +165.7%  1.111e+09 ± 35%  perf-stat.dTLB-load-misses
>  3.684e+11 ±  0%      -2.5%  3.592e+11 ±  0%  perf-stat.dTLB-loads
>  1.232e+08 ± 15%     +62.5%  2.002e+08 ± 27%  perf-stat.dTLB-store-misses
>  2.348e+11 ±  0%      -2.5%  2.288e+11 ±  0%  perf-stat.dTLB-stores
>    3577297 ±  2%      +8.7%    3888986 ±  1%  perf-stat.iTLB-load-misses
>  1.035e+12 ±  0%      -3.5%  9.988e+11 ±  0%  perf-stat.iTLB-loads
>  1.036e+12 ±  0%      -3.7%  9.978e+11 ±  0%  perf-stat.instructions
>        594 ± 30%    +130.3%       1369 ± 13%  sched_debug.cfs_rq[0]:/.blocked_load_avg
>         17 ± 10%     -28.2%         12 ± 23%  sched_debug.cfs_rq[0]:/.nr_spread_over
>        210 ± 21%     +42.1%        298 ± 28%  sched_debug.cfs_rq[0]:/.tg_runnable_contrib
>       9676 ± 21%     +42.1%      13754 ± 28%  sched_debug.cfs_rq[0]:/.avg->runnable_avg_sum
>        772 ± 25%    +116.5%       1672 ±  9%  sched_debug.cfs_rq[0]:/.tg_load_contrib
>       8402 ±  9%     +83.3%      15405 ± 11%  sched_debug.cfs_rq[0]:/.tg_load_avg
>       8356 ±  9%     +82.8%      15272 ± 11%  sched_debug.cfs_rq[1]:/.tg_load_avg
>        968 ± 25%    +100.8%       1943 ± 14%  sched_debug.cfs_rq[1]:/.blocked_load_avg
>      16242 ±  9%     -22.2%      12643 ± 14%  sched_debug.cfs_rq[1]:/.avg->runnable_avg_sum
>        353 ±  9%     -22.1%        275 ± 14%  sched_debug.cfs_rq[1]:/.tg_runnable_contrib
>       1183 ± 23%     +77.7%       2102 ± 12%  sched_debug.cfs_rq[1]:/.tg_load_contrib
>        181 ±  8%     -31.4%        124 ± 26%  sched_debug.cfs_rq[2]:/.tg_runnable_contrib
>       8364 ±  8%     -31.3%       5745 ± 26%  sched_debug.cfs_rq[2]:/.avg->runnable_avg_sum
>       8297 ±  9%     +81.7%      15079 ± 12%  sched_debug.cfs_rq[2]:/.tg_load_avg
>      30439 ± 13%     -45.2%      16681 ± 26%  sched_debug.cfs_rq[2]:/.exec_clock
>      39735 ± 14%     -48.3%      20545 ± 29%  sched_debug.cfs_rq[2]:/.min_vruntime
>       8231 ± 10%     +82.2%      15000 ± 12%  sched_debug.cfs_rq[3]:/.tg_load_avg
>       1210 ± 14%    +110.3%       2546 ± 30%  sched_debug.cfs_rq[4]:/.tg_load_contrib
>       8188 ± 10%     +82.8%      14964 ± 12%  sched_debug.cfs_rq[4]:/.tg_load_avg
>       8132 ± 10%     +83.1%      14890 ± 12%  sched_debug.cfs_rq[5]:/.tg_load_avg
>        749 ± 29%    +205.9%       2292 ± 34%  sched_debug.cfs_rq[5]:/.blocked_load_avg
>        963 ± 30%    +169.9%       2599 ± 33%  sched_debug.cfs_rq[5]:/.tg_load_contrib
>      37791 ± 32%     -38.6%      23209 ± 13%  sched_debug.cfs_rq[6]:/.min_vruntime
>        693 ± 25%    +132.2%       1609 ± 29%  sched_debug.cfs_rq[6]:/.blocked_load_avg
>      10838 ± 13%     -39.2%       6587 ± 13%  sched_debug.cfs_rq[6]:/.avg->runnable_avg_sum
>      29329 ± 27%     -33.2%      19577 ± 10%  sched_debug.cfs_rq[6]:/.exec_clock
>        235 ± 14%     -39.7%        142 ± 14%  sched_debug.cfs_rq[6]:/.tg_runnable_contrib
>       8085 ± 10%     +83.6%      14848 ± 12%  sched_debug.cfs_rq[6]:/.tg_load_avg
>        839 ± 25%    +128.5%       1917 ± 18%  sched_debug.cfs_rq[6]:/.tg_load_contrib
>       8051 ± 10%     +83.6%      14779 ± 12%  sched_debug.cfs_rq[7]:/.tg_load_avg
>        156 ± 34%     +97.9%        309 ± 19%  sched_debug.cpu#0.cpu_load[4]
>        160 ± 25%     +64.0%        263 ± 16%  sched_debug.cpu#0.cpu_load[2]
>        156 ± 32%     +83.7%        286 ± 17%  sched_debug.cpu#0.cpu_load[3]
>        164 ± 20%     -35.1%        106 ± 31%  sched_debug.cpu#2.cpu_load[0]
>        249 ± 15%     +80.2%        449 ± 10%  sched_debug.cpu#4.cpu_load[3]
>        231 ± 11%    +101.2%        466 ± 13%  sched_debug.cpu#4.cpu_load[2]
>        217 ± 14%    +189.9%        630 ± 38%  sched_debug.cpu#4.cpu_load[0]
>      71951 ±  5%     +21.6%      87526 ±  7%  sched_debug.cpu#4.nr_load_updates
>        214 ±  8%    +146.1%        527 ± 27%  sched_debug.cpu#4.cpu_load[1]
>        256 ± 17%     +75.7%        449 ± 13%  sched_debug.cpu#4.cpu_load[4]
>        209 ± 23%     +98.3%        416 ± 48%  sched_debug.cpu#5.cpu_load[2]
>      68024 ±  2%     +18.8%      80825 ±  1%  sched_debug.cpu#5.nr_load_updates
>        217 ± 26%     +74.9%        380 ± 45%  sched_debug.cpu#5.cpu_load[3]
>        852 ± 21%     -38.3%        526 ± 22%  sched_debug.cpu#6.curr->pid
> 
> lkp-st02: Core2
> Memory: 8G
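[As an aside for anyone decoding the comparison table above: each row is "base value ± stddev%, %change, new value ± stddev%, metric name", where the left column is the parent commit a87d7f78 and the right is 878ee679. A minimal sketch of parsing one row and recomputing the %change column (the row-format regex below is my own reconstruction from the table layout, not something from the LKP tooling; "+Inf%" rows would need special-casing):]

```python
import re

# One row of the LKP comparison table: base value, ±stddev%, %change,
# new value, ±stddev%, metric name. Pattern reconstructed from the
# report layout above; rows with "+Inf%" are not handled here.
ROW = r"([0-9.e+]+)\s+±\s+(\d+)%\s+([+-][0-9.]+)%\s+([0-9.e+]+)\s+±\s+(\d+)%\s+(\S+)"

def parse_row(line):
    m = re.match(ROW, line.strip())
    base, base_sd, change, new, new_sd, metric = m.groups()
    return float(base), float(change), float(new), metric

line = "5.571e+08 ±  0%     +40.5%  7.826e+08 ±  1%  perf-stat.LLC-load-misses"
base, change, new, metric = parse_row(line)

# The %change column is simply (new - base) / base * 100.
recomputed = (new - base) / base * 100
print(metric, round(recomputed, 1))  # agrees with the +40.5% in the table
```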
> 
> 
> 
> 
>                                 perf-stat.cache-misses
> 
>   1.6e+09 O+-----O--O---O--O---O--------------------------------------------+
>           |                       O   O  O   O  O   O  O   O  O   O         |
>   1.4e+09 ++                                                                |
>   1.2e+09 *+.*...*      *..*      *      *...*..*...*..*...*..*...*..*...*..*
>           |      :      :  :      :      :                                  |
>     1e+09 ++      :    :    :    : :    :                                   |
>           |       :    :    :    : :    :                                   |
>     8e+08 ++      :    :    :    : :    :                                   |
>           |       :   :      :   :  :   :                                   |
>     6e+08 ++       :  :      :  :   :  :                                    |
>     4e+08 ++       : :        : :    : :                                    |
>           |        : :        : :    : :                                    |
>     2e+08 ++       : :        : :    : :                                    |
>           |         :          :      :                                     |
>         0 ++-O------*----------*------*-------------------------------------+
> 
> 
>                             perf-stat.L1-dcache-prefetches
> 
>   1.2e+09 ++----------------------------------------------------------------+
>           *..*...*      *..*      *        ..*..  ..*..*...*..*...*..*...*..*
>     1e+09 ++     :      :  :      :      *.     *.                          |
>           |      :     :    :     ::     :                                  |
>           |       :    :    :    : :     :                        O         |
>     8e+08 O+     O: O  :O  O:  O :O:  O :O   O  O   O  O   O  O             |
>           |       :   :      :   :  :   :                                   |
>     6e+08 ++      :   :      :   :  :   :                                   |
>           |        :  :      :  :   :   :                                   |
>     4e+08 ++       :  :      :  :   :  :                                    |
>           |        : :        : :    : :                                    |
>           |        : :        : :    : :                                    |
>     2e+08 ++        ::        ::     : :                                    |
>           |         :          :      :                                     |
>         0 ++-O------*----------*------*-------------------------------------+
> 
> 
>                               perf-stat.LLC-load-misses
> 
>   1e+09 ++------------------------------------------------------------------+
>   9e+08 O+     O   O  O   O  O                                              |
>         |                        O   O  O   O                               |
>   8e+08 ++                                     O   O   O  O   O  O          |
>   7e+08 ++                                                                  |
>         |                                                                   |
>   6e+08 *+..*..*      *...*      *      *...*..*...*...*..*...*..*...*..*...*
>   5e+08 ++      :     :   :      ::     :                                   |
>   4e+08 ++      :    :     :    : :    :                                    |
>         |        :   :     :    :  :   :                                    |
>   3e+08 ++       :   :      :  :   :   :                                    |
>   2e+08 ++        : :       :  :    : :                                     |
>         |         : :       : :     : :                                     |
>   1e+08 ++         :         ::      :                                      |
>       0 ++--O------*---------*-------*--------------------------------------+
> 
> 
>                               perf-stat.context-switches
> 
>     3e+06 ++----------------------------------------------------------------+
>           |                              *...*..*...                        |
>   2.5e+06 *+.*...*      *..*      *      :          *..*...  .*...*..*...  .*
>           |      :      :  :      :      :                 *.            *. |
>           O      O: O  :O  O:  O  ::    :       O   O  O   O  O   O         |
>     2e+06 ++      :    :    :    :O:  O :O   O                              |
>           |       :    :    :    : :    :                                   |
>   1.5e+06 ++      :   :      :   :  :   :                                   |
>           |        :  :      :   :  :  :                                    |
>     1e+06 ++       :  :      :  :   :  :                                    |
>           |        : :        : :    : :                                    |
>           |        : :        : :    : :                                    |
>    500000 ++        ::        : :    ::                                     |
>           |         :          :      :                                     |
>         0 ++-O------*----------*------*-------------------------------------+
> 
> 
>                                   vmstat.system.cs
> 
>   10000 ++------------------------------------------------------------------+
>    9000 ++                              *...*..                             |
>         *...*..*      *...*      *      :      *...*...*..  ..*..*...*..  ..*
>    8000 ++     :      :   :      :      :                 *.            *.  |
>    7000 O+     O:  O  O   O: O  : :    :       O   O   O  O   O  O          |
>         |       :    :     :    :O:  O :O   O                               |
>    6000 ++      :    :     :    : :    :                                    |
>    5000 ++       :   :     :   :   :   :                                    |
>    4000 ++       :   :      :  :   :  :                                     |
>         |        :  :       :  :   :  :                                     |
>    3000 ++        : :       : :     : :                                     |
>    2000 ++        : :       : :     : :                                     |
>         |         : :        ::     ::                                      |
>    1000 ++         :         :       :                                      |
>       0 ++--O------*---------*-------*--------------------------------------+
> 
> 
> 	[*] bisect-good sample
> 	[O] bisect-bad  sample
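[On the "which numbers are good, which are bad" question: the sign of a %change alone doesn't answer it, because it depends on each metric's direction. A sketch of the idea, where the direction map is my own hypothetical illustration and not part of the LKP report:]

```python
# Whether a change is an improvement depends on the metric's direction:
# for throughput-style counters (e.g. vmstat.io.bo, blocks written out)
# higher is better; for miss/overhead counters (e.g.
# perf-stat.LLC-load-misses) lower is better. This mapping is an
# assumption for illustration, not something the LKP tooling provides.
HIGHER_IS_BETTER = {
    "vmstat.io.bo": True,
    "perf-stat.LLC-load-misses": False,
}

def verdict(metric, pct_change):
    """Label a %change as 'improved' or 'regressed' given the metric's direction."""
    better = (pct_change > 0) == HIGHER_IS_BETTER[metric]
    return "improved" if better else "regressed"

# Under this (assumed) mapping, both headline numbers in the Subject
# line would read as regressions:
print(verdict("vmstat.io.bo", -1.8))                # regressed
print(verdict("perf-stat.LLC-load-misses", +40.5))  # regressed
```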
> 
> To reproduce:
> 
> 	apt-get install ruby
> 	git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
> 	cd lkp-tests
> 	bin/setup-local job.yaml # the job file attached in this email
> 	bin/run-local   job.yaml
> 
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> 
> Thanks,
> Ying Huang
> 


