Message-ID: <1429772159.25120.9.camel@intel.com>
Date: Thu, 23 Apr 2015 14:55:59 +0800
From: Huang Ying <ying.huang@...el.com>
To: "shli@...nel.org" <shli@...nel.org>
Cc: NeilBrown <neilb@...e.de>, LKML <linux-kernel@...r.kernel.org>,
LKP ML <lkp@...org>
Subject: [LKP] [RAID5] 878ee679279: -1.8% vmstat.io.bo, +40.5%
perf-stat.LLC-load-misses
FYI, we noticed the below changes on

  git://neil.brown.name/md for-next
  commit 878ee6792799e2f88bdcac329845efadb205252f ("RAID5: batch adjacent full stripe write")
testbox/testcase/testparams: lkp-st02/dd-write/300-5m-11HDD-RAID5-cfq-xfs-1dd
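
For orientation, the testparams string above roughly decodes as: 300 seconds
runtime, dd block size 5m, 11 HDDs in RAID5, the cfq I/O scheduler, XFS, and a
single dd writer.  Below is a minimal sketch of such a setup; the device names
and exact steps are hypothetical, the attached job.yaml is the authoritative
definition:

        # Hypothetical reconstruction of the dd-write job; device names are
        # illustrative only -- see the attached job.yaml for the real setup.
        mdadm --create /dev/md0 --level=5 --raid-devices=11 /dev/sd[b-l]
        for d in /sys/block/sd[b-l]; do echo cfq > $d/queue/scheduler; done
        mkfs.xfs -f /dev/md0
        mount /dev/md0 /fs/md0
        dd if=/dev/zero of=/fs/md0/zero bs=5M &   # the single "1dd" writer
        sleep 300; kill %1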
a87d7f782b47e030  878ee6792799e2f88bdcac3298
----------------  --------------------------
       %stddev        %change        %stddev
           \              |              \
59035 ± 0% +18.4% 69913 ± 1% softirqs.SCHED
1330 ± 10% +17.4% 1561 ± 4% slabinfo.kmalloc-512.num_objs
1330 ± 10% +17.4% 1561 ± 4% slabinfo.kmalloc-512.active_objs
305908 ± 0% -1.8% 300427 ± 0% vmstat.io.bo
1 ± 0% +100.0% 2 ± 0% vmstat.procs.r
8266 ± 1% -15.7% 6968 ± 0% vmstat.system.cs
14819 ± 0% -2.1% 14503 ± 0% vmstat.system.in
18.20 ± 6% +10.2% 20.05 ± 4% perf-profile.cpu-cycles.raid_run_ops.handle_stripe.handle_active_stripes.raid5d.md_thread
1.94 ± 9% +90.6% 3.70 ± 9% perf-profile.cpu-cycles.async_xor.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
0.00 ± 0% +Inf% 25.18 ± 3% perf-profile.cpu-cycles.handle_active_stripes.isra.45.raid5d.md_thread.kthread.ret_from_fork
0.00 ± 0% +Inf% 14.14 ± 4% perf-profile.cpu-cycles.async_copy_data.isra.42.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
1.79 ± 7% +102.9% 3.64 ± 9% perf-profile.cpu-cycles.xor_blocks.async_xor.raid_run_ops.handle_stripe.handle_active_stripes
3.09 ± 4% -10.8% 2.76 ± 4% perf-profile.cpu-cycles.get_active_stripe.make_request.md_make_request.generic_make_request.submit_bio
0.80 ± 14% +28.1% 1.02 ± 10% perf-profile.cpu-cycles.mutex_lock.xfs_file_buffered_aio_write.xfs_file_write_iter.new_sync_write.vfs_write
14.78 ± 6% -100.0% 0.00 ± 0% perf-profile.cpu-cycles.async_copy_data.isra.38.raid_run_ops.handle_stripe.handle_active_stripes.raid5d
25.68 ± 4% -100.0% 0.00 ± 0% perf-profile.cpu-cycles.handle_active_stripes.isra.41.raid5d.md_thread.kthread.ret_from_fork
1.23 ± 5% +140.0% 2.96 ± 7% perf-profile.cpu-cycles.xor_sse_5_pf64.xor_blocks.async_xor.raid_run_ops.handle_stripe
2.62 ± 6% -95.6% 0.12 ± 33% perf-profile.cpu-cycles.analyse_stripe.handle_stripe.handle_active_stripes.raid5d.md_thread
0.96 ± 9% +17.5% 1.12 ± 2% perf-profile.cpu-cycles.xfs_ilock.xfs_file_buffered_aio_write.xfs_file_write_iter.new_sync_write.vfs_write
1.461e+10 ± 0% -5.3% 1.384e+10 ± 1% perf-stat.L1-dcache-load-misses
3.688e+11 ± 0% -2.7% 3.59e+11 ± 0% perf-stat.L1-dcache-loads
1.124e+09 ± 0% -27.7% 8.125e+08 ± 0% perf-stat.L1-dcache-prefetches
2.767e+10 ± 0% -1.8% 2.717e+10 ± 0% perf-stat.L1-dcache-store-misses
2.352e+11 ± 0% -2.8% 2.287e+11 ± 0% perf-stat.L1-dcache-stores
6.774e+09 ± 0% -2.3% 6.62e+09 ± 0% perf-stat.L1-icache-load-misses
5.571e+08 ± 0% +40.5% 7.826e+08 ± 1% perf-stat.LLC-load-misses
6.263e+09 ± 0% -13.7% 5.407e+09 ± 1% perf-stat.LLC-loads
1.914e+11 ± 0% -4.2% 1.833e+11 ± 0% perf-stat.branch-instructions
1.145e+09 ± 2% -5.6% 1.081e+09 ± 0% perf-stat.branch-load-misses
1.911e+11 ± 0% -4.3% 1.829e+11 ± 0% perf-stat.branch-loads
1.142e+09 ± 2% -5.1% 1.083e+09 ± 0% perf-stat.branch-misses
1.218e+09 ± 0% +19.8% 1.46e+09 ± 0% perf-stat.cache-misses
2.118e+10 ± 0% -5.2% 2.007e+10 ± 0% perf-stat.cache-references
2510308 ± 1% -15.7% 2115410 ± 0% perf-stat.context-switches
39623 ± 0% +22.1% 48370 ± 1% perf-stat.cpu-migrations
4.179e+08 ± 40% +165.7% 1.111e+09 ± 35% perf-stat.dTLB-load-misses
3.684e+11 ± 0% -2.5% 3.592e+11 ± 0% perf-stat.dTLB-loads
1.232e+08 ± 15% +62.5% 2.002e+08 ± 27% perf-stat.dTLB-store-misses
2.348e+11 ± 0% -2.5% 2.288e+11 ± 0% perf-stat.dTLB-stores
3577297 ± 2% +8.7% 3888986 ± 1% perf-stat.iTLB-load-misses
1.035e+12 ± 0% -3.5% 9.988e+11 ± 0% perf-stat.iTLB-loads
1.036e+12 ± 0% -3.7% 9.978e+11 ± 0% perf-stat.instructions
594 ± 30% +130.3% 1369 ± 13% sched_debug.cfs_rq[0]:/.blocked_load_avg
17 ± 10% -28.2% 12 ± 23% sched_debug.cfs_rq[0]:/.nr_spread_over
210 ± 21% +42.1% 298 ± 28% sched_debug.cfs_rq[0]:/.tg_runnable_contrib
9676 ± 21% +42.1% 13754 ± 28% sched_debug.cfs_rq[0]:/.avg->runnable_avg_sum
772 ± 25% +116.5% 1672 ± 9% sched_debug.cfs_rq[0]:/.tg_load_contrib
8402 ± 9% +83.3% 15405 ± 11% sched_debug.cfs_rq[0]:/.tg_load_avg
8356 ± 9% +82.8% 15272 ± 11% sched_debug.cfs_rq[1]:/.tg_load_avg
968 ± 25% +100.8% 1943 ± 14% sched_debug.cfs_rq[1]:/.blocked_load_avg
16242 ± 9% -22.2% 12643 ± 14% sched_debug.cfs_rq[1]:/.avg->runnable_avg_sum
353 ± 9% -22.1% 275 ± 14% sched_debug.cfs_rq[1]:/.tg_runnable_contrib
1183 ± 23% +77.7% 2102 ± 12% sched_debug.cfs_rq[1]:/.tg_load_contrib
181 ± 8% -31.4% 124 ± 26% sched_debug.cfs_rq[2]:/.tg_runnable_contrib
8364 ± 8% -31.3% 5745 ± 26% sched_debug.cfs_rq[2]:/.avg->runnable_avg_sum
8297 ± 9% +81.7% 15079 ± 12% sched_debug.cfs_rq[2]:/.tg_load_avg
30439 ± 13% -45.2% 16681 ± 26% sched_debug.cfs_rq[2]:/.exec_clock
39735 ± 14% -48.3% 20545 ± 29% sched_debug.cfs_rq[2]:/.min_vruntime
8231 ± 10% +82.2% 15000 ± 12% sched_debug.cfs_rq[3]:/.tg_load_avg
1210 ± 14% +110.3% 2546 ± 30% sched_debug.cfs_rq[4]:/.tg_load_contrib
8188 ± 10% +82.8% 14964 ± 12% sched_debug.cfs_rq[4]:/.tg_load_avg
8132 ± 10% +83.1% 14890 ± 12% sched_debug.cfs_rq[5]:/.tg_load_avg
749 ± 29% +205.9% 2292 ± 34% sched_debug.cfs_rq[5]:/.blocked_load_avg
963 ± 30% +169.9% 2599 ± 33% sched_debug.cfs_rq[5]:/.tg_load_contrib
37791 ± 32% -38.6% 23209 ± 13% sched_debug.cfs_rq[6]:/.min_vruntime
693 ± 25% +132.2% 1609 ± 29% sched_debug.cfs_rq[6]:/.blocked_load_avg
10838 ± 13% -39.2% 6587 ± 13% sched_debug.cfs_rq[6]:/.avg->runnable_avg_sum
29329 ± 27% -33.2% 19577 ± 10% sched_debug.cfs_rq[6]:/.exec_clock
235 ± 14% -39.7% 142 ± 14% sched_debug.cfs_rq[6]:/.tg_runnable_contrib
8085 ± 10% +83.6% 14848 ± 12% sched_debug.cfs_rq[6]:/.tg_load_avg
839 ± 25% +128.5% 1917 ± 18% sched_debug.cfs_rq[6]:/.tg_load_contrib
8051 ± 10% +83.6% 14779 ± 12% sched_debug.cfs_rq[7]:/.tg_load_avg
156 ± 34% +97.9% 309 ± 19% sched_debug.cpu#0.cpu_load[4]
160 ± 25% +64.0% 263 ± 16% sched_debug.cpu#0.cpu_load[2]
156 ± 32% +83.7% 286 ± 17% sched_debug.cpu#0.cpu_load[3]
164 ± 20% -35.1% 106 ± 31% sched_debug.cpu#2.cpu_load[0]
249 ± 15% +80.2% 449 ± 10% sched_debug.cpu#4.cpu_load[3]
231 ± 11% +101.2% 466 ± 13% sched_debug.cpu#4.cpu_load[2]
217 ± 14% +189.9% 630 ± 38% sched_debug.cpu#4.cpu_load[0]
71951 ± 5% +21.6% 87526 ± 7% sched_debug.cpu#4.nr_load_updates
214 ± 8% +146.1% 527 ± 27% sched_debug.cpu#4.cpu_load[1]
256 ± 17% +75.7% 449 ± 13% sched_debug.cpu#4.cpu_load[4]
209 ± 23% +98.3% 416 ± 48% sched_debug.cpu#5.cpu_load[2]
68024 ± 2% +18.8% 80825 ± 1% sched_debug.cpu#5.nr_load_updates
217 ± 26% +74.9% 380 ± 45% sched_debug.cpu#5.cpu_load[3]
852 ± 21% -38.3% 526 ± 22% sched_debug.cpu#6.curr->pid
lkp-st02: Core2
Memory: 8G
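
The headline deltas in the subject line (perf-stat.LLC-load-misses +40.5%,
vmstat.io.bo -1.8%) come from monitors sampled over the whole run.  A roughly
comparable set of counters can be gathered by hand with perf stat; a minimal
sketch, assuming the workload is already running:

        # Sample the same hardware/software events system-wide for 10 seconds.
        # These are standard perf event aliases; availability depends on CPU.
        perf stat -a \
                -e LLC-loads,LLC-load-misses \
                -e cache-references,cache-misses \
                -e context-switches,cpu-migrations \
                -- sleep 10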
perf-stat.cache-misses

        [per-sample chart: [*] parent samples near 1.2e+09, [O] patched samples near 1.46e+09]

perf-stat.L1-dcache-prefetches

        [per-sample chart: [*] parent samples near 1.1e+09, [O] patched samples near 8.1e+08]

perf-stat.LLC-load-misses

        [per-sample chart: [*] parent samples near 5.6e+08, [O] patched samples near 7.8e+08]

perf-stat.context-switches

        [per-sample chart: [*] parent samples near 2.5e+06, [O] patched samples near 2.1e+06]

vmstat.system.cs

        [per-sample chart: [*] parent samples near 8300, [O] patched samples near 7000]
[*] bisect-good sample
[O] bisect-bad sample
To reproduce:

        apt-get install ruby
        git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
        cd lkp-tests
        bin/setup-local job.yaml # the job file attached in this email
        bin/run-local job.yaml
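
To compare the two kernels directly, the bisect-good parent and the bisect-bad
batching commit can be checked out from the tree named above; a sketch (kernel
build and boot steps omitted):

        git clone git://neil.brown.name/md && cd md
        git checkout a87d7f782b47e030                            # [*] bisect-good
        # build, boot, run the job, then repeat with:
        git checkout 878ee6792799e2f88bdcac329845efadb205252f    # [O] bisect-bad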
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Ying Huang
View attachment "job.yaml" of type "text/plain" (3296 bytes)
View attachment "reproduce" of type "text/plain" (680 bytes)