Date: Tue, 31 Oct 2017 18:39:57 +0800
From: kernel test robot <xiaolong.ye@...el.com>
To: Ross Zwisler <ross.zwisler@...ux.intel.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>, Jan Kara <jack@...e.cz>,
    "Darrick J. Wong" <darrick.wong@...cle.com>, Theodore Ts'o <tytso@....edu>,
    Alexander Viro <viro@...iv.linux.org.uk>,
    Andreas Dilger <adilger.kernel@...ger.ca>, Christoph Hellwig <hch@....de>,
    Dan Williams <dan.j.williams@...el.com>, Dave Chinner <david@...morbit.com>,
    Ingo Molnar <mingo@...hat.com>, Jonathan Corbet <corbet@....net>,
    Matthew Wilcox <mawilcox@...rosoft.com>, Steven Rostedt <rostedt@...dmis.org>,
    "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
    Andrew Morton <akpm@...ux-foundation.org>,
    LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [lkp-robot] [dax] 91d25ba8a6: fio.read_clat_90%_us +120% regression

Greeting,

FYI, we noticed a 120% regression of fio.read_clat_90%_us due to commit:

commit: 91d25ba8a6b0d810dc844cebeedc53029118ce3e ("dax: use common 4k zero page for dax mmap reads")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: fio-basic
on test machine: 56 threads Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz with 256G memory
with following parameters:

        disk: 2pmem
        fs: xfs
        mount_option: dax
        runtime: 200s
        nr_task: 50%
        time_based: tb
        rw: rw
        bs: 4k
        ioengine: mmap
        test_size: 200G
        cpufreq_governor: performance

test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio

Details are as below:
-------------------------------------------------------------------------------------------------->

To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

testcase/path_params/tbox_group/run: fio-basic/2pmem-xfs-dax-200s-50%-tb-rw-4k-mmap-200G-performance/lkp-hsw-ep6

e30331ff05f689f8  91d25ba8a6b0d810dc844cebee
----------------  --------------------------
     22.66           159%      58.69         fio.latency_20us%
         5           120%         11         fio.read_clat_90%_us
      2.26            87%       4.23         fio.read_clat_mean_us
      3.04 ± 12%      86%       5.65 ±  3%   fio.read_clat_stddev
         6 ±  6%      78%         12         fio.read_clat_95%_us
        10 ±  4%      59%         16 ±  5%   fio.read_clat_99%_us
     26.43            19%      31.48         fio.latency_2us%
      4796                      4876         fio.time.system_time
     28491 ±  3%      -5%      27060         fio.time.voluntary_context_switches
       819           -10%        739         fio.time.user_time
     14.38           -15%      12.25         fio.write_clat_mean_us
        30           -25%         23         fio.write_clat_99%_us
      7.66           -31%       5.29 ± 10%   fio.write_clat_stddev
        24           -35%         16         fio.write_clat_95%_us
        22           -36%         14 ±  3%   fio.write_clat_90%_us
 6.417e+08           -31%  4.446e+08         fio.time.minor_page_faults
     25.31 ±  3%     -67%       8.23 ±  7%   fio.latency_10us%
     12.87 ± 10%     -88%       1.53 ±  9%   fio.latency_50us%
     12.72 ±  3%    -100%       0.06 ± 20%   fio.latency_4us%
      1785 ±  3%     -10%       1615         vmstat.system.cs
       551            10%        604         turbostat.Avg_MHz
       123             8%        132         turbostat.PkgWatt
     25.47                     24.88         turbostat.RAMWatt
      3300 ±129%   -3e+03          0         latency_stats.avg.call_rwsem_down_write_failed.path_openat.do_filp_open.do_sys_open.SyS_open.entry_SYSCALL_64_fastpath
   4518312 ±155%   -5e+06          0         latency_stats.avg.io_schedule.nfs_wait_on_request.nfs_writepage_setup.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
      3300 ±129%   -3e+03          0         latency_stats.max.call_rwsem_down_write_failed.path_openat.do_filp_open.do_sys_open.SyS_open.entry_SYSCALL_64_fastpath
   7064619 ±155%   -7e+06          0         latency_stats.max.io_schedule.nfs_wait_on_request.nfs_writepage_setup.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
      3300 ±129%   -3e+03          0         latency_stats.sum.call_rwsem_down_write_failed.path_openat.do_filp_open.do_sys_open.SyS_open.entry_SYSCALL_64_fastpath
      9533 ± 11%   -8e+03       1854 ± 26%   latency_stats.sum.call_rwsem_down_read_failed.xfs_log_commit_cil.[xfs].__xfs_trans_commit.[xfs].xfs_trans_commit.[xfs].xfs_vn_update_time.[xfs].file_update_time.xfs_filemap_pfn_mkwrite.[xfs].do_wp_page.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
  13554938 ±155%   -1e+07          0         latency_stats.sum.io_schedule.nfs_wait_on_request.nfs_writepage_setup.nfs_updatepage.nfs_write_end.generic_perform_write.nfs_file_write.__vfs_write.vfs_write.SyS_write.entry_SYSCALL_64_fastpath
     27.72 ±  4%      61%      44.61 ± 28%   perf-stat.iTLB-load-miss-rate%
  1.47e+12 ±  3%      21%  1.785e+12         perf-stat.branch-instructions
      0.15            15%       0.18         perf-stat.dTLB-store-miss-rate%
   7.1e+12 ±  3%      14%  8.124e+12         perf-stat.instructions
 1.823e+12 ±  3%      13%  2.053e+12 ±  3%   perf-stat.dTLB-loads
 1.839e+13 ±  3%      11%  2.045e+13         perf-stat.cpu-cycles
      2.59                      2.52         perf-stat.cpi
     72.31             5%      75.81         perf-stat.node-store-miss-rate%
      0.39                      0.40         perf-stat.ipc
      7016            -5%       6687         perf-stat.cpu-migrations
 8.242e+09           -17%  6.882e+09 ±  4%   perf-stat.branch-misses
     34.55            -7%      32.26         perf-stat.cache-miss-rate%
 7.286e+10           -16%  6.094e+10 ±  3%   perf-stat.cache-references
    943574 ±  3%     -10%     847854         perf-stat.context-switches
 3.135e+09 ±  5%     -11%  2.805e+09 ±  7%   perf-stat.dTLB-load-misses
 9.382e+08 ±  8%     -14%  8.111e+08 ± 10%   perf-stat.iTLB-loads
 9.986e+11           -15%  8.526e+11         perf-stat.dTLB-stores
      0.17 ±  4%     -21%       0.14 ±  4%   perf-stat.dTLB-load-miss-rate%
 2.517e+10           -22%  1.967e+10 ±  4%   perf-stat.cache-misses
 1.032e+10 ±  6%     -22%  8.015e+09         perf-stat.node-load-misses
 4.359e+09 ±  3%     -29%  3.095e+09         perf-stat.node-store-misses
      0.56           -31%       0.39         perf-stat.branch-miss-rate%
  6.43e+08           -31%  4.459e+08         perf-stat.minor-faults
  6.43e+08           -31%  4.459e+08         perf-stat.page-faults
 1.668e+09           -41%  9.882e+08 ±  4%   perf-stat.node-stores

[ASCII plots of per-sample values not reproduced here: fio.read_clat_mean_us, fio.read_clat_90%_us, fio.read_clat_95%_us, fio.read_clat_99%_us, fio.write_clat_90%_us, fio.write_clat_95%_us, fio.latency_2us%, fio.latency_4us%, fio.latency_20us%, perf-stat.page-faults]

        [*] bisect-good sample
        [O] bisect-bad sample

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

Thanks,
Xiaolong

View attachment "config-4.13.0-04261-g91d25ba" of type "text/plain" (162451 bytes)
View attachment "job-script" of type "text/plain" (7388 bytes)
View attachment "job.yaml" of type "text/plain" (4995 bytes)
View attachment "reproduce" of type "text/plain" (860 bytes)
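For readers checking the comparison figures in the report, the percentage in the middle of each row appears to be the relative difference between the value measured on the parent commit (e30331ff05f689f8) and the value measured on 91d25ba8a6. The following is a minimal sketch under that assumption; the helper name `percent_change` is hypothetical and not part of lkp-tests, and the sample values are copied from the report.

```python
def percent_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent (assumed formula)."""
    if base == 0:
        raise ValueError("base value is zero; relative change is undefined")
    return (new - base) / base * 100.0


# (base, new) pairs copied from the report's comparison rows.
rows = {
    "fio.read_clat_90%_us": (5, 11),
    "fio.read_clat_mean_us": (2.26, 4.23),
    "fio.time.minor_page_faults": (6.417e08, 4.446e08),
}

for name, (base, new) in rows.items():
    # Round to a whole percent, as the report does.
    print(f"{name}: {percent_change(base, new):+.0f}%")
```

Rounded to whole percents, these three rows come out at +120%, +87% and -31%, which matches the values given in the report.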