Date: Mon, 7 Sep 2020 16:37:09 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Josh Don <joshdon@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Venkatesh Pallipadi <venki@...gle.com>,
	LKML <linux-kernel@...r.kernel.org>, x86@...nel.org, lkp@...ts.01.org,
	lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
	zhengjun.xing@...el.com, aubrey.li@...ux.intel.com, yu.c.chen@...el.com
Subject: [sched/fair] ec73240b16: aim7.jobs-per-min 2.3% improvement

Greetings,

FYI, we noticed a 2.3% improvement of aim7.jobs-per-min due to commit:

commit: ec73240b1627cddfd7cef018c7fa1c32e64a721e ("sched/fair: Ignore cache hotness for SMT migration")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git sched/core
(a hedged sketch of the idea behind this change follows the attachment list at the end of this report)

in testcase: aim7
on test machine: 192 threads Cooper Lake with 128G memory
with the following parameters:

	disk: 4BRD_12G
	md: RAID1
	fs: xfs
	test: sync_disk_rw
	load: 300
	cpufreq_governor: performance
	ucode: 0x86000017

test-description: AIM7 is a traditional UNIX system-level benchmark suite used to test and measure the performance of multiuser systems.
test-url: https://sourceforge.net/projects/aimbench/files/aim-suite7/

Details are as below:
-------------------------------------------------------------------------------------------------->

To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/md/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/4BRD_12G/xfs/x86_64-rhel-8.3/300/RAID1/debian-10.4-x86_64-20200603.cgz/lkp-cpx-4s1/sync_disk_rw/aim7/0x86000017

commit:
  5f4a1c4ea4 ("sched/topology: Mark SD_NUMA as SDF_NEEDS_GROUPS")
  ec73240b16 ("sched/fair: Ignore cache hotness for SMT migration")

5f4a1c4ea44728aa ec73240b1627cddfd7cef018c7f
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
      5365            +2.3%       5488        aim7.jobs-per-min
    335.52            -2.2%     327.98        aim7.time.elapsed_time
    335.52            -2.2%     327.98        aim7.time.elapsed_time.max
      5014 ±  2%      -7.5%       4638 ±  2%  aim7.time.system_time
  51059388            -1.5%   50309088        aim7.time.voluntary_context_switches
     49.81 ±  2%      +4.0%      51.78        iostat.cpu.iowait
      9.24            -4.3%       8.84 ±  2%  iostat.cpu.system
      0.07 ± 24%    +138.0%       0.16 ± 31%  sched_debug.cfs_rq:/.nr_spread_over.avg
      0.20 ± 28%     +77.0%       0.36 ± 22%  sched_debug.cfs_rq:/.nr_spread_over.stddev
     49.38 ±  2%      +4.1%      51.38        vmstat.cpu.wa
    223673            +2.3%     228851        vmstat.io.bo
    298749            +2.1%     304962        proc-vmstat.nr_file_pages
    241703            +2.8%     248552        proc-vmstat.nr_unevictable
    241703            +2.8%     248552        proc-vmstat.nr_zone_unevictable
      5330            -9.3%       4835 ±  2%  proc-vmstat.nr_zone_write_pending
   1330084            -2.3%    1298931        proc-vmstat.pgfault
      1.63            -0.0        1.59        perf-stat.i.branch-miss-rate%
     27.64            +0.9       28.52        perf-stat.i.cache-miss-rate%
      1197            -4.3%       1146        perf-stat.i.cycles-between-cache-misses
  34968537            +4.0%   36353165        perf-stat.i.node-load-misses
   3229831            +4.0%    3357767        perf-stat.i.node-loads
      1.63            -0.0        1.58        perf-stat.overall.branch-miss-rate%
     27.72            +0.9       28.63        perf-stat.overall.cache-miss-rate%
      1161            -4.8%       1105        perf-stat.overall.cycles-between-cache-misses
  34868866            +4.0%   36247556        perf-stat.ps.node-load-misses
   3220778            +4.0%    3348190        perf-stat.ps.node-loads
     15.56 ±  3%      -2.6       12.92 ±  4%  perf-profile.calltrace.cycles-pp.__xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_aio_write.new_sync_write.vfs_write
      9.81 ±  4%      -1.8        7.97 ±  5%  perf-profile.calltrace.cycles-pp._raw_spin_lock.__xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_aio_write.new_sync_write
      9.78 ±  4%      -1.8        7.95 ±  5%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.__xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_aio_write
      5.69 ±  2%      -0.8        4.91 ±  2%  perf-profile.calltrace.cycles-pp.xlog_wait_on_iclog.__xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_aio_write.new_sync_write
      4.01 ±  2%      -0.5        3.49 ±  2%  perf-profile.calltrace.cycles-pp.remove_wait_queue.xlog_wait_on_iclog.__xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_aio_write
      3.99 ±  2%      -0.5        3.47 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.remove_wait_queue.xlog_wait_on_iclog.__xfs_log_force_lsn.xfs_file_fsync
      3.68 ±  2%      -0.4        3.29 ±  4%  perf-profile.calltrace.cycles-pp.remove_wait_queue.__xfs_log_force_lsn.xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_aio_write
      3.67 ±  2%      -0.4        3.28 ±  4%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.remove_wait_queue.__xfs_log_force_lsn.xfs_log_force_lsn.xfs_file_fsync
      3.65 ±  2%      -0.4        3.26 ±  4%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.remove_wait_queue.__xfs_log_force_lsn.xfs_log_force_lsn
      5.69            -0.4        5.34        perf-profile.calltrace.cycles-pp.load_balance.newidle_balance.pick_next_task_fair.__sched_text_start.schedule
      1.63 ±  3%      -0.3        1.37 ±  2%  perf-profile.calltrace.cycles-pp.__sched_text_start.schedule.xlog_wait_on_iclog.__xfs_log_force_lsn.xfs_file_fsync
      1.64 ±  3%      -0.3        1.39 ±  2%  perf-profile.calltrace.cycles-pp.schedule.xlog_wait_on_iclog.__xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_aio_write
      1.52 ±  3%      -0.2        1.27 ±  2%  perf-profile.calltrace.cycles-pp.pick_next_task_fair.__sched_text_start.schedule.xlog_wait_on_iclog.__xfs_log_force_lsn
      1.51 ±  3%      -0.2        1.27 ±  2%  perf-profile.calltrace.cycles-pp.newidle_balance.pick_next_task_fair.__sched_text_start.schedule.xlog_wait_on_iclog
      2.89 ±  2%      -0.2        2.69 ±  2%  perf-profile.calltrace.cycles-pp.schedule.io_schedule.wait_on_page_bit.__filemap_fdatawait_range.file_write_and_wait_range
      2.90 ±  2%      -0.2        2.70 ±  2%  perf-profile.calltrace.cycles-pp.io_schedule.wait_on_page_bit.__filemap_fdatawait_range.file_write_and_wait_range.xfs_file_fsync
      2.89 ±  2%      -0.2        2.69 ±  2%  perf-profile.calltrace.cycles-pp.__sched_text_start.schedule.io_schedule.wait_on_page_bit.__filemap_fdatawait_range
      2.64 ±  2%      -0.2        2.44 ±  2%  perf-profile.calltrace.cycles-pp.pick_next_task_fair.__sched_text_start.schedule.io_schedule.wait_on_page_bit
      2.64 ±  2%      -0.2        2.44 ±  2%  perf-profile.calltrace.cycles-pp.newidle_balance.pick_next_task_fair.__sched_text_start.schedule.io_schedule
      0.68 ±  2%      +0.0        0.73 ±  2%  perf-profile.calltrace.cycles-pp.schedule.schedule_timeout.wait_for_completion.__flush_work.xlog_cil_force_lsn
      0.68 ±  2%      +0.0        0.73 ±  2%  perf-profile.calltrace.cycles-pp.schedule_timeout.wait_for_completion.__flush_work.xlog_cil_force_lsn.xfs_log_force_lsn
      0.68 ±  2%      +0.0        0.73 ±  2%  perf-profile.calltrace.cycles-pp.__sched_text_start.schedule.schedule_timeout.wait_for_completion.__flush_work
      0.69 ±  2%      +0.0        0.74 ±  2%  perf-profile.calltrace.cycles-pp.wait_for_completion.__flush_work.xlog_cil_force_lsn.xfs_log_force_lsn.xfs_file_fsync
      0.79            +0.0        0.84 ±  2%  perf-profile.calltrace.cycles-pp.__flush_work.xlog_cil_force_lsn.xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_aio_write
      1.79 ±  2%      +0.1        1.88 ±  2%  perf-profile.calltrace.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork
      2.04            +0.1        2.14 ±  2%  perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork
      3.02 ±  4%      +0.3        3.33 ±  3%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.md_flush_request.raid1_make_request.md_handle_request
      3.08 ±  4%      +0.3        3.39 ±  3%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.md_flush_request.raid1_make_request.md_handle_request.md_submit_bio
      4.47 ±  3%      +0.4        4.84 ±  2%  perf-profile.calltrace.cycles-pp.md_flush_request.raid1_make_request.md_handle_request.md_submit_bio.submit_bio_noacct
      4.62 ±  3%      +0.4        4.99 ±  3%  perf-profile.calltrace.cycles-pp.md_submit_bio.submit_bio_noacct.submit_bio.submit_bio_wait.blkdev_issue_flush
      4.64 ±  3%      +0.4        5.02 ±  3%  perf-profile.calltrace.cycles-pp.submit_bio_noacct.submit_bio.submit_bio_wait.blkdev_issue_flush.xfs_file_fsync
      4.55 ±  3%      +0.4        4.93 ±  3%  perf-profile.calltrace.cycles-pp.md_handle_request.md_submit_bio.submit_bio_noacct.submit_bio.submit_bio_wait
      4.64 ±  3%      +0.4        5.02 ±  3%  perf-profile.calltrace.cycles-pp.submit_bio.submit_bio_wait.blkdev_issue_flush.xfs_file_fsync.xfs_file_buffered_aio_write
      4.66 ±  3%      +0.4        5.04 ±  3%  perf-profile.calltrace.cycles-pp.submit_bio_wait.blkdev_issue_flush.xfs_file_fsync.xfs_file_buffered_aio_write.new_sync_write
      4.71 ±  3%      +0.4        5.10 ±  3%  perf-profile.calltrace.cycles-pp.blkdev_issue_flush.xfs_file_fsync.xfs_file_buffered_aio_write.new_sync_write.vfs_write
      8.29            +0.6        8.86 ±  2%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.remove_wait_queue.xlog_wait_on_iclog.__xfs_log_force_lsn
      9.03            +0.9        9.92        perf-profile.calltrace.cycles-pp.__xfs_log_force_lsn.xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_aio_write.new_sync_write
     10.21            +0.9       11.14        perf-profile.calltrace.cycles-pp.xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_aio_write.new_sync_write.vfs_write
      4.33 ±  3%      +1.1        5.42 ±  4%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.remove_wait_queue.xlog_wait_on_iclog.__xfs_log_force_lsn.xfs_log_force_lsn
      4.34 ±  3%      +1.1        5.44 ±  4%  perf-profile.calltrace.cycles-pp.remove_wait_queue.xlog_wait_on_iclog.__xfs_log_force_lsn.xfs_log_force_lsn.xfs_file_fsync
      4.73 ±  2%      +1.1        5.88 ±  3%  perf-profile.calltrace.cycles-pp.xlog_wait_on_iclog.__xfs_log_force_lsn.xfs_log_force_lsn.xfs_file_fsync.xfs_file_buffered_aio_write
     24.59 ±  2%      -1.7       22.84 ±  2%  perf-profile.children.cycles-pp.__xfs_log_force_lsn
     11.23 ±  4%      -1.6        9.59 ±  4%  perf-profile.children.cycles-pp._raw_spin_lock
      7.45            -0.3        7.16        perf-profile.children.cycles-pp.schedule
      6.76            -0.3        6.47        perf-profile.children.cycles-pp.newidle_balance
      6.85            -0.3        6.57        perf-profile.children.cycles-pp.pick_next_task_fair
      6.62            -0.3        6.34        perf-profile.children.cycles-pp.load_balance
      7.92            -0.3        7.64        perf-profile.children.cycles-pp.__sched_text_start
      2.90 ±  2%      -0.2        2.70 ±  2%  perf-profile.children.cycles-pp.io_schedule
      0.70 ±  2%      +0.0        0.74 ±  2%  perf-profile.children.cycles-pp.wait_for_completion
      0.79            +0.0        0.84 ±  2%  perf-profile.children.cycles-pp.__flush_work
      0.70 ±  2%      +0.1        0.75 ±  2%  perf-profile.children.cycles-pp.schedule_timeout
      0.15 ±  5%      +0.1        0.22 ±  4%  perf-profile.children.cycles-pp.xlog_write
      0.26 ±  2%      +0.1        0.32 ±  3%  perf-profile.children.cycles-pp.xlog_cil_push_work
      0.00 ±387%      +0.1        0.08 ±  9%  perf-profile.children.cycles-pp.xlog_state_get_iclog_space
      1.79 ±  2%      +0.1        1.88 ±  2%  perf-profile.children.cycles-pp.process_one_work
      2.04            +0.1        2.14 ±  2%  perf-profile.children.cycles-pp.worker_thread
      3.22 ±  4%      +0.3        3.53 ±  3%  perf-profile.children.cycles-pp._raw_spin_lock_irq
     10.42            +0.4       10.79 ±  2%  perf-profile.children.cycles-pp.xlog_wait_on_iclog
      4.48 ±  3%      +0.4        4.86 ±  3%  perf-profile.children.cycles-pp.md_flush_request
      4.66 ±  3%      +0.4        5.04 ±  3%  perf-profile.children.cycles-pp.submit_bio_wait
      4.71 ±  3%      +0.4        5.10 ±  3%  perf-profile.children.cycles-pp.blkdev_issue_flush
     10.21            +0.9       11.14        perf-profile.children.cycles-pp.xfs_log_force_lsn
      0.40 ±  3%      -0.0        0.36 ±  3%  perf-profile.self.cycles-pp.load_balance


                                aim7.jobs-per-min

  [ASCII trend plot omitted: per-run aim7.jobs-per-min samples for the two
   commits, y-axis roughly 5250-5600; one series clusters around 5450-5550,
   the other around 5300-5400, matching the 5365 -> 5488 means above]

 [*] bisect-good sample
 [O] bisect-bad sample


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.9.0-rc1-00115-gec73240b1627c" of type "text/plain" (170215 bytes)

View attachment "job-script" of type "text/plain" (8038 bytes)

View attachment "job.yaml" of type "text/plain" (5405 bytes)

View attachment "reproduce" of type "text/plain" (1020 bytes)
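
For readers who want a feel for what the tested commit changes, its subject
("sched/fair: Ignore cache hotness for SMT migration") indicates that the load
balancer stops treating a recently-run task as cache-hot when a migration would
only move it between SMT siblings of one core, since hyperthreads share every
cache level and there is no warm cache to lose. The self-contained C program
below is a hedged model of that decision, not the kernel code: the names
lb_env_model and task_is_cache_hot, the flag value, and the 500 us threshold are
illustrative stand-ins, and only the flag name SD_SHARE_CPUCAPACITY is borrowed
from the kernel's scheduler-domain flags; see the commit in the tip tree linked
above for the actual diff.

    /* smt_hotness_model.c -- hedged userspace model, not kernel code. */
    #include <stdbool.h>
    #include <stdio.h>

    /* Borrowed flag name; in the kernel, domains spanning SMT siblings carry it.
     * The numeric value here is illustrative. */
    #define SD_SHARE_CPUCAPACITY 0x0001

    /* Illustrative stand-in for the load-balancing environment. */
    struct lb_env_model {
            unsigned int sd_flags;  /* flags of the sched domain being balanced */
    };

    /*
     * Model of the cache-hotness decision: a task that ran very recently is
     * normally considered "hot" and not worth migrating, but if the domain
     * only spans SMT siblings, the siblings share cache, so hotness is ignored.
     */
    static bool task_is_cache_hot(const struct lb_env_model *env,
                                  long ns_since_last_ran, long migration_cost_ns)
    {
            if (env->sd_flags & SD_SHARE_CPUCAPACITY)
                    return false;           /* SMT siblings share cache */

            /* Otherwise fall back to the usual "ran recently" heuristic. */
            return ns_since_last_ran < migration_cost_ns;
    }

    int main(void)
    {
            struct lb_env_model smt  = { .sd_flags = SD_SHARE_CPUCAPACITY };
            struct lb_env_model core = { .sd_flags = 0 };

            /* Task last ran 100 us ago; assume a 500 us migration-cost threshold. */
            printf("SMT-sibling domain : hot=%d\n", task_is_cache_hot(&smt,  100000, 500000));
            printf("cross-core domain  : hot=%d\n", task_is_cache_hot(&core, 100000, 500000));
            return 0;
    }

Built with something like "gcc -Wall -o smt_hotness_model smt_hotness_model.c",
it prints hot=0 for the SMT-sibling domain and hot=1 for the cross-core domain,
illustrating the intended effect: tasks become eligible for migration between
hyperthreads even when they would still count as cache-hot for a cross-core move.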