lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 5 Jun 2018 15:16:57 +0800
From:   kernel test robot <xiaolong.ye@...el.com>
To:     Dave Chinner <dchinner@...hat.com>
Cc:     "Darrick J. Wong" <darrick.wong@...cle.com>,
        Christoph Hellwig <hch@....de>,
        Carlos Maiolino <cmaiolino@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>, linux-xfs@...r.kernel.org,
        lkp@...org
Subject: [lkp-robot] [xfs]  b027d4c97b:  fio.latency_2ms% +7.1% regression


Greeting,

FYI, we noticed a +7.1%% regression of fio.latency_2ms% due to commit:


commit: b027d4c97b9675c2ad75dec94be4e46dceb3ec74 ("xfs: don't retry xfs_buf_find on XBF_TRYLOCK failure")
https://git.kernel.org/cgit/fs/xfs/xfs-linux.git xfs-4.18-merge

in testcase: fio-basic
on test machine: 56 threads Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz with 256G memory
with following parameters:

	disk: 2pmem
	fs: xfs
	mount_option: dax
	runtime: 200s
	nr_task: 50%
	time_based: tb
	rw: randwrite
	bs: 4k
	ioengine: libaio
	test_size: 200G
	cpufreq_governor: performance

test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio



Details are as below:
-------------------------------------------------------------------------------------------------->

=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/mount_option/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/time_based:
  4k/gcc-7/performance/2pmem/xfs/libaio/x86_64-rhel-7.2/dax/50%/debian-x86_64-2016-08-31.cgz/200s/randwrite/lkp-hsw-ep6/200G/fio-basic/tb

commit: 
  8925a3dc47 ("xfs: make xfs_buf_incore out of line")
  b027d4c97b ("xfs: don't retry xfs_buf_find on XBF_TRYLOCK failure")

8925a3dc4771004b b027d4c97b9675c2ad75dec94b 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     46.56 ±  3%      +7.1       53.61        fio.latency_2ms%
      8.19            +0.2        8.40        fio.latency_100ms%
      0.74 ±  3%      -0.1        0.68 ±  6%  fio.latency_250ms%
     25.20 ±  6%      -7.3       17.86 ±  6%  fio.latency_4ms%
      0.46 ±  9%      +0.2        0.69 ± 13%  fio.latency_750us%
     90130            -1.8%      88543        fio.time.involuntary_context_switches
    617.00            -4.4%     589.75        fio.time.percent_of_cpu_this_job_got
      1162            -4.8%       1106        fio.time.system_time
     10148            -0.8%      10066        fio.write_clat_mean_us
    322.93            -0.8%     320.30        fio.write_slat_mean_us
   7300608 ± 11%     -18.2%    5972992        meminfo.DirectMap2M
      4923 ±  9%     +12.2%       5525 ±  9%  numa-meminfo.node1.KernelStack
     25.67            +2.1%      26.22        boot-time.boot
     14.06            +2.7%      14.44        boot-time.dhcp
     14.93            +3.7%      15.49        boot-time.kernel_boot
   4490207 ±  3%     -85.2%     662708 ±  3%  numa-numastat.node0.local_node
   4498546 ±  3%     -85.1%     668260 ±  2%  numa-numastat.node0.numa_hit
   4648677 ±  2%     -86.8%     615846 ±  3%  numa-numastat.node1.local_node
   4651468 ±  2%     -86.6%     621397 ±  2%  numa-numastat.node1.numa_hit
   3353173 ±  5%     -75.1%     834355 ±  2%  numa-vmstat.node0.numa_hit
   3344442 ±  5%     -75.2%     827896 ±  3%  numa-vmstat.node0.numa_local
      4923 ±  9%     +12.2%       5524 ±  9%  numa-vmstat.node1.nr_kernel_stack
   3043769 ±  5%     -77.6%     680788 ±  3%  numa-vmstat.node1.numa_hit
   2869969 ±  5%     -82.4%     504637 ±  6%  numa-vmstat.node1.numa_local
   9174615           -85.7%    1311343        proc-vmstat.numa_hit
   9163481           -85.8%    1300236        proc-vmstat.numa_local
      1635 ±  4%      -6.6%       1527 ±  4%  proc-vmstat.pgactivate
  13537158           -57.7%    5726921        proc-vmstat.pgalloc_normal
  13517845           -58.1%    5663328        proc-vmstat.pgfree
  50269839 ± 34%     -41.0%   29656634 ± 33%  sched_debug.cfs_rq:/.load.max
  50265844 ± 34%     -41.0%   29653364 ± 33%  sched_debug.cfs_rq:/.runnable_weight.max
   1693994 ± 12%     +28.2%    2172260 ±  5%  sched_debug.cpu.avg_idle.max
    351794 ± 13%     +19.3%     419766 ±  9%  sched_debug.cpu.avg_idle.stddev
    779.56 ±  3%     +17.3%     914.08 ±  6%  sched_debug.cpu.curr->pid.stddev
      0.00 ± 12%     +27.4%       0.00 ± 14%  sched_debug.cpu.next_balance.stddev
      0.42 ±  6%     +14.3%       0.48 ±  8%  sched_debug.cpu.nr_running.stddev
    -35.17            -8.8%     -32.06        sched_debug.cpu.nr_uninterruptible.min
 2.256e+11            -1.3%  2.227e+11        perf-stat.branch-instructions
      2.49            +0.0        2.53        perf-stat.branch-miss-rate%
     11.25            -0.5       10.73 ±  2%  perf-stat.cache-miss-rate%
 2.782e+09            -5.4%  2.632e+09 ±  2%  perf-stat.cache-misses
      1.69            -2.2%       1.65        perf-stat.cpi
 2.367e+12            -3.0%  2.296e+12        perf-stat.cpu-cycles
     68767            +4.2%      71688        perf-stat.cpu-migrations
 4.851e+08            -2.8%  4.716e+08        perf-stat.iTLB-load-misses
      0.59            +2.3%       0.61        perf-stat.ipc
     56.55 ±  2%      -3.6       52.92 ±  4%  perf-stat.node-load-miss-rate%
     54.38            -1.9       52.43        perf-stat.node-store-miss-rate%
 3.276e+08 ±  2%     -15.3%  2.774e+08 ±  7%  perf-stat.node-store-misses
 2.748e+08            -8.6%  2.511e+08 ±  4%  perf-stat.node-stores
     82016            -1.6%      80720        perf-stat.path-length
     12.81 ±  3%      -1.5       11.27 ±  4%  perf-profile.calltrace.cycles-pp.xfs_bmapi_write.xfs_iomap_write_direct.xfs_file_iomap_begin.iomap_apply.dax_iomap_rw
     12.26 ±  4%      -1.5       10.74 ±  4%  perf-profile.calltrace.cycles-pp.xfs_bmapi_convert_unwritten.xfs_bmapi_write.xfs_iomap_write_direct.xfs_file_iomap_begin.iomap_apply
     10.81 ±  5%      -1.4        9.37 ±  4%  perf-profile.calltrace.cycles-pp.xfs_bmap_add_extent_unwritten_real.xfs_bmapi_convert_unwritten.xfs_bmapi_write.xfs_iomap_write_direct.xfs_file_iomap_begin
      2.73 ±  6%      -1.0        1.70 ±  5%  perf-profile.calltrace.cycles-pp.xfs_btree_rshift.xfs_btree_make_block_unfull.xfs_btree_insrec.xfs_btree_insert.xfs_bmap_add_extent_unwritten_real
      3.02 ±  6%      -1.0        1.99 ±  3%  perf-profile.calltrace.cycles-pp.xfs_btree_make_block_unfull.xfs_btree_insrec.xfs_btree_insert.xfs_bmap_add_extent_unwritten_real.xfs_bmapi_convert_unwritten
      3.62 ±  5%      -1.0        2.61 ±  3%  perf-profile.calltrace.cycles-pp.xfs_btree_insrec.xfs_btree_insert.xfs_bmap_add_extent_unwritten_real.xfs_bmapi_convert_unwritten.xfs_bmapi_write
      3.68 ±  5%      -1.0        2.68 ±  3%  perf-profile.calltrace.cycles-pp.xfs_btree_insert.xfs_bmap_add_extent_unwritten_real.xfs_bmapi_convert_unwritten.xfs_bmapi_write.xfs_iomap_write_direct
      1.64 ±  9%      -0.9        0.69 ±  2%  perf-profile.calltrace.cycles-pp.xfs_btree_increment.xfs_btree_rshift.xfs_btree_make_block_unfull.xfs_btree_insrec.xfs_btree_insert
      1.52 ±  9%      -0.9        0.58 ±  4%  perf-profile.calltrace.cycles-pp.xfs_buf_get_map.xfs_buf_read_map.xfs_btree_reada_bufl.xfs_btree_readahead_lblock.xfs_btree_increment
      1.52 ±  9%      -0.9        0.58 ±  4%  perf-profile.calltrace.cycles-pp.xfs_buf_read_map.xfs_btree_reada_bufl.xfs_btree_readahead_lblock.xfs_btree_increment.xfs_btree_rshift
      1.54 ±  9%      -0.9        0.60 ±  4%  perf-profile.calltrace.cycles-pp.xfs_btree_readahead_lblock.xfs_btree_increment.xfs_btree_rshift.xfs_btree_make_block_unfull.xfs_btree_insrec
      1.53 ±  9%      -0.9        0.60 ±  4%  perf-profile.calltrace.cycles-pp.xfs_btree_reada_bufl.xfs_btree_readahead_lblock.xfs_btree_increment.xfs_btree_rshift.xfs_btree_make_block_unfull
      0.67 ±  5%      +0.1        0.76 ±  9%  perf-profile.calltrace.cycles-pp.try_to_wake_up.xlog_grant_head_wake.xfs_log_space_wake.xfs_log_done.xfs_log_commit_cil
      0.71 ±  4%      +0.1        0.82 ±  8%  perf-profile.calltrace.cycles-pp.xlog_grant_head_wake.xfs_log_space_wake.xfs_log_done.xfs_log_commit_cil.__xfs_trans_commit
      2.45 ±  5%      +0.2        2.65 ±  2%  perf-profile.calltrace.cycles-pp.__softirqentry_text_start.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state
      3.31 ±  5%      +0.2        3.51 ±  3%  perf-profile.calltrace.cycles-pp.xfs_trans_reserve.xfs_trans_alloc.xfs_iomap_write_direct.xfs_file_iomap_begin.iomap_apply
      3.23 ±  5%      +0.2        3.44 ±  3%  perf-profile.calltrace.cycles-pp.xfs_log_reserve.xfs_trans_reserve.xfs_trans_alloc.xfs_iomap_write_direct.xfs_file_iomap_begin
      6.17 ±  2%      +0.3        6.49 ±  2%  perf-profile.calltrace.cycles-pp.apic_timer_interrupt.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
      6.14 ±  2%      +0.3        6.47 ±  2%  perf-profile.calltrace.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state.do_idle.cpu_startup_entry
      3.12 ±  8%      +0.3        3.46 ±  3%  perf-profile.calltrace.cycles-pp.xfs_buf_delwri_submit_buffers.xfsaild.kthread.ret_from_fork
      0.00            +0.6        0.58 ±  4%  perf-profile.calltrace.cycles-pp.xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_btree_reada_bufl.xfs_btree_readahead_lblock
      0.00            +2.0        1.97 ±  9%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map
      0.00            +2.2        2.16 ±  9%  perf-profile.calltrace.cycles-pp._raw_spin_lock.xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map
      0.00            +3.4        3.39 ± 11%  perf-profile.calltrace.cycles-pp.xfs_buf_find.xfs_buf_get_map.xfs_buf_read_map.xfs_trans_read_buf_map.xfs_btree_read_buf_block
      6.46 ±  8%      -6.5        0.00        perf-profile.children.cycles-pp._xfs_buf_find
     12.81 ±  3%      -1.5       11.27 ±  4%  perf-profile.children.cycles-pp.xfs_bmapi_write
      6.74 ±  8%      -1.5        5.21 ±  5%  perf-profile.children.cycles-pp.xfs_buf_get_map
     12.28 ±  4%      -1.5       10.75 ±  4%  perf-profile.children.cycles-pp.xfs_bmapi_convert_unwritten
      6.75 ±  8%      -1.5        5.23 ±  5%  perf-profile.children.cycles-pp.xfs_buf_read_map
     10.81 ±  5%      -1.4        9.37 ±  4%  perf-profile.children.cycles-pp.xfs_bmap_add_extent_unwritten_real
      2.31 ±  7%      -1.1        1.23 ±  6%  perf-profile.children.cycles-pp.xfs_btree_reada_bufl
      2.31 ±  7%      -1.1        1.24 ±  6%  perf-profile.children.cycles-pp.xfs_btree_readahead_lblock
      2.74 ±  6%      -1.0        1.71 ±  5%  perf-profile.children.cycles-pp.xfs_btree_rshift
      3.02 ±  6%      -1.0        1.99 ±  4%  perf-profile.children.cycles-pp.xfs_btree_make_block_unfull
      3.65 ±  5%      -1.0        2.63 ±  3%  perf-profile.children.cycles-pp.xfs_btree_insrec
      3.69 ±  5%      -1.0        2.69 ±  2%  perf-profile.children.cycles-pp.xfs_btree_insert
      1.87 ±  8%      -1.0        0.92 ±  3%  perf-profile.children.cycles-pp.xfs_btree_increment
     10.48            -0.9        9.60 ±  3%  perf-profile.children.cycles-pp._raw_spin_lock
      9.58 ±  2%      -0.7        8.84 ±  3%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      0.56            -0.1        0.44 ±  7%  perf-profile.children.cycles-pp.xfs_btree_decrement
      0.75 ±  8%      -0.1        0.65 ±  9%  perf-profile.children.cycles-pp.xfs_perag_get
      0.25 ±  4%      -0.0        0.20 ± 10%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.24 ±  6%      -0.0        0.20 ±  6%  perf-profile.children.cycles-pp.kmem_cache_free
      0.14 ±  3%      -0.0        0.11 ±  7%  perf-profile.children.cycles-pp.down_trylock
      0.11 ±  7%      -0.0        0.09 ±  8%  perf-profile.children.cycles-pp.memset_erms
      0.06 ±  6%      +0.0        0.08 ± 15%  perf-profile.children.cycles-pp.xfs_trans_ail_update_bulk
      0.12 ± 10%      +0.0        0.15 ±  7%  perf-profile.children.cycles-pp.idle_cpu
      0.19 ±  3%      +0.0        0.23 ±  3%  perf-profile.children.cycles-pp.__blkdev_issue_zero_pages
      0.03 ±100%      +0.0        0.07 ± 17%  perf-profile.children.cycles-pp.kernel_wait4
      0.03 ±100%      +0.0        0.07 ± 15%  perf-profile.children.cycles-pp.__do_sys_wait4
      0.29 ±  8%      +0.1        0.34 ±  9%  perf-profile.children.cycles-pp.bio_alloc_bioset
      0.25 ±  5%      +0.1        0.31 ± 12%  perf-profile.children.cycles-pp.mempool_alloc
      0.55 ±  4%      +0.1        0.60 ±  7%  perf-profile.children.cycles-pp.get_next_timer_interrupt
      0.02 ±173%      +0.1        0.08 ± 15%  perf-profile.children.cycles-pp.swake_up
      0.00            +0.1        0.07 ±  7%  perf-profile.children.cycles-pp.do_wait
      0.77 ±  8%      +0.1        0.92 ±  7%  perf-profile.children.cycles-pp.__save_stack_trace
      2.66 ±  2%      +0.1        2.81 ±  3%  perf-profile.children.cycles-pp.__softirqentry_text_start
      0.80 ±  7%      +0.1        0.95 ±  7%  perf-profile.children.cycles-pp.save_stack_trace_tsk
      0.94 ±  6%      +0.2        1.09 ±  5%  perf-profile.children.cycles-pp.__account_scheduler_latency
      0.93 ±  5%      +0.2        1.09 ±  7%  perf-profile.children.cycles-pp.xlog_grant_head_wake
      2.95            +0.2        3.15 ±  3%  perf-profile.children.cycles-pp.irq_exit
      1.15 ±  6%      +0.2        1.35 ±  5%  perf-profile.children.cycles-pp.enqueue_entity
      1.20 ±  6%      +0.2        1.41 ±  4%  perf-profile.children.cycles-pp.ttwu_do_activate
      1.18 ±  7%      +0.2        1.39 ±  4%  perf-profile.children.cycles-pp.enqueue_task_fair
      6.83            +0.2        7.05        perf-profile.children.cycles-pp.apic_timer_interrupt
      1.24 ±  5%      +0.2        1.47 ±  5%  perf-profile.children.cycles-pp.try_to_wake_up
      6.79            +0.2        7.02        perf-profile.children.cycles-pp.smp_apic_timer_interrupt
      3.12 ±  8%      +0.3        3.46 ±  3%  perf-profile.children.cycles-pp.xfs_buf_delwri_submit_buffers
      0.00            +5.1        5.07 ±  5%  perf-profile.children.cycles-pp.xfs_buf_find
      9.38            -0.7        8.71 ±  3%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      0.71 ±  9%      -0.1        0.59 ±  5%  perf-profile.self.cycles-pp.xfs_perag_get
      0.96 ±  3%      -0.1        0.85 ±  3%  perf-profile.self.cycles-pp._raw_spin_lock
      0.25 ±  4%      -0.0        0.20 ± 10%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.06 ±  6%      -0.0        0.03 ±100%  perf-profile.self.cycles-pp.tick_sched_timer
      0.09 ±  9%      -0.0        0.07 ± 16%  perf-profile.self.cycles-pp.hrtimer_interrupt
      0.11 ±  3%      -0.0        0.09 ±  8%  perf-profile.self.cycles-pp.memset_erms
      0.18 ±  2%      -0.0        0.17        perf-profile.self.cycles-pp.kmem_cache_free
      0.06 ± 11%      +0.0        0.08 ± 10%  perf-profile.self.cycles-pp.selinux_file_permission
      0.12 ±  3%      +0.0        0.15 ±  5%  perf-profile.self.cycles-pp.xfs_buf_get_map
      0.12 ± 10%      +0.0        0.15 ±  7%  perf-profile.self.cycles-pp.idle_cpu
      0.18 ± 12%      +0.0        0.22 ± 10%  perf-profile.self.cycles-pp.__next_timer_interrupt
      0.01 ±173%      +0.1        0.07 ± 17%  perf-profile.self.cycles-pp.mutex_unlock
      0.00            +0.5        0.48 ± 13%  perf-profile.self.cycles-pp.xfs_buf_find



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Xiaolong

View attachment "config-4.17.0-rc4-00005-gb027d4c" of type "text/plain" (164401 bytes)

View attachment "job-script" of type "text/plain" (7486 bytes)

View attachment "job.yaml" of type "text/plain" (5076 bytes)

View attachment "reproduce" of type "text/plain" (876 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ