Date:   Thu, 14 Apr 2022 22:27:20 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Christoph Hellwig <hch@....de>
Cc:     Christoph Hellwig <hch@....de>, lkp@...ts.01.org, lkp@...el.com,
        ying.huang@...el.com, feng.tang@...el.com,
        zhengjun.xing@...ux.intel.com, fengwei.yin@...el.com,
        LKML <linux-kernel@...r.kernel.org>
Subject: [block]  70bed0d544:  fsmark.files_per_sec 92.0% improvement



Greetings,

FYI, we noticed a 92.0% improvement of fsmark.files_per_sec due to commit:


commit: 70bed0d5447e08702c7595d26c88ca37e8eb88b4 ("block: add a bdev_write_cache helper")
git://git.infradead.org/users/hch/block.git block-api
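
The helper named in the commit subject is a small accessor that reports whether a block device's queue advertises a volatile write cache. As a userspace sketch (not the kernel helper itself), the same state is visible through sysfs as `/sys/block/<dev>/queue/write_cache`, which reads either "write back" or "write through"; the path, function name, and parsing below are illustrative:

```c
#include <stdio.h>
#include <string.h>

/*
 * Userspace sketch: read a sysfs queue attribute such as
 * /sys/block/sda/queue/write_cache.  Returns 1 if the device
 * advertises a volatile write cache ("write back"), 0 if it is
 * write-through, and -1 if the attribute cannot be read.
 */
static int has_write_cache(const char *sysfs_attr_path)
{
    char buf[32] = {0};
    FILE *f = fopen(sysfs_attr_path, "r");

    if (!f)
        return -1;              /* attribute missing or unreadable */
    if (!fgets(buf, sizeof(buf), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    /* "write back" => drive has a volatile write cache to flush */
    return strncmp(buf, "write back", 10) == 0;
}
```

A caller would pass e.g. `"/sys/block/sda/queue/write_cache"`; whether the cache is present determines whether flush/FUA requests are needed on fsync, which is why this workload is sensitive to the change.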

in testcase: fsmark
on test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz with 128G memory
with following parameters:

	iterations: 1x
	nr_threads: 1t
	disk: 1HDD
	fs: btrfs
	fs2: nfsv4
	filesize: 4K
	test_size: 40M
	sync_method: fsyncBeforeClose
	nr_files_per_directory: 1fpd
	cpufreq_governor: performance
	ucode: 0xd000331

test-description: fsmark is a file system benchmark for testing synchronous write workloads, such as those of mail servers.
test-url: https://sourceforge.net/projects/fsmark/





Details are as follows:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp directories to run from a clean state.

=========================================================================================
compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconfig/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase/ucode:
  gcc-11/performance/1HDD/4K/nfsv4/btrfs/1x/x86_64-rhel-8.3/1fpd/1t/debian-10.4-x86_64-20200603.cgz/fsyncBeforeClose/lkp-icl-2sp6/40M/fsmark/0xd000331

commit: 
  6cccbfebc0 ("block: add a bdev_nonrot helper")
  70bed0d544 ("block: add a bdev_write_cache helper")

6cccbfebc02395ae 70bed0d5447e08702c7595d26c8 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     19.10           +92.0%      36.67        fsmark.files_per_sec
    536.13           -47.9%     279.40        fsmark.time.elapsed_time
    536.13           -47.9%     279.40        fsmark.time.elapsed_time.max
     53273            +2.7%      54708        fsmark.time.voluntary_context_switches
      1.49            -2.1%       1.46        iostat.cpu.iowait
    908369 ± 17%     -39.4%     550808 ± 28%  numa-numastat.node1.numa_hit
 6.694e+10           -48.0%  3.482e+10        cpuidle..time
 1.385e+08           -47.7%   72505602        cpuidle..usage
      0.03            +0.0        0.04 ±  3%  mpstat.cpu.all.sys%
      0.01 ±  3%      +0.0        0.01 ±  4%  mpstat.cpu.all.usr%
    577.53           -44.4%     321.22        uptime.boot
     70827           -44.7%      39155        uptime.idle
      2334          +102.7%       4732        vmstat.io.bo
      3380           +45.5%       4919        vmstat.system.cs
  1.38e+08           -47.7%   72098570        turbostat.IRQ
     22732 ± 12%     -38.8%      13910 ±  6%  turbostat.POLL
     51.67            -3.9%      49.67 ±  2%  turbostat.PkgTmp
    134519           +15.4%     155275        meminfo.Active
     10873 ±  3%     -32.7%       7312        meminfo.Active(anon)
    123645           +19.7%     147962        meminfo.Active(file)
     29545           -12.3%      25909        meminfo.Shmem
    256478           -36.6%     162537 ± 39%  numa-meminfo.node0.AnonHugePages
      7918 ± 30%     -55.5%       3522 ±  9%  numa-meminfo.node1.Active(anon)
     20189 ± 46%    +484.8%     118058 ± 66%  numa-meminfo.node1.AnonPages
     55896 ± 34%    +176.2%     154400 ± 47%  numa-meminfo.node1.AnonPages.max
     25261 ± 31%    +383.3%     122094 ± 64%  numa-meminfo.node1.Inactive(anon)
      1467 ± 16%     +26.8%       1860 ± 11%  numa-meminfo.node1.PageTables
     12916 ± 22%     -45.2%       7081 ± 55%  numa-meminfo.node1.Shmem
      1978 ± 30%     -55.5%     880.00 ±  9%  numa-vmstat.node1.nr_active_anon
      5049 ± 46%    +484.5%      29514 ± 66%  numa-vmstat.node1.nr_anon_pages
      6319 ± 31%    +383.1%      30528 ± 64%  numa-vmstat.node1.nr_inactive_anon
    366.00 ± 17%     +26.8%     464.17 ± 10%  numa-vmstat.node1.nr_page_table_pages
      3231 ± 22%     -45.1%       1773 ± 55%  numa-vmstat.node1.nr_shmem
      1978 ± 30%     -55.5%     880.00 ±  9%  numa-vmstat.node1.nr_zone_active_anon
      6319 ± 31%    +383.1%      30528 ± 64%  numa-vmstat.node1.nr_zone_inactive_anon
    907485 ± 17%     -39.2%     551338 ± 28%  numa-vmstat.node1.numa_hit
      3311           +42.4%       4714        perf-stat.i.context-switches
    133.20            +1.8%     135.58        perf-stat.i.cpu-migrations
 2.952e+08            +4.3%  3.078e+08        perf-stat.i.dTLB-loads
 1.587e+08            +4.3%  1.655e+08        perf-stat.i.dTLB-stores
      2945            +4.7%       3084        perf-stat.i.minor-faults
     94.72            -1.8       92.97        perf-stat.i.node-load-miss-rate%
      6976 ± 19%     +65.2%      11527 ± 14%  perf-stat.i.node-loads
     56884 ± 12%     +51.6%      86264 ±  6%  perf-stat.i.node-stores
      2946            +4.7%       3085        perf-stat.i.page-faults
     92.90            -2.4       90.53        perf-stat.overall.node-load-miss-rate%
      3305           +42.1%       4697        perf-stat.ps.context-switches
 2.946e+08            +4.1%  3.067e+08        perf-stat.ps.dTLB-loads
 1.584e+08            +4.1%  1.649e+08        perf-stat.ps.dTLB-stores
      2939            +4.5%       3072        perf-stat.ps.minor-faults
      6962 ± 19%     +64.9%      11483 ± 14%  perf-stat.ps.node-loads
     56769 ± 12%     +51.4%      85938 ±  6%  perf-stat.ps.node-stores
      2940            +4.5%       3073        perf-stat.ps.page-faults
   5.8e+11 ±  3%     -46.4%  3.106e+11 ±  4%  perf-stat.total.instructions
      2718 ±  3%     -32.8%       1826        proc-vmstat.nr_active_anon
     30918           +19.5%      36954        proc-vmstat.nr_active_file
     82517            +2.3%      84385        proc-vmstat.nr_anon_pages
    170379            +5.1%     179015        proc-vmstat.nr_dirtied
    160.83           +32.7%     213.50        proc-vmstat.nr_dirty
     87111            +2.3%      89076        proc-vmstat.nr_inactive_anon
      9165            +1.9%       9340        proc-vmstat.nr_mapped
      1104            +7.4%       1186        proc-vmstat.nr_page_table_pages
      7386           -12.3%       6475        proc-vmstat.nr_shmem
    170150            +5.0%     178704        proc-vmstat.nr_written
      2718 ±  3%     -32.8%       1826        proc-vmstat.nr_zone_active_anon
     30918           +19.5%      36954        proc-vmstat.nr_zone_active_file
     87111            +2.3%      89076        proc-vmstat.nr_zone_inactive_anon
    161.33           +33.5%     215.33        proc-vmstat.nr_zone_write_pending
   1722532           -29.5%    1214402        proc-vmstat.numa_hit
   1606723           -31.6%    1098636        proc-vmstat.numa_local
   1722459           -29.5%    1214419        proc-vmstat.pgalloc_normal
   1723177           -42.0%     999350        proc-vmstat.pgfault
   1598401           -32.6%    1077857        proc-vmstat.pgfree
   1260337            +6.1%    1337822        proc-vmstat.pgpgout
    145698           -44.0%      81595        proc-vmstat.pgreuse
     34.69 ± 24%     +42.8%      49.55 ± 16%  sched_debug.cfs_rq:/.load_avg.avg
     49.89 ±  7%     +55.5%      77.57 ±  4%  sched_debug.cfs_rq:/.runnable_avg.avg
    633.30 ±  2%     +15.3%     730.40 ±  7%  sched_debug.cfs_rq:/.runnable_avg.max
    116.72 ±  8%     +25.6%     146.60 ±  6%  sched_debug.cfs_rq:/.runnable_avg.stddev
     49.75 ±  7%     +55.3%      77.28 ±  4%  sched_debug.cfs_rq:/.util_avg.avg
    632.78 ±  2%     +15.1%     728.53 ±  7%  sched_debug.cfs_rq:/.util_avg.max
    116.60 ±  8%     +25.5%     146.31 ±  6%  sched_debug.cfs_rq:/.util_avg.stddev
      4.18 ± 22%     +53.9%       6.44 ± 15%  sched_debug.cfs_rq:/.util_est_enqueued.avg
    178.60 ± 10%     +39.6%     249.40 ± 10%  sched_debug.cfs_rq:/.util_est_enqueued.max
     22.91 ± 16%     +41.2%      32.34 ±  8%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
    116236 ±  8%     +24.6%     144804 ±  7%  sched_debug.cpu.avg_idle.stddev
    259878 ±  5%     -38.2%     160679        sched_debug.cpu.clock.avg
    259881 ±  5%     -38.2%     160683        sched_debug.cpu.clock.max
    259874 ±  5%     -38.2%     160675        sched_debug.cpu.clock.min
      1.97 ±  7%     +14.2%       2.26 ±  9%  sched_debug.cpu.clock.stddev
    255028 ±  4%     -38.2%     157678        sched_debug.cpu.clock_task.avg
    255665 ±  5%     -38.2%     158126        sched_debug.cpu.clock_task.max
    249556 ±  5%     -39.2%     151775        sched_debug.cpu.clock_task.min
     11619 ±  3%     -22.5%       9002        sched_debug.cpu.curr->pid.max
      1173 ±  4%      -9.7%       1059 ±  3%  sched_debug.cpu.curr->pid.stddev
      0.03 ±  7%     +26.7%       0.03 ±  5%  sched_debug.cpu.nr_running.avg
      0.15 ±  2%     +10.2%       0.16 ±  2%  sched_debug.cpu.nr_running.stddev
      8223 ±  4%     -15.8%       6924        sched_debug.cpu.nr_switches.avg
      1411 ±  9%     -22.7%       1090 ± 13%  sched_debug.cpu.nr_switches.min
    259875 ±  5%     -38.2%     160676        sched_debug.cpu_clk
    259153 ±  5%     -38.3%     159957        sched_debug.ktime
    261040 ±  5%     -38.2%     161334        sched_debug.sched_clk
     53.97 ±  6%      -6.4       47.54 ±  2%  perf-profile.calltrace.cycles-pp.mwait_idle_with_hints.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
     54.36 ±  6%      -6.4       47.99 ±  2%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
     84.36            -2.3       82.02        perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
     92.94            -2.0       90.93        perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.secondary_startup_64_no_verify
     85.53            -2.0       83.56        perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.secondary_startup_64_no_verify
      0.92 ± 11%      +0.1        1.07 ±  4%  perf-profile.calltrace.cycles-pp.rcu_idle_exit.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      0.29 ±100%      +0.4        0.74 ± 10%  perf-profile.calltrace.cycles-pp.rcu_core.__softirqentry_text_start.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
      0.29 ±101%      +0.5        0.81 ± 11%  perf-profile.calltrace.cycles-pp.worker_thread.kthread.ret_from_fork
      0.10 ±223%      +0.5        0.64 ± 10%  perf-profile.calltrace.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork
      3.04 ±  8%      +0.5        3.58 ± 10%  perf-profile.calltrace.cycles-pp.__softirqentry_text_start.__irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
      1.18 ±  7%      +0.9        2.05 ± 18%  perf-profile.calltrace.cycles-pp.ret_from_fork
      1.18 ±  7%      +0.9        2.05 ± 18%  perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
     28.17 ±  7%      +3.8       31.99 ±  2%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
     54.25 ±  6%      -6.4       47.90 ±  2%  perf-profile.children.cycles-pp.mwait_idle_with_hints
     54.65 ±  6%      -6.3       48.32 ±  2%  perf-profile.children.cycles-pp.intel_idle
     86.06            -2.0       84.03        perf-profile.children.cycles-pp.cpuidle_enter_state
     86.29            -2.0       84.28        perf-profile.children.cycles-pp.cpuidle_enter
     93.82            -2.0       91.85        perf-profile.children.cycles-pp.cpuidle_idle_call
      0.07 ± 21%      +0.0        0.11 ± 12%  perf-profile.children.cycles-pp.can_stop_idle_tick
      0.05 ± 50%      +0.0        0.09 ± 26%  perf-profile.children.cycles-pp.mmap_region
      0.04 ± 47%      +0.0        0.09 ± 22%  perf-profile.children.cycles-pp.call_transmit
      0.04 ± 47%      +0.0        0.09 ± 22%  perf-profile.children.cycles-pp.xprt_transmit
      0.06 ± 11%      +0.0        0.11 ± 24%  perf-profile.children.cycles-pp.process_backlog
      0.06 ± 17%      +0.0        0.11 ± 20%  perf-profile.children.cycles-pp.__local_bh_enable_ip
      0.04 ± 72%      +0.0        0.09 ± 29%  perf-profile.children.cycles-pp.handle_irq_event
      0.04 ± 72%      +0.0        0.09 ± 29%  perf-profile.children.cycles-pp.__handle_irq_event_percpu
      0.05 ± 45%      +0.1        0.10 ± 20%  perf-profile.children.cycles-pp.ip6_protocol_deliver_rcu
      0.05 ± 45%      +0.1        0.10 ± 20%  perf-profile.children.cycles-pp.tcp_v6_rcv
      0.04 ± 74%      +0.1        0.10 ± 27%  perf-profile.children.cycles-pp.rpc_async_schedule
      0.07 ± 23%      +0.1        0.12 ± 21%  perf-profile.children.cycles-pp.ip6_finish_output2
      0.04 ± 72%      +0.1        0.09 ± 30%  perf-profile.children.cycles-pp.__common_interrupt
      0.05 ± 45%      +0.1        0.10 ± 20%  perf-profile.children.cycles-pp.ip6_input_finish
      0.05 ± 46%      +0.1        0.10 ± 19%  perf-profile.children.cycles-pp.__netif_receive_skb_one_core
      0.06 ± 13%      +0.1        0.11 ± 21%  perf-profile.children.cycles-pp.__napi_poll
      0.31 ± 10%      +0.1        0.37 ±  4%  perf-profile.children.cycles-pp.error_entry
      0.07 ± 23%      +0.1        0.13 ± 18%  perf-profile.children.cycles-pp.ip6_xmit
      0.04 ± 73%      +0.1        0.09 ± 23%  perf-profile.children.cycles-pp.xs_tcp_send_request
      0.06 ± 13%      +0.1        0.12 ± 18%  perf-profile.children.cycles-pp.net_rx_action
      0.04 ± 73%      +0.1        0.09 ± 22%  perf-profile.children.cycles-pp.xprt_request_transmit
      0.04 ± 71%      +0.1        0.09 ± 23%  perf-profile.children.cycles-pp.tcp_v6_do_rcv
      0.04 ± 71%      +0.1        0.09 ± 23%  perf-profile.children.cycles-pp.tcp_rcv_established
      0.02 ±145%      +0.1        0.08 ± 26%  perf-profile.children.cycles-pp.inode_permission
      0.07 ± 23%      +0.1        0.13 ± 17%  perf-profile.children.cycles-pp.inet6_csk_xmit
      0.08 ± 17%      +0.1        0.14 ± 14%  perf-profile.children.cycles-pp.__tcp_transmit_skb
      0.05 ± 48%      +0.1        0.11 ± 20%  perf-profile.children.cycles-pp.rpc_run_task
      0.04 ± 71%      +0.1        0.10 ± 23%  perf-profile.children.cycles-pp.queue_work_on
      0.05 ± 46%      +0.1        0.11 ± 20%  perf-profile.children.cycles-pp.rpc_execute
      0.08 ± 23%      +0.1        0.15 ± 23%  perf-profile.children.cycles-pp.svc_recv
      0.08 ± 25%      +0.1        0.15 ± 38%  perf-profile.children.cycles-pp.do_softirq
      0.07 ±  9%      +0.1        0.14 ± 16%  perf-profile.children.cycles-pp.__tcp_push_pending_frames
      0.07 ± 11%      +0.1        0.14 ± 16%  perf-profile.children.cycles-pp.tcp_write_xmit
      0.10 ± 23%      +0.1        0.18 ± 12%  perf-profile.children.cycles-pp.__rpc_execute
      0.08 ± 14%      +0.1        0.15 ± 14%  perf-profile.children.cycles-pp.__queue_work
      0.07 ± 10%      +0.1        0.15 ± 15%  perf-profile.children.cycles-pp.tcp_sock_set_cork
      0.15 ± 16%      +0.1        0.24 ± 14%  perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
      0.13 ± 27%      +0.1        0.23 ± 24%  perf-profile.children.cycles-pp.open
      0.22 ± 12%      +0.1        0.32 ± 11%  perf-profile.children.cycles-pp.try_to_wake_up
      0.18 ± 18%      +0.1        0.30 ± 19%  perf-profile.children.cycles-pp.perf_trace_sched_switch
      0.03 ±100%      +0.1        0.16 ± 45%  perf-profile.children.cycles-pp.btree_csum_one_bio
      0.03 ±100%      +0.1        0.16 ± 45%  perf-profile.children.cycles-pp.csum_one_extent_buffer
      0.29 ± 17%      +0.2        0.44 ± 10%  perf-profile.children.cycles-pp.unwind_next_frame
      0.32 ± 27%      +0.2        0.48 ± 14%  perf-profile.children.cycles-pp.io_serial_in
      0.40 ± 17%      +0.2        0.59 ± 10%  perf-profile.children.cycles-pp.get_perf_callchain
      0.40 ± 17%      +0.2        0.59 ± 10%  perf-profile.children.cycles-pp.perf_callchain
      0.34 ± 16%      +0.2        0.53 ± 10%  perf-profile.children.cycles-pp.perf_callchain_kernel
      0.45 ± 18%      +0.2        0.64 ± 10%  perf-profile.children.cycles-pp.process_one_work
      0.43 ± 16%      +0.2        0.62 ± 10%  perf-profile.children.cycles-pp.perf_prepare_sample
      0.36 ± 19%      +0.2        0.58 ± 12%  perf-profile.children.cycles-pp.note_gp_changes
      0.48 ± 16%      +0.2        0.71 ± 11%  perf-profile.children.cycles-pp.perf_event_output_forward
      0.55 ± 12%      +0.2        0.79 ±  8%  perf-profile.children.cycles-pp.rcu_core
      0.48 ± 15%      +0.2        0.72 ± 11%  perf-profile.children.cycles-pp.__perf_event_overflow
      0.50 ± 15%      +0.2        0.75 ± 11%  perf-profile.children.cycles-pp.perf_tp_event
      0.52 ± 14%      +0.3        0.81 ± 11%  perf-profile.children.cycles-pp.worker_thread
      0.99 ± 13%      +0.3        1.28 ±  5%  perf-profile.children.cycles-pp.irqtime_account_irq
      1.54 ± 12%      +0.4        1.90 ±  4%  perf-profile.children.cycles-pp.sched_clock_cpu
      3.20 ±  8%      +0.7        3.87 ±  9%  perf-profile.children.cycles-pp.__softirqentry_text_start
      3.88 ±  9%      +0.8        4.64 ± 10%  perf-profile.children.cycles-pp.__irq_exit_rcu
      1.18 ±  7%      +0.9        2.05 ± 18%  perf-profile.children.cycles-pp.kthread
      1.19 ±  7%      +0.9        2.07 ± 18%  perf-profile.children.cycles-pp.ret_from_fork
     25.11 ±  8%      +3.4       28.51 ±  2%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
     54.23 ±  6%      -6.4       47.83 ±  2%  perf-profile.self.cycles-pp.mwait_idle_with_hints
      0.22 ± 11%      +0.1        0.30 ± 17%  perf-profile.self.cycles-pp.sched_clock_cpu
      0.32 ± 27%      +0.2        0.48 ± 14%  perf-profile.self.cycles-pp.io_serial_in
      1.19 ± 12%      +0.2        1.44 ±  4%  perf-profile.self.cycles-pp.native_sched_clock




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://01.org/lkp



View attachment "config-5.18.0-rc1-00362-g70bed0d5447e" of type "text/plain" (163540 bytes)

View attachment "job-script" of type "text/plain" (8500 bytes)

View attachment "job.yaml" of type "text/plain" (5927 bytes)

View attachment "reproduce" of type "text/plain" (875 bytes)
