Message-ID: <20211215135803.GA34913@xsang-OptiPlex-9020>
Date:   Wed, 15 Dec 2021 21:58:03 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc:     Jens Axboe <axboe@...nel.dk>,
        kernel test robot <oliver.sang@...el.com>,
        Christoph Hellwig <hch@....de>,
        Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
        zhengjun.xing@...ux.intel.com, fengwei.yin@...el.com
Subject: [loop]  e3f9387aea:  stress-ng.loop.ops_per_sec 78.9% improvement



Greetings,

FYI, we noticed a 78.9% improvement in stress-ng.loop.ops_per_sec due to the commit below (an illustrative sketch of the change follows the test parameters):


commit: e3f9387aea67742b9d1f4de8e5bb2fd08a8a4584 ("loop: Use pr_warn_once() for loop_control_remove() warning")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: stress-ng
on test machine: 96-thread, 2-socket Ice Lake with 256G memory
with the following parameters:

	nr_threads: 100%
	disk: 1HDD
	testtime: 60s
	class: device
	test: loop
	cpufreq_governor: performance
	ucode: 0xb000280
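
For reference, the measured change is a one-line switch in the loop driver's
LOOP_CTL_REMOVE handler from pr_warn() to pr_warn_once(), so an invalid removal
request no longer writes a warning line to the console on every call. The hunk
below is an illustrative reconstruction of that change in loop_control_remove()
in drivers/block/loop.c, not the verbatim commit, and the message text may
differ slightly from mainline:

--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ ... @@ static int loop_control_remove(int idx)
 	if (idx < 0) {
-		pr_warn("deleting an unspecified loop device is not supported.\n");
+		pr_warn_once("deleting an unspecified loop device is not supported.\n");
 		return -EINVAL;
 	}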






Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp directories and run again from a clean state.
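
For a concrete picture of what the "loop" stressor is hammering, the sketch
below (illustrative only, not stress-ng's own code) cycles the loop-control
ioctls that appear as loop_control_ioctl in the profile: it picks a free index,
adds the device, removes it, and also issues a removal with a negative index,
which is presumably the case that used to hit the pr_warn() on every call.
Needs root; error handling is trimmed.

/* loop-ctl-churn.c -- build with: gcc -o loop-ctl-churn loop-ctl-churn.c */
#include <fcntl.h>
#include <linux/loop.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
	int ctl = open("/dev/loop-control", O_RDWR);
	if (ctl < 0)
		return 1;

	for (int i = 0; i < 1000; i++) {
		int idx = ioctl(ctl, LOOP_CTL_GET_FREE);  /* pick a free /dev/loopN */
		if (idx >= 0) {
			ioctl(ctl, LOOP_CTL_ADD, idx);     /* create the device */
			ioctl(ctl, LOOP_CTL_REMOVE, idx);  /* tear it down again */
		}
		/* invalid index: the path that emitted a console warning per call */
		ioctl(ctl, LOOP_CTL_REMOVE, -1);
	}
	close(ctl);
	return 0;
}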

=========================================================================================
class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
  device/gcc-9/performance/1HDD/x86_64-rhel-8.3/100%/debian-10.4-x86_64-20200603.cgz/lkp-icl-2sp1/loop/stress-ng/60s/0xb000280

commit: 
  d422f40163 ("zram: only make zram_wb_devops for CONFIG_ZRAM_WRITEBACK")
  e3f9387aea ("loop: Use pr_warn_once() for loop_control_remove() warning")

d422f40163087408 e3f9387aea67742b9d1f4de8e5b 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      3361           +79.7%       6041 ±  2%  stress-ng.loop.ops
     55.12           +78.9%      98.63        stress-ng.loop.ops_per_sec
      4244           +61.3%       6847 ±  2%  stress-ng.time.involuntary_context_switches
    140.71 ±  2%     +25.1%     176.00 ± 11%  stress-ng.time.percent_of_cpu_this_job_got
     90.04 ±  2%     +24.7%     112.27 ± 12%  stress-ng.time.system_time
    152534 ±  8%     +38.8%     211722 ± 13%  stress-ng.time.voluntary_context_switches
      8000 ±  4%     +20.0%       9598 ± 13%  meminfo.Active
      7396 ±  3%     +20.8%       8933 ± 14%  meminfo.Active(anon)
      1.78 ±  4%      +0.5        2.30 ±  9%  mpstat.cpu.all.sys%
      0.36 ±  2%      +0.1        0.50 ±  6%  mpstat.cpu.all.usr%
     61315 ±  2%     +29.6%      79471 ±  4%  turbostat.C1
      0.05            +0.0        0.08 ± 11%  turbostat.C1%
      3804 ±  3%     +56.8%       5965 ±  6%  vmstat.io.bi
     17445 ±  4%     +40.1%      24441 ±  6%  vmstat.system.cs
     30900 ±  2%     +56.3%      48285 ±  6%  softirqs.BLOCK
     10068 ±  8%     -14.0%       8663 ± 11%  softirqs.CPU15.SCHED
      9690 ±  4%     -10.3%       8692 ±  8%  softirqs.CPU32.SCHED
      9637 ±  7%     -14.8%       8207 ± 11%  softirqs.CPU52.SCHED
      9340 ± 10%     -14.3%       8007 ±  6%  softirqs.CPU72.SCHED
      9936 ±  8%     -14.4%       8506 ± 11%  softirqs.CPU82.SCHED
      9724 ±  9%     -12.5%       8510 ± 10%  softirqs.CPU93.SCHED
    207351 ±  9%     +24.3%     257838 ±  8%  softirqs.RCU
      1844 ±  3%     +21.1%       2233 ± 14%  proc-vmstat.nr_active_anon
      5929 ±  4%      +9.3%       6479 ±  3%  proc-vmstat.nr_page_table_pages
      1844 ±  3%     +21.1%       2233 ± 14%  proc-vmstat.nr_zone_active_anon
    645257           +14.3%     737465 ±  3%  proc-vmstat.numa_hit
    554490           +16.5%     645922 ±  3%  proc-vmstat.numa_local
     27004           +56.1%      42164 ±  4%  proc-vmstat.pgactivate
    682350           +17.4%     800875 ±  3%  proc-vmstat.pgalloc_normal
    579709           +20.2%     696873 ±  3%  proc-vmstat.pgfree
    261468 ±  3%     +54.9%     404974 ±  7%  proc-vmstat.pgpgin
     30452 ±  4%     +48.5%      45231 ± 11%  proc-vmstat.unevictable_pgs_culled
 1.351e+09 ±  8%     +22.3%  1.651e+09 ±  4%  perf-stat.i.branch-instructions
     21750 ±  8%     +30.1%      28299 ±  8%  perf-stat.i.context-switches
    903.42 ± 10%     +37.4%       1241 ±  9%  perf-stat.i.cpu-migrations
 1.463e+09 ±  8%     +22.8%  1.796e+09 ±  4%  perf-stat.i.dTLB-loads
 6.545e+08 ±  7%     +22.0%  7.983e+08 ±  4%  perf-stat.i.dTLB-stores
  6.28e+09 ±  8%     +20.9%  7.595e+09 ±  4%  perf-stat.i.instructions
      0.38 ±  6%     +14.1%       0.43 ±  8%  perf-stat.i.ipc
     24.95 ±  7%     -15.6%      21.06 ±  9%  perf-stat.i.major-faults
     32.26 ±  7%     +29.9%      41.91 ±  3%  perf-stat.i.metric.M/sec
    986939 ±  7%     +33.0%    1312402 ±  6%  perf-stat.i.node-load-misses
    320022 ±  7%     +34.4%     430212 ±  4%  perf-stat.i.node-loads
   1097382 ±  7%     +39.2%    1527172 ± 11%  perf-stat.i.node-store-misses
 1.068e+09 ±  2%     +31.8%  1.408e+09 ±  4%  perf-stat.ps.branch-instructions
   6396422 ±  8%     +29.6%    8290131 ±  5%  perf-stat.ps.cache-misses
     17520 ±  4%     +41.0%      24709 ±  7%  perf-stat.ps.context-switches
 1.412e+10 ±  7%     +13.9%  1.609e+10 ±  6%  perf-stat.ps.cpu-cycles
    732.74 ±  4%     +48.3%       1086 ±  7%  perf-stat.ps.cpu-migrations
 1.162e+09 ±  2%     +32.4%  1.539e+09 ±  3%  perf-stat.ps.dTLB-loads
 5.203e+08           +31.1%  6.821e+08 ±  3%  perf-stat.ps.dTLB-stores
 4.946e+09 ±  2%     +30.6%  6.458e+09 ±  3%  perf-stat.ps.instructions
    838233 ±  7%     +40.1%    1174167 ±  5%  perf-stat.ps.node-load-misses
    258560 ±  2%     +44.7%     374131        perf-stat.ps.node-loads
    874444 ±  9%     +49.8%    1310136 ±  7%  perf-stat.ps.node-store-misses
    987545 ±  9%     +25.4%    1238042 ±  8%  perf-stat.ps.node-stores
 3.284e+11           +29.0%  4.238e+11 ±  4%  perf-stat.total.instructions
     36.05 ± 11%      -9.1       26.95 ± 18%  perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
     35.61 ± 10%      -8.9       26.66 ± 18%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
     35.61 ± 10%      -8.9       26.66 ± 18%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
     35.55 ± 10%      -8.9       26.62 ± 18%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
     31.96 ± 10%      -8.4       23.53 ± 18%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary
     32.49 ±  9%      -8.3       24.19 ± 18%  perf-profile.calltrace.cycles-pp.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      8.21 ± 77%      -8.2        0.00        perf-profile.calltrace.cycles-pp._printk.loop_control_ioctl.cold.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe
      8.21 ± 77%      -8.2        0.00        perf-profile.calltrace.cycles-pp.vprintk_emit._printk.loop_control_ioctl.cold.__x64_sys_ioctl.do_syscall_64
      8.21 ± 77%      -8.2        0.00        perf-profile.calltrace.cycles-pp.loop_control_ioctl.cold.__x64_sys_ioctl.do_syscall_64.entry_SYSCALL_64_after_hwframe.ioctl
      7.66 ± 81%      -7.7        0.00        perf-profile.calltrace.cycles-pp.console_unlock.vprintk_emit._printk.loop_control_ioctl.cold.__x64_sys_ioctl
      7.28 ± 81%      -7.3        0.00        perf-profile.calltrace.cycles-pp.serial8250_console_write.console_unlock.vprintk_emit._printk.loop_control_ioctl.cold
     19.03 ±  6%      -4.7       14.35 ± 17%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
      0.71 ±  8%      -0.4        0.32 ± 86%  perf-profile.calltrace.cycles-pp.rebalance_domains.__softirqentry_text_start.irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
      1.12 ± 22%      -0.4        0.76 ± 25%  perf-profile.calltrace.cycles-pp.scheduler_tick.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues
      1.23 ± 17%      +3.2        4.45 ±107%  perf-profile.calltrace.cycles-pp.kernfs_remove.__kobject_del.kobject_del.blk_mq_unregister_hctx.blk_mq_unregister_dev
      1.31 ± 17%      +3.2        4.54 ±105%  perf-profile.calltrace.cycles-pp.__kobject_del.kobject_del.blk_mq_unregister_hctx.blk_mq_unregister_dev.blk_unregister_queue
      1.32 ± 17%      +3.2        4.55 ±104%  perf-profile.calltrace.cycles-pp.kobject_del.blk_mq_unregister_hctx.blk_mq_unregister_dev.blk_unregister_queue.del_gendisk
     36.05 ± 11%      -9.1       26.95 ± 18%  perf-profile.children.cycles-pp.secondary_startup_64_no_verify
     36.05 ± 11%      -9.1       26.95 ± 18%  perf-profile.children.cycles-pp.cpu_startup_entry
     36.04 ± 11%      -9.1       26.95 ± 18%  perf-profile.children.cycles-pp.do_idle
     35.61 ± 10%      -8.9       26.66 ± 18%  perf-profile.children.cycles-pp.start_secondary
     32.91 ± 10%      -8.4       24.47 ± 18%  perf-profile.children.cycles-pp.cpuidle_enter
     32.87 ± 10%      -8.4       24.44 ± 18%  perf-profile.children.cycles-pp.cpuidle_enter_state
      8.21 ± 77%      -8.2        0.00        perf-profile.children.cycles-pp.loop_control_ioctl.cold
     19.26 ±  6%      -4.7       14.52 ± 17%  perf-profile.children.cycles-pp.intel_idle
      1.18 ± 22%      -0.4        0.82 ± 23%  perf-profile.children.cycles-pp.scheduler_tick
      0.73 ±  8%      -0.2        0.50 ± 20%  perf-profile.children.cycles-pp.rebalance_domains
      0.60 ± 12%      -0.2        0.44 ± 15%  perf-profile.children.cycles-pp.native_irq_return_iret
      0.43 ±  8%      -0.1        0.31 ± 15%  perf-profile.children.cycles-pp.lapic_next_deadline
      0.39 ±  7%      -0.1        0.29 ± 21%  perf-profile.children.cycles-pp.read_tsc
      0.26 ± 11%      -0.1        0.18 ± 22%  perf-profile.children.cycles-pp.arch_scale_freq_tick
      0.29 ±  7%      -0.1        0.22 ± 17%  perf-profile.children.cycles-pp.native_apic_msr_eoi_write
      0.10 ± 11%      -0.0        0.06 ± 44%  perf-profile.children.cycles-pp.call_cpuidle
      0.09 ± 11%      -0.0        0.06 ± 43%  perf-profile.children.cycles-pp.schedule_timeout
      0.22 ± 24%      +0.4        0.64 ± 84%  perf-profile.children.cycles-pp.device_add
      0.67 ± 22%      +0.7        1.38 ± 60%  perf-profile.children.cycles-pp.kernfs_add_one
      1.81 ± 17%      +1.0        2.81 ± 27%  perf-profile.children.cycles-pp.device_add_disk
      1.32 ± 17%      +3.2        4.55 ±104%  perf-profile.children.cycles-pp.blk_mq_unregister_hctx
      1.38 ± 16%      +3.2        4.62 ±102%  perf-profile.children.cycles-pp.blk_mq_unregister_dev
      1.50 ± 18%      +3.3        4.81 ±100%  perf-profile.children.cycles-pp.kernfs_remove
      1.57 ± 17%      +3.3        4.90 ± 98%  perf-profile.children.cycles-pp.blk_unregister_queue
      1.61 ± 17%      +3.4        5.05 ± 98%  perf-profile.children.cycles-pp.kobject_del
      2.38 ± 20%      +6.7        9.05 ±115%  perf-profile.children.cycles-pp.rwsem_down_write_slowpath
      2.38 ± 17%      +7.1        9.50 ±129%  perf-profile.children.cycles-pp.del_gendisk
     19.24 ±  6%      -4.7       14.51 ± 17%  perf-profile.self.cycles-pp.intel_idle
      1.69 ±  9%      -0.5        1.22 ± 17%  perf-profile.self.cycles-pp.cpuidle_enter_state
      0.60 ± 12%      -0.2        0.44 ± 15%  perf-profile.self.cycles-pp.native_irq_return_iret
      0.43 ±  8%      -0.1        0.31 ± 15%  perf-profile.self.cycles-pp.lapic_next_deadline
      0.38 ±  7%      -0.1        0.28 ± 21%  perf-profile.self.cycles-pp.read_tsc
      0.26 ± 12%      -0.1        0.17 ± 21%  perf-profile.self.cycles-pp.arch_scale_freq_tick
      0.29 ±  6%      -0.1        0.22 ± 19%  perf-profile.self.cycles-pp.native_apic_msr_eoi_write
      0.09 ±  9%      -0.0        0.06 ± 44%  perf-profile.self.cycles-pp.call_cpuidle
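
For reference, the headline figure in the subject comes straight from the
stress-ng.loop.ops_per_sec row above:

    (98.63 - 55.12) / 55.12 = 43.51 / 55.12 ≈ 0.789  ->  +78.9%

and the profile data is consistent with the commit removing a console
bottleneck: the _printk -> vprintk_emit -> console_unlock ->
serial8250_console_write chain under loop_control_ioctl.cold, about 8% of
cycles on the parent commit, drops to 0.00 with pr_warn_once().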




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang


View attachment "config-5.16.0-rc1-00023-ge3f9387aea67" of type "text/plain" (173517 bytes)

View attachment "job-script" of type "text/plain" (8304 bytes)

View attachment "job.yaml" of type "text/plain" (5671 bytes)

View attachment "reproduce" of type "text/plain" (337 bytes)
