lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20210628073153.GA16948@xsang-OptiPlex-9020>
Date:   Mon, 28 Jun 2021 15:31:53 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Chunguang Xu <brookxu@...cent.com>
Cc:     Steven Rostedt <rostedt@...dmis.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
        zhengjun.xing@...ux.intel.com
Subject: [trace]  3d3d9c072e:  stress-ng.msg.ops_per_sec 18.5% improvement



Greeting,

FYI, we noticed a 18.5% improvement of stress-ng.msg.ops_per_sec due to commit:


commit: 3d3d9c072e377a0c5fa0a1f661a95134e787db3a ("trace: replace WB_REASON_FOREIGN_FLUSH with a string")
https://git.kernel.org/cgit/linux/kernel/git/rostedt/linux-trace.git for-next


in testcase: stress-ng
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 192G memory
with following parameters:

	nr_threads: 10%
	disk: 1HDD
	testtime: 60s
	fs: ext4
	class: os
	test: msg
	cpufreq_governor: performance
	ucode: 0x5003006






Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install                job.yaml  # job file is attached in this email
        bin/lkp split-job --compatible job.yaml  # generate the yaml file for lkp run
        bin/lkp run                    generated-yaml-file

=========================================================================================
class/compiler/cpufreq_governor/disk/fs/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime/ucode:
  os/gcc-9/performance/1HDD/ext4/x86_64-rhel-8.3/10%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp5/msg/stress-ng/60s/0x5003006

commit: 
  957cdcd9bd ("ring-buffer: Use fallthrough pseudo-keyword")
  3d3d9c072e ("trace: replace WB_REASON_FOREIGN_FLUSH with a string")

957cdcd9bd7e035d 3d3d9c072e377a0c5fa0a1f661a 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 1.997e+08           +18.5%  2.367e+08        stress-ng.msg.ops
   3327615           +18.5%    3943995        stress-ng.msg.ops_per_sec
      1246 ± 10%     +62.2%       2022 ±  5%  stress-ng.time.involuntary_context_switches
      1565            +5.7%       1655        stress-ng.time.percent_of_cpu_this_job_got
    955.26            +5.8%       1010        stress-ng.time.system_time
  40528886 ±  9%     -77.6%    9060408 ± 40%  stress-ng.time.voluntary_context_switches
  20278090 ±  4%     -29.3%   14340237        turbostat.IRQ
  37628340 ±  9%     -76.4%    8880621 ± 51%  cpuidle.POLL.time
  30309198 ±  9%     -77.7%    6751779 ± 55%  cpuidle.POLL.usage
     83.75            -1.3%      82.62        iostat.cpu.idle
     15.92            +7.0%      17.03        iostat.cpu.system
   1248983 ±  9%     -77.5%     280750 ± 40%  vmstat.system.cs
    312514 ±  4%     -29.2%     221116        vmstat.system.in
     16252 ± 19%     -30.5%      11301 ± 16%  softirqs.CPU34.SCHED
     11413 ± 16%     -25.4%       8518 ± 28%  softirqs.CPU38.SCHED
     14279 ± 30%     -43.8%       8025 ± 33%  softirqs.CPU39.SCHED
    119136 ± 35%    +316.6%     496375 ± 35%  numa-numastat.node0.local_node
    181940 ±  6%    +202.3%     549926 ± 33%  numa-numastat.node0.numa_hit
    141089 ± 30%    +107.4%     292570 ± 35%  numa-numastat.node1.local_node
    164880 ±  7%     +97.5%     325601 ± 34%  numa-numastat.node1.numa_hit
    381961          +137.1%     905716 ± 30%  proc-vmstat.numa_hit
    295354          +177.3%     819154 ± 33%  proc-vmstat.numa_local
    393066          +137.5%     933616 ± 30%  proc-vmstat.pgalloc_normal
    263755          +204.3%     802655 ± 35%  proc-vmstat.pgfree
     11952 ±  4%     +36.4%      16297 ±  6%  slabinfo.kmalloc-512.active_objs
    376.33 ±  4%     +36.4%     513.50 ±  6%  slabinfo.kmalloc-512.active_slabs
     12055 ±  4%     +36.4%      16445 ±  6%  slabinfo.kmalloc-512.num_objs
    376.33 ±  4%     +36.4%     513.50 ±  6%  slabinfo.kmalloc-512.num_slabs
     54896 ±  3%     +47.6%      81015 ±  7%  slabinfo.kmalloc-64.active_objs
    858.17 ±  3%     +47.7%       1267 ±  7%  slabinfo.kmalloc-64.active_slabs
     54962 ±  3%     +47.6%      81140 ±  7%  slabinfo.kmalloc-64.num_objs
    858.17 ±  3%     +47.7%       1267 ±  7%  slabinfo.kmalloc-64.num_slabs
   7755346 ± 12%     -76.8%    1802412 ±  7%  interrupts.CAL:Function_call_interrupts
    456.17 ±141%     -92.8%      32.67 ± 63%  interrupts.CPU0.RES:Rescheduling_interrupts
    281.67 ± 86%     -89.3%      30.17 ±120%  interrupts.CPU20.RES:Rescheduling_interrupts
     37397 ± 75%     -80.2%       7409 ± 68%  interrupts.CPU34.CAL:Function_call_interrupts
      2076 ± 96%     -83.4%     343.67 ±167%  interrupts.CPU34.NMI:Non-maskable_interrupts
      2076 ± 96%     -83.4%     343.67 ±167%  interrupts.CPU34.PMI:Performance_monitoring_interrupts
     45020 ± 71%     -95.8%       1880 ± 80%  interrupts.CPU39.CAL:Function_call_interrupts
    407.50 ± 47%     -77.1%      93.50 ±148%  interrupts.CPU68.RES:Rescheduling_interrupts
    443.50 ±103%     -82.1%      79.50 ± 20%  interrupts.CPU93.NMI:Non-maskable_interrupts
    443.50 ±103%     -82.1%      79.50 ± 20%  interrupts.CPU93.PMI:Performance_monitoring_interrupts
     19756 ± 18%     -77.7%       4403 ± 21%  interrupts.RES:Rescheduling_interrupts
  3.43e+09 ±  3%      -7.1%  3.185e+09 ±  2%  perf-stat.i.branch-instructions
      0.74 ±  4%      -0.2        0.58 ±  6%  perf-stat.i.branch-miss-rate%
  24452178 ±  5%     -29.2%   17322512 ±  5%  perf-stat.i.branch-misses
   1290940 ±  9%     -77.6%     289762 ± 39%  perf-stat.i.context-switches
 4.379e+09 ±  3%      -9.4%  3.967e+09 ±  3%  perf-stat.i.dTLB-loads
 2.503e+09 ±  3%     -11.6%  2.212e+09 ±  3%  perf-stat.i.dTLB-stores
     87.84            -5.0       82.86        perf-stat.i.iTLB-load-miss-rate%
  13313579 ±  6%     -36.5%    8456850 ±  6%  perf-stat.i.iTLB-load-misses
 1.688e+10 ±  3%      -6.4%   1.58e+10 ±  2%  perf-stat.i.instructions
      1321 ±  4%     +45.4%       1921 ±  6%  perf-stat.i.instructions-per-iTLB-miss
      0.35 ±  2%      -6.0%       0.33 ±  3%  perf-stat.i.ipc
    111.30 ±  3%      -9.2%     101.04 ±  2%  perf-stat.i.metric.M/sec
     16080 ±  3%    +371.9%      75880 ± 32%  perf-stat.i.node-stores
      0.71 ±  3%      -0.2        0.54 ±  3%  perf-stat.overall.branch-miss-rate%
      2.93 ±  2%      +6.7%       3.12 ±  2%  perf-stat.overall.cpi
     89.24            -5.1       84.11        perf-stat.overall.iTLB-load-miss-rate%
      1272 ±  4%     +47.5%       1876 ±  6%  perf-stat.overall.instructions-per-iTLB-miss
      0.34 ±  3%      -6.3%       0.32 ±  2%  perf-stat.overall.ipc
 3.376e+09 ±  3%      -7.1%  3.135e+09 ±  2%  perf-stat.ps.branch-instructions
  24066583 ±  5%     -29.2%   17045901 ±  5%  perf-stat.ps.branch-misses
   1270763 ±  9%     -77.6%     285171 ± 40%  perf-stat.ps.context-switches
  4.31e+09 ±  3%      -9.4%  3.905e+09 ±  3%  perf-stat.ps.dTLB-loads
 2.463e+09 ±  3%     -11.6%  2.177e+09 ±  3%  perf-stat.ps.dTLB-stores
  13105434 ±  6%     -36.5%    8323297 ±  6%  perf-stat.ps.iTLB-load-misses
 1.662e+10 ±  3%      -6.4%  1.556e+10 ±  2%  perf-stat.ps.instructions
     15767 ±  3%    +374.0%      74743 ± 32%  perf-stat.ps.node-stores
 1.048e+12 ±  2%      -6.4%  9.805e+11 ±  3%  perf-stat.total.instructions
      1.08 ± 17%      +0.4        1.49 ±  6%  perf-profile.calltrace.cycles-pp.get_obj_cgroup_from_current.__kmalloc.load_msg.do_msgsnd.do_syscall_64
      2.12 ±  9%      +0.6        2.77 ±  9%  perf-profile.calltrace.cycles-pp.__mod_memcg_state.__mod_memcg_lruvec_state.kfree.free_msg.do_msgrcv
      2.22 ±  9%      +0.7        2.88 ± 10%  perf-profile.calltrace.cycles-pp.__mod_memcg_lruvec_state.kfree.free_msg.do_msgrcv.do_syscall_64
      1.66 ± 23%      +0.7        2.34 ±  9%  perf-profile.calltrace.cycles-pp.ipc_obtain_object_check.do_msgsnd.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.74 ± 10%      +0.9        3.60 ± 10%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.do_msgrcv.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.94 ± 14%      +1.0        2.98 ± 10%  perf-profile.calltrace.cycles-pp.__mod_memcg_state.__mod_memcg_lruvec_state.__kmalloc.load_msg.do_msgsnd
      2.02 ± 14%      +1.1        3.09 ±  9%  perf-profile.calltrace.cycles-pp.__mod_memcg_lruvec_state.__kmalloc.load_msg.do_msgsnd.do_syscall_64
      1.83 ± 28%      +1.9        3.76 ± 20%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.do_msgsnd.do_syscall_64.entry_SYSCALL_64_after_hwframe
      6.22 ± 14%      +2.4        8.66 ±  8%  perf-profile.calltrace.cycles-pp.__kmalloc.load_msg.do_msgsnd.do_syscall_64.entry_SYSCALL_64_after_hwframe
      7.17 ± 15%      +2.6        9.80 ±  9%  perf-profile.calltrace.cycles-pp.load_msg.do_msgsnd.do_syscall_64.entry_SYSCALL_64_after_hwframe
      3.12 ±  8%      -2.4        0.76 ± 42%  perf-profile.children.cycles-pp.__schedule
      2.49 ±  8%      -1.9        0.60 ± 58%  perf-profile.children.cycles-pp.poll_idle
      1.82 ±  9%      -1.3        0.48 ± 55%  perf-profile.children.cycles-pp.schedule
      1.78 ±  8%      -1.3        0.51 ± 44%  perf-profile.children.cycles-pp.wake_up_q
      1.69 ±  8%      -1.2        0.45 ± 49%  perf-profile.children.cycles-pp.try_to_wake_up
      1.38 ±  9%      -1.1        0.30 ± 22%  perf-profile.children.cycles-pp.schedule_idle
      0.84 ±  8%      -0.6        0.22 ± 56%  perf-profile.children.cycles-pp.dequeue_task_fair
      0.77 ±  6%      -0.6        0.18 ± 37%  perf-profile.children.cycles-pp.ttwu_do_activate
      0.76 ±  9%      -0.6        0.19 ± 59%  perf-profile.children.cycles-pp.dequeue_entity
      0.73 ±  5%      -0.6        0.17 ± 35%  perf-profile.children.cycles-pp.enqueue_task_fair
      0.73 ± 21%      -0.5        0.24 ±100%  perf-profile.children.cycles-pp.flush_smp_call_function_from_idle
      0.68 ± 20%      -0.5        0.21 ± 81%  perf-profile.children.cycles-pp.sched_ttwu_pending
      0.58 ±  6%      -0.4        0.13 ± 38%  perf-profile.children.cycles-pp.enqueue_entity
      0.55 ± 11%      -0.4        0.12 ± 34%  perf-profile.children.cycles-pp.pick_next_task_fair
      0.76 ± 11%      -0.4        0.39 ± 29%  perf-profile.children.cycles-pp.menu_select
      0.53 ± 10%      -0.4        0.17 ± 42%  perf-profile.children.cycles-pp.wake_q_add
      0.44 ± 22%      -0.3        0.11 ± 23%  perf-profile.children.cycles-pp.asm_sysvec_call_function_single
      0.44 ±  8%      -0.3        0.11 ± 32%  perf-profile.children.cycles-pp.update_load_avg
      0.48 ± 24%      -0.3        0.15 ± 59%  perf-profile.children.cycles-pp.ttwu_queue_wakelist
      0.36 ± 14%      -0.3        0.06 ± 52%  perf-profile.children.cycles-pp.set_next_entity
      0.40 ± 13%      -0.3        0.11 ± 84%  perf-profile.children.cycles-pp.update_curr
      0.36 ±  8%      -0.3        0.09 ± 40%  perf-profile.children.cycles-pp.__switch_to
      0.35 ± 22%      -0.3        0.08 ± 27%  perf-profile.children.cycles-pp.finish_task_switch
      0.33 ± 22%      -0.2        0.08 ± 28%  perf-profile.children.cycles-pp.sysvec_call_function_single
      0.34 ±  6%      -0.2        0.10 ± 33%  perf-profile.children.cycles-pp.update_rq_clock
      0.28 ± 23%      -0.2        0.06 ±123%  perf-profile.children.cycles-pp.flush_smp_call_function_queue
      0.28 ± 23%      -0.2        0.07 ± 30%  perf-profile.children.cycles-pp.__sysvec_call_function_single
      0.27 ±  9%      -0.2        0.06 ±145%  perf-profile.children.cycles-pp.tick_nohz_idle_exit
      0.40 ± 15%      -0.2        0.23 ± 31%  perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
      0.26 ±  5%      -0.2        0.10 ± 32%  perf-profile.children.cycles-pp.sched_clock_cpu
      0.22 ± 35%      -0.2        0.06 ± 87%  perf-profile.children.cycles-pp.__smp_call_single_queue
      0.22 ± 35%      -0.2        0.06 ± 87%  perf-profile.children.cycles-pp.llist_add_batch
      0.22 ± 20%      -0.2        0.06 ±113%  perf-profile.children.cycles-pp.perf_trace_sched_wakeup_template
      0.23 ±  7%      -0.1        0.08 ± 37%  perf-profile.children.cycles-pp.native_sched_clock
      0.30 ± 17%      -0.1        0.20 ± 34%  perf-profile.children.cycles-pp.tick_nohz_next_event
      0.11 ±  8%      -0.1        0.05 ± 48%  perf-profile.children.cycles-pp.read_tsc
      0.12 ± 12%      -0.0        0.08 ± 11%  perf-profile.children.cycles-pp.native_irq_return_iret
      0.07 ± 14%      +0.0        0.10 ± 13%  perf-profile.children.cycles-pp.get_mem_cgroup_from_objcg
      0.16 ± 14%      +0.0        0.19 ±  7%  perf-profile.children.cycles-pp.__get_user_8
      0.14 ±  9%      +0.0        0.18 ±  8%  perf-profile.children.cycles-pp.drain_obj_stock
      0.10 ± 10%      +0.1        0.16 ± 13%  perf-profile.children.cycles-pp.obj_cgroup_charge_pages
      0.18 ± 10%      +0.1        0.23 ± 12%  perf-profile.children.cycles-pp.obj_cgroup_charge
      0.07 ± 15%      +0.1        0.14 ± 12%  perf-profile.children.cycles-pp.msgctl_info
      0.13 ± 13%      +0.1        0.22 ± 11%  perf-profile.children.cycles-pp.ksys_msgctl
      0.22 ± 10%      +0.1        0.33 ± 19%  perf-profile.children.cycles-pp.ipcperms
      0.28 ± 18%      +0.1        0.41 ± 32%  perf-profile.children.cycles-pp.__list_del_entry_valid
      1.09 ± 17%      +0.4        1.51 ±  6%  perf-profile.children.cycles-pp.get_obj_cgroup_from_current
      3.46 ± 15%      +0.9        4.38 ±  9%  perf-profile.children.cycles-pp.ipc_obtain_object_check
      4.07 ± 11%      +1.7        5.77 ±  9%  perf-profile.children.cycles-pp.__mod_memcg_state
      4.26 ± 11%      +1.7        6.00 ±  9%  perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
      6.26 ± 14%      +2.4        8.72 ±  8%  perf-profile.children.cycles-pp.__kmalloc
      7.18 ± 15%      +2.6        9.82 ±  9%  perf-profile.children.cycles-pp.load_msg
      4.60 ± 16%      +2.8        7.40 ±  9%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      2.31 ±  8%      -1.8        0.56 ± 59%  perf-profile.self.cycles-pp.poll_idle
      0.63 ±  9%      -0.5        0.15 ± 38%  perf-profile.self.cycles-pp.__schedule
      0.53 ± 10%      -0.4        0.17 ± 42%  perf-profile.self.cycles-pp.wake_q_add
      0.35 ±  8%      -0.3        0.08 ± 40%  perf-profile.self.cycles-pp.__switch_to
      0.29 ±  8%      -0.2        0.06 ± 59%  perf-profile.self.cycles-pp.enqueue_entity
      0.24 ±  6%      -0.2        0.06 ± 50%  perf-profile.self.cycles-pp.do_idle
      0.22 ± 35%      -0.2        0.06 ± 87%  perf-profile.self.cycles-pp.llist_add_batch
      0.29 ±  7%      -0.2        0.13 ± 32%  perf-profile.self.cycles-pp.menu_select
      0.22 ±  6%      -0.1        0.08 ± 39%  perf-profile.self.cycles-pp.native_sched_clock
      0.23 ±  9%      -0.1        0.12 ± 32%  perf-profile.self.cycles-pp.do_msg_fill
      0.11 ±  7%      -0.1        0.05 ± 48%  perf-profile.self.cycles-pp.read_tsc
      0.12 ± 12%      -0.0        0.08 ±  8%  perf-profile.self.cycles-pp.native_irq_return_iret
      0.08 ±  8%      -0.0        0.05 ± 45%  perf-profile.self.cycles-pp.wake_up_q
      0.07 ± 14%      +0.0        0.10 ± 13%  perf-profile.self.cycles-pp.get_mem_cgroup_from_objcg
      0.15 ± 12%      +0.0        0.19 ±  7%  perf-profile.self.cycles-pp.__get_user_8
      0.06 ± 13%      +0.0        0.10 ± 14%  perf-profile.self.cycles-pp.msgctl_info
      0.00            +0.1        0.05 ±  7%  perf-profile.self.cycles-pp.format_decode
      0.26 ± 12%      +0.1        0.34 ± 11%  perf-profile.self.cycles-pp.sysvipc_find_ipc
      0.21 ± 11%      +0.1        0.32 ± 19%  perf-profile.self.cycles-pp.ipcperms
      0.06 ± 50%      +0.1        0.17 ± 33%  perf-profile.self.cycles-pp.free_msg
      0.28 ± 17%      +0.1        0.41 ± 31%  perf-profile.self.cycles-pp.__list_del_entry_valid
      0.57 ± 26%      +0.3        0.83 ± 19%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      1.07 ± 17%      +0.4        1.48 ±  6%  perf-profile.self.cycles-pp.get_obj_cgroup_from_current
      2.18 ± 18%      +0.7        2.90 ± 12%  perf-profile.self.cycles-pp.ipc_obtain_object_check
      4.07 ± 11%      +1.7        5.77 ±  9%  perf-profile.self.cycles-pp.__mod_memcg_state
      4.55 ± 16%      +2.8        7.34 ±  9%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath


                                                                                
                               stress-ng.msg.ops_per_sec                        
                                                                                
  4.4e+06 +-----------------------------------------------------------------+   
          |                                  O                              |   
  4.2e+06 |-+                  O        O     O                 O           |   
          |              O       O                 O    O      O            |   
    4e+06 |-O  O OO    O    O O    OO O  O O    O    O    O  O    O  O   OO |   
          |  O      O O    O                      O    O    O       O  O    |   
  3.8e+06 |-+                                                               |   
          |                                                                 |   
  3.6e+06 |-+           .+      .+                                          |   
          |            +  + +.++  +                                         |   
  3.4e+06 |-+     +.   :   +       ++              +.   +                   |   
          |     .+  +.:              :          +.+  +.+                    |   
  3.2e+06 |-++.+      +              : .+   .+ +                            |   
          |.+                         +  +.+  +                             |   
    3e+06 +-----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang


View attachment "config-5.13.0-rc5-00009-g3d3d9c072e37" of type "text/plain" (174182 bytes)

View attachment "job-script" of type "text/plain" (8405 bytes)

View attachment "job.yaml" of type "text/plain" (5805 bytes)

View attachment "reproduce" of type "text/plain" (531 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ