Message-ID: <20210113025549.GC7528@xsang-OptiPlex-9020>
Date:   Wed, 13 Jan 2021 10:55:49 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     kernel test robot <oliver.sang@...el.com>,
        Al Viro <viro@...iv.linux.org.uk>,
        David Laight <David.Laight@...lab.com>,
        Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
        zhengjun.xing@...el.com
Subject: [poll]  ef0ba05538:  will-it-scale.per_thread_ops 8.9% improvement


Greetings,

FYI, we noticed an 8.9% improvement in will-it-scale.per_thread_ops due to commit:


commit: ef0ba05538299f1391cbe097de36895bb36ecfe6 ("poll: fix performance regression due to out-of-line __put_user()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: will-it-scale
on test machine: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory
with the following parameters:

	nr_task: 50%
	mode: thread
	test: poll2
	cpufreq_governor: performance
	ucode: 0x16

test-description: Will It Scale takes a testcase and runs it with 1 through n parallel copies to see whether the testcase scales. It builds both a process-based and a thread-based variant of each test so that any differences between the two can be seen.
test-url: https://github.com/antonblanchard/will-it-scale

In addition, the commit has a significant impact on the following test:

+------------------+-----------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_thread_ops 7.1% improvement          |
| test machine     | 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 112G memory |
| test parameters  | cpufreq_governor=performance                                          |
|                  | mode=thread                                                           |
|                  | nr_task=16                                                            |
|                  | test=poll2                                                            |
|                  | ucode=0x42e                                                           |
+------------------+-----------------------------------------------------------------------+
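
For readers unfamiliar with this kind of measurement, the following is a minimal Python sketch of the idea in the test description: run the same poll() loop in N parallel threads and report operations per thread. It is not the will-it-scale poll2 testcase. The function names (poll_loop, per_thread_ops), the use of pipes as file descriptors, and the fd count and duration are all assumptions made for illustration. Because of the Python interpreter's overhead, the absolute numbers say nothing about kernel performance; only the shape of the harness is the point.

```python
import os
import select
import threading
import time


def poll_loop(nr_fds, duration_s):
    """Call poll() on nr_fds pipe write ends until duration_s elapses.

    Returns the number of poll() calls made. Hypothetical helper,
    not taken from will-it-scale.
    """
    pipes = [os.pipe() for _ in range(nr_fds)]
    poller = select.poll()
    for _read_end, write_end in pipes:
        poller.register(write_end, select.POLLOUT)
    ops = 0
    deadline = time.monotonic() + duration_s
    try:
        while time.monotonic() < deadline:
            poller.poll(0)  # timeout 0: return immediately
            ops += 1
    finally:
        for read_end, write_end in pipes:
            os.close(read_end)
            os.close(write_end)
    return ops


def per_thread_ops(nr_threads, nr_fds=16, duration_s=0.2):
    """Run poll_loop in nr_threads threads; return the mean ops per thread."""
    counts = [0] * nr_threads

    def worker(index):
        counts[index] = poll_loop(nr_fds, duration_s)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(nr_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(counts) / nr_threads


if __name__ == "__main__":
    # Scale from 1 copy upward, as the test description outlines.
    for n in (1, 2, 4):
        print(n, "threads:", round(per_thread_ops(n)), "poll() calls per thread")
```
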




Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/thread/50%/debian-10.4-x86_64-20200603.cgz/lkp-hsw-4ex1/poll2/will-it-scale/0x16

commit: 
  a91bd6223e ("Revert "init/console: Use ttynull as a fallback when there is no console"")
  ef0ba05538 ("poll: fix performance regression due to out-of-line __put_user()")

a91bd6223ecd46ad ef0ba05538299f1391cbe097de3 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  18011165            +8.9%   19618836        will-it-scale.72.threads
    250154            +8.9%     272483        will-it-scale.per_thread_ops
  18011165            +8.9%   19618836        will-it-scale.workload
     46702            +1.7%      47497        proc-vmstat.nr_slab_unreclaimable
    238955 ± 31%     +42.3%     340009 ± 19%  numa-numastat.node1.local_node
    291569 ± 30%     +42.5%     415478 ± 15%  numa-numastat.node1.numa_hit
  33349633 ±  8%     -25.8%   24744913 ± 18%  cpuidle.C1E.usage
 7.521e+09 ± 32%     +54.0%  1.158e+10 ± 26%  cpuidle.C6.time
  10594513 ± 25%     +81.2%   19201840 ± 23%  cpuidle.C6.usage
    102872 ± 10%     -16.0%      86427 ± 11%  syscalls.sys_openat.max
      8422            -9.8%       7597        syscalls.sys_poll.med
      8268            -9.8%       7462        syscalls.sys_poll.min
     16.50 ± 13%     -23.0%      12.71 ± 10%  sched_debug.cfs_rq:/.load_avg.avg
    255.52 ± 22%     -30.4%     177.92 ±  2%  sched_debug.cfs_rq:/.load_avg.max
     43.12 ± 12%     -29.5%      30.38 ± 10%  sched_debug.cfs_rq:/.load_avg.stddev
    186.58 ±  8%     -55.9%      82.38 ±100%  sched_debug.cfs_rq:/.removed.load_avg.max
     27.55 ± 30%     -62.8%      10.26 ±103%  sched_debug.cfs_rq:/.removed.load_avg.stddev
      1.60 ± 37%     -60.5%       0.63 ±120%  sched_debug.cfs_rq:/.removed.runnable_avg.avg
      1.60 ± 37%     -60.5%       0.63 ±120%  sched_debug.cfs_rq:/.removed.util_avg.avg
      1475 ± 53%     -88.6%     167.50 ± 22%  numa-meminfo.node1.Active
      1475 ± 53%     -88.6%     167.50 ± 22%  numa-meminfo.node1.Active(anon)
     17441 ± 19%     +37.9%      24052 ± 10%  numa-meminfo.node2.KReclaimable
    664056 ±  4%     +18.3%     785538 ±  7%  numa-meminfo.node2.MemUsed
    743.00 ± 31%     +98.8%       1476 ± 51%  numa-meminfo.node2.PageTables
     17441 ± 19%     +37.9%      24052 ± 10%  numa-meminfo.node2.SReclaimable
     35651 ±  4%     +26.6%      45148 ±  7%  numa-meminfo.node2.SUnreclaim
     53093 ±  9%     +30.3%      69201 ±  6%  numa-meminfo.node2.Slab
     47310 ±  9%     -13.2%      41067 ±  9%  numa-meminfo.node3.SUnreclaim
    368.50 ± 53%     -88.8%      41.25 ± 22%  numa-vmstat.node1.nr_active_anon
    368.50 ± 53%     -88.8%      41.25 ± 22%  numa-vmstat.node1.nr_zone_active_anon
    183.25 ± 32%    +101.0%     368.25 ± 51%  numa-vmstat.node2.nr_page_table_pages
      4360 ± 19%     +37.9%       6012 ± 10%  numa-vmstat.node2.nr_slab_reclaimable
      8912 ±  4%     +26.6%      11286 ±  7%  numa-vmstat.node2.nr_slab_unreclaimable
    460320 ± 12%     +31.8%     606634 ±  8%  numa-vmstat.node2.numa_hit
    304883 ± 21%     +55.8%     475111 ± 10%  numa-vmstat.node2.numa_local
     11827 ±  9%     -13.2%      10266 ±  9%  numa-vmstat.node3.nr_slab_unreclaimable
    674508 ± 23%     -29.0%     478814 ± 11%  numa-vmstat.node3.numa_hit
    542032 ± 25%     -38.2%     334743 ± 16%  numa-vmstat.node3.numa_local
      2495 ±  7%     +12.7%       2812 ±  3%  slabinfo.PING.active_objs
      2495 ±  7%     +12.7%       2812 ±  3%  slabinfo.PING.num_objs
      2262 ± 12%     +19.5%       2703 ±  6%  slabinfo.fsnotify_mark_connector.active_objs
      2262 ± 12%     +19.5%       2703 ±  6%  slabinfo.fsnotify_mark_connector.num_objs
    901.00 ±  5%     -11.5%     797.00        slabinfo.pool_workqueue.active_objs
    930.50 ±  5%     -11.8%     821.00 ±  2%  slabinfo.pool_workqueue.num_objs
      3144 ±  5%      +9.1%       3430 ±  2%  slabinfo.signal_cache.active_objs
      3144 ±  5%      +9.3%       3437 ±  2%  slabinfo.signal_cache.num_objs
      4087 ±  3%     +10.0%       4496 ±  2%  slabinfo.sock_inode_cache.active_objs
      4087 ±  3%     +10.0%       4496 ±  2%  slabinfo.sock_inode_cache.num_objs
      0.02 ± 17%     -27.8%       0.01 ±  5%  perf-sched.sch_delay.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.futex_wait_queue_me.futex_wait.do_futex
    250.89 ±173%    -100.0%       0.03 ±  8%  perf-sched.sch_delay.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.pipe_read.new_sync_read.vfs_read
    806.12 ± 24%     -50.4%     400.04 ± 66%  perf-sched.wait_and_delay.avg.ms.__traceiter_sched_switch.__traceiter_sched_switch.schedule_timeout.io_schedule_timeout.wait_for_completion_io
     61.00 ± 57%     +63.5%      99.75 ±  7%  perf-sched.wait_and_delay.count.__traceiter_sched_switch.__traceiter_sched_switch.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
      1929 ±  2%      -8.6%       1763 ±  4%  perf-sched.wait_and_delay.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.devkmsg_read.vfs_read.ksys_read
      1929 ±  2%      -8.6%       1763 ±  4%  perf-sched.wait_and_delay.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.do_syslog.part.0
      1932 ±  2%      -8.6%       1767 ±  4%  perf-sched.wait_and_delay.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.pipe_read.new_sync_read.vfs_read
    806.11 ± 24%     -46.7%     429.95 ± 51%  perf-sched.wait_time.avg.ms.__traceiter_sched_switch.__traceiter_sched_switch.schedule_timeout.io_schedule_timeout.wait_for_completion_io
      1929 ±  2%      -8.6%       1763 ±  4%  perf-sched.wait_time.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.devkmsg_read.vfs_read.ksys_read
      1929 ±  2%      -8.6%       1763 ±  4%  perf-sched.wait_time.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.do_syslog.part.0
      1932 ±  2%      -8.6%       1767 ±  4%  perf-sched.wait_time.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.pipe_read.new_sync_read.vfs_read
      0.07 ±108%     -77.8%       0.01 ± 62%  perf-sched.wait_time.max.ms.__traceiter_sched_switch.__traceiter_sched_switch.schedule_timeout.wait_for_completion.stop_one_cpu
     72.66            -7.7       64.98 ± 10%  perf-profile.calltrace.cycles-pp.__poll
     60.43            -7.7       52.76 ± 10%  perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     69.09            -7.6       61.48 ± 10%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__poll
     63.12            -7.5       55.66 ± 10%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     61.29            -7.4       53.91 ± 10%  perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.__poll
     24.23            -2.8       21.39 ± 10%  perf-profile.calltrace.cycles-pp.__fget_files.__fget_light.do_sys_poll.__x64_sys_poll.do_syscall_64
     24.49 ±  2%      +7.6       32.10 ± 22%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
     24.49 ±  2%      +7.6       32.10 ± 22%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
     24.49 ±  2%      +7.6       32.10 ± 22%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
     24.46 ±  2%      +7.6       32.09 ± 22%  perf-profile.calltrace.cycles-pp.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
     24.46 ±  2%      +7.6       32.09 ± 22%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary
     24.66 ±  2%      +7.8       32.41 ± 22%  perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
     24.42 ±  2%      +7.8       32.23 ± 22%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
     72.88            -7.7       65.22 ± 10%  perf-profile.children.cycles-pp.__poll
     69.17            -7.6       61.57 ± 10%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     63.19            -7.5       55.73 ± 10%  perf-profile.children.cycles-pp.do_syscall_64
     61.31            -7.4       53.93 ± 10%  perf-profile.children.cycles-pp.__x64_sys_poll
     60.96            -7.4       53.58 ± 10%  perf-profile.children.cycles-pp.do_sys_poll
     25.04            -2.8       22.19 ± 10%  perf-profile.children.cycles-pp.__fget_files
      0.33 ±  5%      -0.1        0.25 ± 20%  perf-profile.children.cycles-pp.perf_tp_event
     24.49 ±  2%      +7.6       32.10 ± 22%  perf-profile.children.cycles-pp.start_secondary
     24.66 ±  2%      +7.8       32.41 ± 22%  perf-profile.children.cycles-pp.secondary_startup_64_no_verify
     24.66 ±  2%      +7.8       32.41 ± 22%  perf-profile.children.cycles-pp.cpu_startup_entry
     24.66 ±  2%      +7.8       32.41 ± 22%  perf-profile.children.cycles-pp.do_idle
     24.63 ±  2%      +7.8       32.41 ± 22%  perf-profile.children.cycles-pp.cpuidle_enter
     24.63 ±  2%      +7.8       32.41 ± 22%  perf-profile.children.cycles-pp.cpuidle_enter_state
     24.59 ±  2%      +7.8       32.40 ± 22%  perf-profile.children.cycles-pp.intel_idle
     24.00            -2.8       21.20 ± 10%  perf-profile.self.cycles-pp.__fget_files
     12.90            -2.4       10.53 ± 10%  perf-profile.self.cycles-pp.do_sys_poll
     24.59 ±  2%      +7.8       32.40 ± 22%  perf-profile.self.cycles-pp.intel_idle
      0.10 ± 10%     +20.6%       0.12 ±  4%  perf-stat.i.MPKI
      0.18            +0.0        0.19        perf-stat.i.branch-miss-rate%
 1.258e+08            +4.6%  1.315e+08        perf-stat.i.branch-misses
  23216759 ±  5%     +11.6%   25900213 ±  7%  perf-stat.i.cache-references
      0.70            -1.4%       0.69        perf-stat.i.cpi
      0.05            -0.0        0.04 ±  3%  perf-stat.i.dTLB-load-miss-rate%
  36187253           -20.1%   28909422 ±  3%  perf-stat.i.dTLB-load-misses
 7.244e+10            +2.0%  7.393e+10        perf-stat.i.dTLB-loads
      0.04            +0.0        0.04        perf-stat.i.dTLB-store-miss-rate%
  18118830            +9.0%   19752182        perf-stat.i.dTLB-store-misses
  4.61e+10            +3.5%  4.773e+10        perf-stat.i.dTLB-stores
  27239224            +6.0%   28873516 ±  2%  perf-stat.i.iTLB-load-misses
 3.005e+11            +1.6%  3.054e+11        perf-stat.i.instructions
     11048            -4.2%      10587 ±  2%  perf-stat.i.instructions-per-iTLB-miss
      1.43            +1.5%       1.46        perf-stat.i.ipc
      1356            +1.2%       1372        perf-stat.i.metric.M/sec
      0.08 ±  5%      +9.4%       0.09 ±  7%  perf-stat.overall.MPKI
      0.16            +0.0        0.17        perf-stat.overall.branch-miss-rate%
      0.70            -1.5%       0.69        perf-stat.overall.cpi
      0.05            -0.0        0.04 ±  3%  perf-stat.overall.dTLB-load-miss-rate%
      0.04            +0.0        0.04        perf-stat.overall.dTLB-store-miss-rate%
     11040            -4.1%      10589 ±  2%  perf-stat.overall.instructions-per-iTLB-miss
      1.44            +1.6%       1.46        perf-stat.overall.ipc
   5022724            -6.7%    4684438        perf-stat.overall.path-length
 1.255e+08            +4.5%  1.311e+08        perf-stat.ps.branch-misses
  23272217 ±  5%     +11.1%   25848995 ±  7%  perf-stat.ps.cache-references
  36073415           -20.2%   28792468 ±  3%  perf-stat.ps.dTLB-load-misses
  7.22e+10            +2.0%  7.361e+10        perf-stat.ps.dTLB-loads
  18057011            +8.9%   19667743        perf-stat.ps.dTLB-store-misses
 4.595e+10            +3.4%  4.753e+10        perf-stat.ps.dTLB-stores
  27136115            +5.9%   28744716 ±  2%  perf-stat.ps.iTLB-load-misses
 2.995e+11            +1.5%  3.041e+11        perf-stat.ps.instructions
 9.046e+13            +1.6%   9.19e+13        perf-stat.total.instructions
     13893 ±  5%     -20.4%      11054 ±  6%  softirqs.CPU101.RCU
     11428 ± 46%    +119.3%      25066 ± 27%  softirqs.CPU101.SCHED
     37214 ± 12%     -51.1%      18200 ± 63%  softirqs.CPU106.SCHED
      8255 ±  8%     +15.2%       9512 ±  7%  softirqs.CPU110.RCU
     38004 ±  5%     -24.1%      28839 ± 26%  softirqs.CPU110.SCHED
     10247 ±  9%     +17.6%      12053 ±  5%  softirqs.CPU113.RCU
     33888 ± 11%     -41.5%      19830 ± 16%  softirqs.CPU113.SCHED
      9500 ± 10%     +30.2%      12366 ±  4%  softirqs.CPU118.RCU
     38569 ±  4%     -54.4%      17583 ± 54%  softirqs.CPU118.SCHED
      9284 ± 11%     -17.5%       7661 ± 12%  softirqs.CPU126.RCU
     14321 ± 27%     +51.5%      21693 ± 18%  softirqs.CPU13.SCHED
     13294 ± 30%    +108.6%      27734 ± 22%  softirqs.CPU130.SCHED
      7801 ±  8%     +33.9%      10446 ± 10%  softirqs.CPU133.RCU
     34662 ± 13%     -35.4%      22383 ± 17%  softirqs.CPU133.SCHED
      8945 ±  9%     +31.6%      11769 ± 14%  softirqs.CPU138.RCU
     34958 ± 16%     -43.5%      19740 ± 42%  softirqs.CPU138.SCHED
      9051 ±  4%     +23.3%      11164 ± 15%  softirqs.CPU141.RCU
     30437 ± 15%     -40.5%      18124 ± 47%  softirqs.CPU15.SCHED
     10040 ±  4%     -18.4%       8190 ± 14%  softirqs.CPU16.RCU
     11827 ± 27%    +164.1%      31241 ± 23%  softirqs.CPU16.SCHED
     20594 ± 23%     -38.9%      12572 ± 52%  softirqs.CPU20.SCHED
     14656 ±  5%     -25.4%      10931 ±  5%  softirqs.CPU21.RCU
      9461 ± 59%    +192.0%      27630 ± 24%  softirqs.CPU21.SCHED
     35725 ±  7%     -44.2%      19932 ± 35%  softirqs.CPU27.SCHED
     33917 ± 12%     -46.8%      18044 ± 43%  softirqs.CPU29.SCHED
     12308 ± 10%     -23.7%       9386 ± 14%  softirqs.CPU34.RCU
      8999 ± 65%    +179.0%      25110 ± 40%  softirqs.CPU34.SCHED
     14707 ±  4%     -20.2%      11729 ±  4%  softirqs.CPU41.RCU
     11362 ± 26%    +106.9%      23512 ± 19%  softirqs.CPU41.SCHED
     15054 ±  7%     -17.7%      12389 ±  8%  softirqs.CPU42.RCU
      8835 ± 45%     +95.3%      17253 ± 35%  softirqs.CPU42.SCHED
     15106 ±  6%     -25.1%      11310 ± 12%  softirqs.CPU46.RCU
      6446 ± 28%    +298.2%      25667 ± 27%  softirqs.CPU46.SCHED
     28882 ± 23%     -33.1%      19313 ± 23%  softirqs.CPU5.SCHED
     34743 ±  9%     -27.3%      25246 ± 16%  softirqs.CPU53.SCHED
      9551 ±  2%     +23.2%      11769 ± 10%  softirqs.CPU56.RCU
     27072 ± 13%     -45.3%      14802 ± 41%  softirqs.CPU56.SCHED
      9149 ± 15%     +23.3%      11285 ± 12%  softirqs.CPU57.RCU
      8294 ±  9%     +37.7%      11420 ±  9%  softirqs.CPU58.RCU
     31613 ±  6%     -51.5%      15332 ± 49%  softirqs.CPU58.SCHED
     11790 ±  9%     -13.9%      10152 ±  6%  softirqs.CPU61.RCU
     10040 ± 57%    +106.2%      20700 ± 17%  softirqs.CPU61.SCHED
     10515 ± 59%    +126.3%      23797 ± 27%  softirqs.CPU66.SCHED
     11976 ± 11%     -18.6%       9745 ±  5%  softirqs.CPU77.RCU
     30773 ± 13%     -28.9%      21866 ±  7%  softirqs.CPU85.SCHED
     12405 ±  8%     -16.3%      10383 ±  8%  softirqs.CPU87.RCU
     42921 ± 46%     -68.9%      13333 ± 63%  softirqs.CPU88.SCHED
     10774 ± 10%     -17.0%       8942 ±  5%  softirqs.CPU92.RCU
     35714 ± 14%     -56.5%      15552 ± 26%  softirqs.CPU93.SCHED
     14302 ±  4%     -17.9%      11748 ± 10%  softirqs.CPU98.RCU
     14121 ±  5%     -12.5%      12355 ± 10%  softirqs.CPU99.RCU
      9617 ± 25%    +120.3%      21192 ± 41%  softirqs.CPU99.SCHED
    279880 ± 11%     -13.0%     243558        interrupts.CAL:Function_call_interrupts
    221.25 ± 25%     -59.3%      90.00 ± 67%  interrupts.CPU101.RES:Rescheduling_interrupts
      1179 ± 20%     +42.3%       1678 ± 19%  interrupts.CPU102.CAL:Function_call_interrupts
    350.50 ± 66%    +136.2%     827.75 ± 36%  interrupts.CPU102.TLB:TLB_shootdowns
    390.25 ± 26%     -35.6%     251.25 ±  7%  interrupts.CPU104.TLB:TLB_shootdowns
     37.25 ±102%    +260.4%     134.25 ± 62%  interrupts.CPU106.RES:Rescheduling_interrupts
     55.25 ± 45%    +154.3%     140.50 ± 39%  interrupts.CPU113.RES:Rescheduling_interrupts
      2207 ± 34%    +156.5%       5660 ± 28%  interrupts.CPU114.NMI:Non-maskable_interrupts
      2207 ± 34%    +156.5%       5660 ± 28%  interrupts.CPU114.PMI:Performance_monitoring_interrupts
     31.50 ± 84%    +165.1%      83.50 ± 54%  interrupts.CPU114.RES:Rescheduling_interrupts
     14.00 ± 74%    +910.7%     141.50 ± 53%  interrupts.CPU118.RES:Rescheduling_interrupts
     90.25 ±108%    +897.0%     899.75 ± 48%  interrupts.CPU118.TLB:TLB_shootdowns
      3236 ± 64%     -48.5%       1666 ± 15%  interrupts.CPU119.CAL:Function_call_interrupts
    195.50 ± 35%     -42.2%     113.00 ± 20%  interrupts.CPU12.RES:Rescheduling_interrupts
    235.75 ± 20%     -41.3%     138.50 ± 46%  interrupts.CPU125.RES:Rescheduling_interrupts
    208.00 ± 36%     -54.3%      95.00 ± 69%  interrupts.CPU126.RES:Rescheduling_interrupts
    167.00 ± 25%     -65.6%      57.50 ± 69%  interrupts.CPU128.RES:Rescheduling_interrupts
    189.25 ± 25%     -42.0%     109.75 ± 42%  interrupts.CPU13.RES:Rescheduling_interrupts
    211.25 ± 23%     -75.9%      51.00 ± 62%  interrupts.CPU130.RES:Rescheduling_interrupts
    109.25 ± 53%     -73.0%      29.50 ± 97%  interrupts.CPU131.RES:Rescheduling_interrupts
      4394 ± 45%     -31.4%       3012 ±  8%  interrupts.CPU136.NMI:Non-maskable_interrupts
      4394 ± 45%     -31.4%       3012 ±  8%  interrupts.CPU136.PMI:Performance_monitoring_interrupts
    107.25 ± 39%     -59.2%      43.75 ± 96%  interrupts.CPU136.RES:Rescheduling_interrupts
     78.00 ± 63%     -61.9%      29.75 ±101%  interrupts.CPU137.RES:Rescheduling_interrupts
     48.25 ± 45%    +187.0%     138.50 ± 56%  interrupts.CPU138.RES:Rescheduling_interrupts
    107.50 ± 41%     -78.1%      23.50 ± 77%  interrupts.CPU139.RES:Rescheduling_interrupts
    216.50 ± 36%     -55.8%      95.75 ± 67%  interrupts.CPU14.RES:Rescheduling_interrupts
      2077 ± 22%     -42.8%       1187 ± 31%  interrupts.CPU140.CAL:Function_call_interrupts
      1080 ± 15%     +67.0%       1804 ± 24%  interrupts.CPU15.CAL:Function_call_interrupts
      1464 ±  2%    +226.9%       4785 ± 41%  interrupts.CPU15.NMI:Non-maskable_interrupts
      1464 ±  2%    +226.9%       4785 ± 41%  interrupts.CPU15.PMI:Performance_monitoring_interrupts
      1988 ±  5%     -33.4%       1324 ± 30%  interrupts.CPU16.CAL:Function_call_interrupts
      5928 ± 31%     -39.3%       3597 ± 29%  interrupts.CPU16.NMI:Non-maskable_interrupts
      5928 ± 31%     -39.3%       3597 ± 29%  interrupts.CPU16.PMI:Performance_monitoring_interrupts
    214.75 ± 18%     -74.5%      54.75 ± 80%  interrupts.CPU16.RES:Rescheduling_interrupts
      1287 ±  9%     -62.9%     478.25 ± 84%  interrupts.CPU16.TLB:TLB_shootdowns
      2130 ±  4%     -29.0%       1513 ± 24%  interrupts.CPU21.CAL:Function_call_interrupts
    236.75 ± 25%     -73.8%      62.00 ± 68%  interrupts.CPU21.RES:Rescheduling_interrupts
      1389 ± 10%     -51.2%     678.00 ± 54%  interrupts.CPU21.TLB:TLB_shootdowns
      2079 ± 24%    +217.2%       6595 ± 33%  interrupts.CPU25.NMI:Non-maskable_interrupts
      2079 ± 24%    +217.2%       6595 ± 33%  interrupts.CPU25.PMI:Performance_monitoring_interrupts
      1143 ± 20%     +47.6%       1686 ± 16%  interrupts.CPU26.CAL:Function_call_interrupts
     37.50 ± 42%    +194.7%     110.50 ± 42%  interrupts.CPU27.RES:Rescheduling_interrupts
      1930 ± 30%    +120.7%       4259 ± 54%  interrupts.CPU29.NMI:Non-maskable_interrupts
      1930 ± 30%    +120.7%       4259 ± 54%  interrupts.CPU29.PMI:Performance_monitoring_interrupts
     40.75 ± 72%    +246.0%     141.00 ± 24%  interrupts.CPU29.RES:Rescheduling_interrupts
      1850 ±  4%     +25.1%       2315 ±  6%  interrupts.CPU32.CAL:Function_call_interrupts
      3210 ± 54%     -56.7%       1388 ± 34%  interrupts.CPU34.CAL:Function_call_interrupts
    233.50 ± 31%     -64.2%      83.50 ± 89%  interrupts.CPU34.RES:Rescheduling_interrupts
      1310 ± 22%     -57.1%     562.75 ± 95%  interrupts.CPU34.TLB:TLB_shootdowns
    213.50 ± 13%     -60.3%      84.75 ± 38%  interrupts.CPU41.RES:Rescheduling_interrupts
      2145 ±  3%     -36.5%       1362 ± 34%  interrupts.CPU46.CAL:Function_call_interrupts
      7988 ±  2%     -27.7%       5774 ± 24%  interrupts.CPU46.NMI:Non-maskable_interrupts
      7988 ±  2%     -27.7%       5774 ± 24%  interrupts.CPU46.PMI:Performance_monitoring_interrupts
    254.25 ± 16%     -71.5%      72.50 ± 96%  interrupts.CPU46.RES:Rescheduling_interrupts
      1375 ±  6%     -59.3%     559.50 ± 83%  interrupts.CPU46.TLB:TLB_shootdowns
      1177 ± 28%     +29.1%       1519 ± 22%  interrupts.CPU5.CAL:Function_call_interrupts
      1798 ± 35%    +144.9%       4404 ± 49%  interrupts.CPU54.NMI:Non-maskable_interrupts
      1798 ± 35%    +144.9%       4404 ± 49%  interrupts.CPU54.PMI:Performance_monitoring_interrupts
      2151 ± 33%    +222.7%       6940 ± 15%  interrupts.CPU58.NMI:Non-maskable_interrupts
      2151 ± 33%    +222.7%       6940 ± 15%  interrupts.CPU58.PMI:Performance_monitoring_interrupts
    226.50 ± 18%     -49.2%     115.00 ± 13%  interrupts.CPU61.RES:Rescheduling_interrupts
      5949 ± 31%     -50.6%       2936 ± 33%  interrupts.CPU66.NMI:Non-maskable_interrupts
      5949 ± 31%     -50.6%       2936 ± 33%  interrupts.CPU66.PMI:Performance_monitoring_interrupts
    229.00 ± 31%     -67.5%      74.50 ± 63%  interrupts.CPU66.RES:Rescheduling_interrupts
    205.50 ± 13%     -47.9%     107.00 ± 49%  interrupts.CPU72.RES:Rescheduling_interrupts
    193.25 ± 46%     -39.8%     116.25 ± 53%  interrupts.CPU73.RES:Rescheduling_interrupts
      2213 ± 29%    +123.1%       4938 ± 32%  interrupts.CPU83.NMI:Non-maskable_interrupts
      2213 ± 29%    +123.1%       4938 ± 32%  interrupts.CPU83.PMI:Performance_monitoring_interrupts
      1185 ± 39%     +74.9%       2072 ± 21%  interrupts.CPU86.CAL:Function_call_interrupts
      8124           -61.1%       3163 ± 57%  interrupts.CPU87.NMI:Non-maskable_interrupts
      8124           -61.1%       3163 ± 57%  interrupts.CPU87.PMI:Performance_monitoring_interrupts
     54.00 ± 57%    +205.6%     165.00 ± 46%  interrupts.CPU88.RES:Rescheduling_interrupts
    179.75 ± 37%    +433.4%     958.75 ± 42%  interrupts.CPU88.TLB:TLB_shootdowns
      4996 ± 38%     +38.6%       6925 ± 17%  interrupts.CPU91.NMI:Non-maskable_interrupts
      4996 ± 38%     +38.6%       6925 ± 17%  interrupts.CPU91.PMI:Performance_monitoring_interrupts
    910.25 ± 19%     +94.8%       1773 ± 36%  interrupts.CPU93.CAL:Function_call_interrupts
     28.75 ±102%    +433.0%     153.25 ± 34%  interrupts.CPU93.RES:Rescheduling_interrupts
    111.75 ±153%    +646.5%     834.25 ± 47%  interrupts.CPU93.TLB:TLB_shootdowns
    229.00 ± 24%     -58.3%      95.50 ± 71%  interrupts.CPU98.RES:Rescheduling_interrupts


                                                                                
                            will-it-scale.per_thread_ops                        
                                                                                
  275000 +------------------------------------------------------------------+   
         |     O   O  O O O O O O O O      O   O O O O O   O   O    O O O O |   
  270000 |-+                                             O   O    O         |   
         |                                                                  |   
         |                            O      O                              |   
  265000 |-+                                                                |   
         |.+.+.+.+.+.. .+.+.+.+.+.+.+.+.+..+.+.+.+                          |   
  260000 |-+          +                          :                          |   
         |                                        :                         |   
  255000 |-+                                      :                         |   
         |                                         :                        |   
         |                                         +.   .+.   .+..+         |   
  250000 |-+                                         +.+   +.+              |   
         |                                                                  |   
  245000 +------------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample
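
As a reading aid for the table above, the short Python sketch below reproduces the arithmetic behind a few rows from the figures shown there. The function names are hypothetical. Two points are assumptions rather than statements from the report: that nr_task 50% on the 144-thread machine means 72 threads (consistent with the will-it-scale.72.threads row), and that perf-stat.overall.path-length is total instructions divided by workload (the values agree to within the rounding of the printed instruction counts).

```python
def pct_change(base, new):
    """Relative change in percent, as in rows whose %change ends in '%'."""
    return (new - base) / base * 100.0


def point_delta(base_pct, new_pct):
    """Absolute difference in percentage points, as in the perf-profile rows."""
    return new_pct - base_pct


# Values copied from the lkp-hsw-4ex1 table.
BASE_WORKLOAD = 18011165   # will-it-scale.workload, a91bd6223e
NEW_WORKLOAD = 19618836    # will-it-scale.workload, ef0ba05538
THREADS = 72               # assumption: 50% of 144 hardware threads

# Throughput improvement for the workload row.
print(round(pct_change(BASE_WORKLOAD, NEW_WORKLOAD), 1))   # 8.9

# per_thread_ops is approximately workload / threads.
print(BASE_WORKLOAD / THREADS, NEW_WORKLOAD / THREADS)

# perf-profile rows report a point delta, e.g. __poll: 72.66 -> 64.98.
print(round(point_delta(72.66, 64.98), 1))                 # -7.7

# Assumed definition of path-length: instructions per operation.
print(9.046e13 / BASE_WORKLOAD, 9.19e13 / NEW_WORKLOAD)
```

Read this way, the workload rose by 8.9% while the instructions executed per operation fell by about 6.7%, which is consistent with the commit shortening the poll() code path.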

***************************************************************************************************
lkp-ivb-2ep1: 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 112G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/thread/16/debian-10.4-x86_64-20200603.cgz/lkp-ivb-2ep1/poll2/will-it-scale/0x42e

commit: 
  a91bd6223e ("Revert "init/console: Use ttynull as a fallback when there is no console"")
  ef0ba05538 ("poll: fix performance regression due to out-of-line __put_user()")

a91bd6223ecd46ad ef0ba05538299f1391cbe097de3 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   4103673            +7.1%    4394585        will-it-scale.16.threads
    256479            +7.1%     274661        will-it-scale.per_thread_ops
   4103673            +7.1%    4394585        will-it-scale.workload
    766.75            +2.6%     786.89 ±  2%  boot-time.idle
     54928            -2.0%      53825        proc-vmstat.pgreuse
     68.41 ± 11%     +24.0%      84.79 ± 17%  sched_debug.cfs_rq:/.load_avg.stddev
     34.08 ± 30%     +54.1%      52.50 ± 26%  sched_debug.cfs_rq:/.removed.load_avg.stddev
     72.95 ± 20%     +45.8%     106.38 ± 31%  sched_debug.cfs_rq:/.removed.runnable_avg.max
     11.88 ± 28%     +68.7%      20.04 ± 32%  sched_debug.cfs_rq:/.removed.runnable_avg.stddev
     72.95 ± 20%     +45.8%     106.38 ± 31%  sched_debug.cfs_rq:/.removed.util_avg.max
     11.88 ± 28%     +68.8%      20.04 ± 32%  sched_debug.cfs_rq:/.removed.util_avg.stddev
      0.00 ± 17%     +47.4%       0.01 ± 17%  perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.path_openat
      0.02 ±  7%     -35.1%       0.02 ± 18%  perf-sched.sch_delay.max.ms.do_nanosleep.hrtimer_nanosleep.__x64_sys_nanosleep.do_syscall_64
      1567 ±  7%     -34.7%       1024 ± 35%  perf-sched.wait_and_delay.avg.ms.futex_wait_queue_me.futex_wait.do_futex.__x64_sys_futex
      6003 ± 14%     -31.0%       4141 ± 10%  perf-sched.wait_and_delay.max.ms.worker_thread.kthread.ret_from_fork
      1567 ±  7%     -34.7%       1024 ± 35%  perf-sched.wait_time.avg.ms.futex_wait_queue_me.futex_wait.do_futex.__x64_sys_futex
      3.16 ± 15%     -35.1%       2.05 ± 58%  perf-sched.wait_time.avg.ms.rcu_gp_kthread.kthread.ret_from_fork
      6002 ± 14%     -31.0%       4141 ± 10%  perf-sched.wait_time.max.ms.worker_thread.kthread.ret_from_fork
      0.69 ±  6%      +0.2        0.86 ± 16%  perf-profile.calltrace.cycles-pp.__check_object_size.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.01 ±173%      +0.0        0.06 ± 14%  perf-profile.children.cycles-pp.clockevents_program_event
      0.01 ±173%      +0.1        0.06 ± 17%  perf-profile.children.cycles-pp.poll_select_set_timeout
      0.18 ± 15%      +0.1        0.27 ± 20%  perf-profile.children.cycles-pp.__virt_addr_valid
      0.71 ±  6%      +0.2        0.89 ± 15%  perf-profile.children.cycles-pp.__check_object_size
      0.11 ±  7%      +0.0        0.15 ± 14%  perf-profile.self.cycles-pp.do_syscall_64
      0.01 ±173%      +0.1        0.06 ± 17%  perf-profile.self.cycles-pp.poll_select_set_timeout
      0.17 ± 16%      +0.1        0.26 ± 21%  perf-profile.self.cycles-pp.__virt_addr_valid
      0.64 ±  6%      +0.1        0.73 ± 11%  perf-profile.self.cycles-pp.__fdget
     18348 ±  6%     +14.1%      20941 ±  5%  softirqs.CPU2.RCU
      7162 ±  5%     +25.8%       9007 ±  8%  softirqs.CPU20.RCU
     10576 ± 13%     +19.6%      12644 ± 11%  softirqs.CPU25.RCU
     10627 ± 11%     +31.5%      13970 ± 18%  softirqs.CPU29.RCU
      9132 ±  9%     +28.2%      11710 ± 10%  softirqs.CPU33.RCU
      9969 ± 16%     +27.4%      12699 ± 10%  softirqs.CPU34.RCU
      9463 ±  3%     +14.6%      10843 ±  5%  softirqs.CPU37.RCU
      9952 ±  7%     +21.0%      12041 ±  6%  softirqs.CPU38.RCU
     15774 ±  8%     +15.8%      18261 ±  3%  softirqs.CPU4.RCU
      6414 ±  5%     +27.1%       8151 ± 18%  softirqs.CPU42.RCU
      7342 ±  3%     +23.3%       9057 ± 10%  softirqs.CPU44.RCU
     15373 ±  8%     +17.8%      18113 ±  5%  softirqs.CPU8.RCU
 1.729e+10            -2.7%  1.682e+10        perf-stat.i.branch-instructions
      0.21            +0.0        0.23 ±  2%  perf-stat.i.branch-miss-rate%
  34023615            +5.6%   35930017 ±  2%  perf-stat.i.branch-misses
      0.09            +0.0        0.09        perf-stat.i.dTLB-store-miss-rate%
   8987462            +7.2%    9630858        perf-stat.i.dTLB-store-misses
 1.053e+10            +2.1%  1.075e+10        perf-stat.i.dTLB-stores
   5043576            +7.2%    5405769 ±  2%  perf-stat.i.iTLB-load-misses
     13393            -7.0%      12449        perf-stat.i.instructions-per-iTLB-miss
      0.20            +0.0        0.21 ±  2%  perf-stat.overall.branch-miss-rate%
      0.09            +0.0        0.09        perf-stat.overall.dTLB-store-miss-rate%
     13396            -7.0%      12452        perf-stat.overall.instructions-per-iTLB-miss
   4961998            -7.1%    4607701        perf-stat.overall.path-length
 1.723e+10            -2.7%  1.677e+10        perf-stat.ps.branch-instructions
  33921356            +5.6%   35811411 ±  2%  perf-stat.ps.branch-misses
   8957329            +7.2%    9598546        perf-stat.ps.dTLB-store-misses
  1.05e+10            +2.1%  1.072e+10        perf-stat.ps.dTLB-stores
   5026750            +7.2%    5387439 ±  2%  perf-stat.ps.iTLB-load-misses
      6123 ± 73%     -93.8%     378.00 ± 87%  interrupts.40:PCI-MSI.2621446-edge.eth0-TxRx-5
      6622 ± 27%     -47.9%       3449 ± 42%  interrupts.CPU10.NMI:Non-maskable_interrupts
      6622 ± 27%     -47.9%       3449 ± 42%  interrupts.CPU10.PMI:Performance_monitoring_interrupts
      7050 ± 18%     -50.1%       3516 ± 18%  interrupts.CPU11.NMI:Non-maskable_interrupts
      7050 ± 18%     -50.1%       3516 ± 18%  interrupts.CPU11.PMI:Performance_monitoring_interrupts
    395.50 ± 23%     -46.6%     211.25 ± 27%  interrupts.CPU12.TLB:TLB_shootdowns
      4738 ± 42%     -39.4%       2872 ± 28%  interrupts.CPU14.NMI:Non-maskable_interrupts
      4738 ± 42%     -39.4%       2872 ± 28%  interrupts.CPU14.PMI:Performance_monitoring_interrupts
      7226 ± 24%     -47.1%       3826 ± 16%  interrupts.CPU2.NMI:Non-maskable_interrupts
      7226 ± 24%     -47.1%       3826 ± 16%  interrupts.CPU2.PMI:Performance_monitoring_interrupts
    430.25 ± 68%     -55.3%     192.25 ± 23%  interrupts.CPU21.NMI:Non-maskable_interrupts
    430.25 ± 68%     -55.3%     192.25 ± 23%  interrupts.CPU21.PMI:Performance_monitoring_interrupts
      1706 ± 16%     -30.7%       1181 ± 16%  interrupts.CPU23.CAL:Function_call_interrupts
      1036 ± 11%     +14.8%       1189 ±  4%  interrupts.CPU27.CAL:Function_call_interrupts
    156.50 ± 75%    +113.4%     334.00 ± 19%  interrupts.CPU27.TLB:TLB_shootdowns
      1033 ± 10%     +19.6%       1235 ± 11%  interrupts.CPU29.CAL:Function_call_interrupts
    135.50 ± 91%    +171.4%     367.75 ± 40%  interrupts.CPU29.TLB:TLB_shootdowns
      6123 ± 73%     -93.8%     378.00 ± 87%  interrupts.CPU31.40:PCI-MSI.2621446-edge.eth0-TxRx-5
      1029 ±  3%     +16.6%       1200 ± 10%  interrupts.CPU37.CAL:Function_call_interrupts
      1197 ±  8%     -16.1%       1005 ± 10%  interrupts.CPU5.CAL:Function_call_interrupts
    333.75 ± 30%     -62.5%     125.00 ± 92%  interrupts.CPU5.TLB:TLB_shootdowns
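
Note on reading the change column: the rows above appear to use two conventions. Count-style metrics (softirqs, interrupts, most perf-stat lines) show a relative change in percent, while metrics that are already percentages (perf-profile cycles-pp, the miss-rate% lines) show an absolute difference with no percent sign. The sketch below recomputes a few rows under that assumption; the helper names are hypothetical and are not part of the lkp tooling.

```python
# Minimal sketch, assuming the two change-column conventions described above.
# Helper names are illustrative only; they do not come from lkp-tests.

def relative_change_pct(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def absolute_change(base: float, new: float) -> float:
    """Plain difference, used for metrics that are already percentages."""
    return new - base


# perf-stat.overall.path-length: 4961998 -> 4607701
path_length = relative_change_pct(4961998, 4607701)

# softirqs.CPU2.RCU: 18348 -> 20941
softirq_cpu2 = relative_change_pct(18348, 20941)

# perf-profile.children.cycles-pp.__check_object_size: 0.71 -> 0.89
check_object_size = absolute_change(0.71, 0.89)

print(f"path-length:         {path_length:+.1f}%")
print(f"softirqs.CPU2.RCU:   {softirq_cpu2:+.1f}%")
print(f"__check_object_size: {check_object_size:+.1f}")
```

The printed columns hold the rounded means, so recomputing from them can differ in the last digit from the reported change, which is presumably derived from unrounded values (compare the two 0.01 -> 0.06 rows shown as +0.0 and +0.1).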





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Oliver Sang


View attachment "config-5.11.0-rc2-00182-gef0ba0553829" of type "text/plain" (172414 bytes)

View attachment "job-script" of type "text/plain" (7795 bytes)

View attachment "job.yaml" of type "text/plain" (5351 bytes)

View attachment "reproduce" of type "text/plain" (336 bytes)
