Message-ID: <20200616073908.GG5653@shao2-debian>
Date:   Tue, 16 Jun 2020 15:39:08 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Mel Gorman <mgorman@...hsingularity.net>
Cc:     Ingo Molnar <mingo@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Jirka Hladky <jhladky@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Hillf Danton <hdanton@...a.com>,
        Rik van Riel <riel@...riel.com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: [sched/core] 2ebb177175: will-it-scale.per_thread_ops -3.7%
 regression

Greetings,

FYI, we noticed a 3.7% regression in will-it-scale.per_thread_ops due to commit:


commit: 2ebb17717550607bcd85fb8cf7d24ac870e9d762 ("sched/core: Offload wakee task activation if it the wakee is descheduling")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: will-it-scale
on test machine: 8 threads Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz with 16G memory
with the following parameters:

	nr_task: 100%
	mode: thread
	test: pthread_mutex1
	cpufreq_governor: performance
	ucode: 0x21

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based version of each test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
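For orientation, the two headline metrics in the results below (will-it-scale.per_thread_ops and will-it-scale.workload) appear to differ by the thread count. With nr_task at 100% on this 8-thread machine, dividing the reported workload by 8 lands within one unit of the reported per-thread figure. The sketch below only checks that arithmetic; the helper name is hypothetical and the relationship is an observation from the numbers in this report, not a statement about how lkp-tests computes the metric.

```python
# Hypothetical helper: assumes per_thread_ops is total workload divided
# evenly across the running threads (an inference from this report's numbers).
def per_thread_ops(workload: float, nr_threads: int) -> float:
    return workload / nr_threads


# Values copied from the results table in this report (8 threads, nr_task=100%).
NR_THREADS = 8
base_workload, base_reported = 12515025, 1564377      # c6e7bd7afa
patched_workload, patched_reported = 12052684, 1506585  # 2ebb177175

for workload, reported in ((base_workload, base_reported),
                           (patched_workload, patched_reported)):
    derived = per_thread_ops(workload, NR_THREADS)
    # The table shows averages over several runs, so allow a small rounding gap.
    print(f"derived={derived:.1f} reported={reported} gap={abs(derived - reported):.1f}")
```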

In addition, the commit has a significant impact on the following tests:

+------------------+--------------------------------------------------------------------------+
| testcase: change | lmbench3: lmbench3.AF_UNIX.sock.stream.bandwidth.MB/sec 5.1% improvement |
| test machine     | 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 48G memory         |
| test parameters  | cpufreq_governor=performance                                             |
|                  | mode=development                                                         |
|                  | nr_threads=50%                                                           |
|                  | test=UNIX                                                                |
|                  | test_memory_size=50%                                                     |
|                  | ucode=0x7000019                                                          |
+------------------+--------------------------------------------------------------------------+
| testcase: change | vm-scalability: vm-scalability.throughput 4.6% improvement               |
| test machine     | 16 threads Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz with 32G memory        |
| test parameters  | cpufreq_governor=performance                                             |
|                  | runtime=300s                                                             |
|                  | size=2T                                                                  |
|                  | test=shm-pread-seq-mt                                                    |
|                  | ucode=0xca                                                               |
+------------------+--------------------------------------------------------------------------+


If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>


Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-7.6/thread/100%/debian-x86_64-20191114.cgz/lkp-ivb-d01/pthread_mutex1/will-it-scale/0x21

commit: 
  c6e7bd7afa ("sched/core: Optimize ttwu() spinning on p->on_cpu")
  2ebb177175 ("sched/core: Offload wakee task activation if it the wakee is descheduling")

c6e7bd7afaeb3af5 2ebb17717550607bcd85fb8cf7d 
---------------- --------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
           :4           25%           1:4     dmesg.RIP:find_vma
          1:4          -25%            :4     dmesg.RIP:loop
           :4           25%           1:4     dmesg.RIP:poll_idle
           :4           25%           1:4     dmesg.RIP:release_pages
           :4           25%           1:4     kmsg.a562403>]usb_hcd_irq
          1:4          -25%            :4     kmsg.a9be0b>]usb_hcd_irq
           :4           25%           1:4     kmsg.c867d3b>]usb_hcd_irq
           :4           25%           1:4     kmsg.e2bf9>]usb_hcd_irq
          1:4          -25%            :4     kmsg.e6afaad>]usb_hcd_irq
          1:4          -25%            :4     kmsg.ef22>]usb_hcd_irq
         %stddev     %change         %stddev
             \          |                \  
   1564377            -3.7%    1506585        will-it-scale.per_thread_ops
  12515025            -3.7%   12052684        will-it-scale.workload
      7295 ± 18%     +32.3%       9651 ±  3%  slabinfo.kmalloc-32.active_objs
      7295 ± 18%     +32.3%       9651 ±  3%  slabinfo.kmalloc-32.num_objs
     52712 ±167%     -99.8%     122.25 ±  9%  softirqs.CPU7.NET_RX
    167833 ± 37%     -96.0%       6770 ± 70%  softirqs.NET_RX
      8.17            -6.2        1.95 ±  2%  mpstat.cpu.all.idle%
      0.05 ± 39%      -0.0        0.00 ± 92%  mpstat.cpu.all.soft%
     36.39            +4.6       40.97        mpstat.cpu.all.usr%
     54.75            +2.3%      56.00        vmstat.cpu.sy
     36.00           +11.8%      40.25        vmstat.cpu.us
    879009            +9.8%     964967        vmstat.system.cs
     80622          +438.9%     434480        vmstat.system.in
   8567964 ± 11%     -24.3%    6482942 ±  4%  cpuidle.C1.time
   7399663 ± 10%     -46.7%    3944245 ±  5%  cpuidle.C1.usage
   4762729 ± 15%     -76.7%    1108338 ±  4%  cpuidle.C1E.time
   2825633 ± 12%     -92.5%     213227 ± 11%  cpuidle.C1E.usage
   1392828 ± 18%     -89.8%     142380 ± 13%  cpuidle.C3.usage
   9138464 ± 19%     -39.7%    5510711 ± 12%  cpuidle.C6.time
  49832321           -80.0%    9988732 ±  5%  cpuidle.POLL.time
  67398769 ±  2%     -89.1%    7335934 ±  6%  cpuidle.POLL.usage
     54149 ±  3%     +12.6%      60956 ±  5%  sched_debug.cfs_rq:/.load.stddev
     10821 ± 78%    -121.0%      -2277        sched_debug.cfs_rq:/.spread0.avg
     -6726          +202.0%     -20313        sched_debug.cfs_rq:/.spread0.min
  16417766            +9.9%   18035507        sched_debug.cpu.nr_switches.avg
  15873293           +10.9%   17597029        sched_debug.cpu.nr_switches.min
     12.58 ± 14%     -22.5%       9.75 ± 24%  sched_debug.cpu.nr_uninterruptible.max
  16413879            +9.9%   18031360        sched_debug.cpu.sched_count.avg
  15869787           +10.9%   17594327        sched_debug.cpu.sched_count.min
   8083678           +10.0%    8890426        sched_debug.cpu.sched_goidle.avg
   7842348           +11.0%    8704700        sched_debug.cpu.sched_goidle.min
   8254567            +9.8%    9067404        sched_debug.cpu.ttwu_count.avg
   8516632           +10.5%    9406669        sched_debug.cpu.ttwu_count.max
     96163 ±169%     -99.8%     181.75 ±  3%  interrupts.56:PCI-MSI.528392-edge.eth3-TxRx-7
      1402 ± 11%  +8.9e+06%  1.247e+08        interrupts.CAL:Function_call_interrupts
     94.75 ± 23%  +1.6e+07%   15469594 ±  2%  interrupts.CPU0.CAL:Function_call_interrupts
   2421597 ±  4%     -91.6%     203339 ±  5%  interrupts.CPU0.RES:Rescheduling_interrupts
    208.75 ± 75%  +7.4e+06%   15412781        interrupts.CPU1.CAL:Function_call_interrupts
   2235968 ±  5%     -91.6%     188736 ±  6%  interrupts.CPU1.RES:Rescheduling_interrupts
    126.00 ±  6%  +1.2e+07%   15593440 ±  2%  interrupts.CPU2.CAL:Function_call_interrupts
   2283945 ±  6%     -91.4%     195832 ±  2%  interrupts.CPU2.RES:Rescheduling_interrupts
    132.25 ±  4%  +1.2e+07%   15737578 ±  3%  interrupts.CPU3.CAL:Function_call_interrupts
   2219149 ±  3%     -90.9%     201796 ± 10%  interrupts.CPU3.RES:Rescheduling_interrupts
    228.50 ± 54%  +6.9e+06%   15654017 ±  2%  interrupts.CPU4.CAL:Function_call_interrupts
   2366589           -91.9%     190886 ±  8%  interrupts.CPU4.RES:Rescheduling_interrupts
    357.75 ± 72%  +4.3e+06%   15413624 ±  2%  interrupts.CPU5.CAL:Function_call_interrupts
   2251765 ±  3%     -92.3%     172812 ±  7%  interrupts.CPU5.RES:Rescheduling_interrupts
    138.00 ±  6%  +1.1e+07%   15629542        interrupts.CPU6.CAL:Function_call_interrupts
   2225560 ±  4%     -91.7%     185344 ±  7%  interrupts.CPU6.RES:Rescheduling_interrupts
     96163 ±169%     -99.8%     181.75 ±  3%  interrupts.CPU7.56:PCI-MSI.528392-edge.eth3-TxRx-7
    116.50 ±  7%  +1.4e+07%   15816683 ±  2%  interrupts.CPU7.CAL:Function_call_interrupts
   2361144 ±  3%     -91.8%     194748 ±  8%  interrupts.CPU7.RES:Rescheduling_interrupts
  18365719           -91.7%    1533497 ±  4%  interrupts.RES:Rescheduling_interrupts
    103.25 ±  5%    +159.1%     267.50 ±  5%  interrupts.TLB:TLB_shootdowns
     13.47 ±  3%     -19.3%      10.86 ±  3%  perf-stat.i.MPKI
 1.746e+09            +1.9%  1.779e+09        perf-stat.i.branch-instructions
  47675183            +1.8%   48541350        perf-stat.i.branch-misses
 1.145e+08 ±  3%     -18.3%   93535517 ±  3%  perf-stat.i.cache-references
    885445            +9.7%     971331        perf-stat.i.context-switches
      3.43            -1.1%       3.39        perf-stat.i.cpi
 2.155e+09            +3.4%  2.228e+09        perf-stat.i.dTLB-loads
  1.73e+09            +5.6%  1.827e+09        perf-stat.i.dTLB-stores
   5869850 ±  5%     -16.5%    4900813        perf-stat.i.iTLB-load-misses
    786437 ± 13%     -30.9%     543649 ± 16%  perf-stat.i.iTLB-loads
 8.535e+09            +1.3%  8.642e+09        perf-stat.i.instructions
      1492 ±  4%     +20.6%       1798        perf-stat.i.instructions-per-iTLB-miss
      0.29            +1.1%       0.30        perf-stat.i.ipc
    720.16            +3.2%     742.92        perf-stat.i.metric.M/sec
     13.41 ±  3%     -19.3%      10.82 ±  3%  perf-stat.overall.MPKI
      3.42            -1.1%       3.39        perf-stat.overall.cpi
      1457 ±  4%     +21.0%       1763        perf-stat.overall.instructions-per-iTLB-miss
      0.29            +1.1%       0.30        perf-stat.overall.ipc
    205548            +5.0%     215927        perf-stat.overall.path-length
  1.74e+09            +1.9%  1.773e+09        perf-stat.ps.branch-instructions
  47519673            +1.8%   48383718        perf-stat.ps.branch-misses
 1.141e+08 ±  3%     -18.3%   93221019 ±  3%  perf-stat.ps.cache-references
    882476            +9.7%     968061        perf-stat.ps.context-switches
 2.148e+09            +3.4%  2.221e+09        perf-stat.ps.dTLB-loads
 1.724e+09            +5.6%  1.821e+09        perf-stat.ps.dTLB-stores
   5850190 ±  5%     -16.5%    4884334        perf-stat.ps.iTLB-load-misses
    783812 ± 13%     -30.9%     541830 ± 16%  perf-stat.ps.iTLB-loads
 8.507e+09            +1.3%  8.614e+09        perf-stat.ps.instructions
 2.572e+12            +1.2%  2.602e+12        perf-stat.total.instructions
      8.95 ±  2%      -6.5        2.50 ±  3%  perf-profile.calltrace.cycles-pp.try_to_wake_up.wake_up_q.futex_wake.do_futex.__x64_sys_futex
      9.15 ±  2%      -6.4        2.79 ±  3%  perf-profile.calltrace.cycles-pp.wake_up_q.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
     22.31            -4.6       17.69        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.__lll_unlock_wake
     21.97            -4.6       17.37        perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     40.70            -4.6       36.13        perf-profile.calltrace.cycles-pp.__lll_unlock_wake
     23.13            -4.6       18.56        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.__lll_unlock_wake
     30.54            -4.5       26.03        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__lll_unlock_wake
     31.06            -4.5       26.59        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__lll_unlock_wake
      2.81 ±  5%      -2.4        0.40 ± 58%  perf-profile.calltrace.cycles-pp.menu_select.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
      3.15 ± 10%      -1.9        1.28 ± 10%  perf-profile.calltrace.cycles-pp.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
      3.11 ± 10%      -1.8        1.26 ± 10%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary
      1.41            -0.4        0.96        perf-profile.calltrace.cycles-pp.pick_next_task_fair.__sched_text_start.schedule_idle.do_idle.cpu_startup_entry
      1.02 ±  2%      -0.4        0.65 ±  2%  perf-profile.calltrace.cycles-pp.set_next_entity.pick_next_task_fair.__sched_text_start.schedule_idle.do_idle
      4.32            +0.2        4.52        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.__lll_unlock_wake
      0.85 ±  4%      +0.2        1.10 ±  8%  perf-profile.calltrace.cycles-pp.unwind_next_frame.arch_stack_walk.stack_trace_save_tsk.__account_scheduler_latency.update_stats_enqueue_sleeper
      4.13            +0.3        4.40        perf-profile.calltrace.cycles-pp.__sched_text_start.schedule.futex_wait_queue_me.futex_wait.do_futex
      4.25            +0.3        4.55        perf-profile.calltrace.cycles-pp.schedule.futex_wait_queue_me.futex_wait.do_futex.__x64_sys_futex
      3.69 ±  3%      +0.3        4.00 ±  2%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64.__lll_lock_wait
      0.26 ±100%      +0.3        0.58 ±  6%  perf-profile.calltrace.cycles-pp.mark_wake_futex.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
      4.69            +0.4        5.05        perf-profile.calltrace.cycles-pp.futex_wait_queue_me.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
      1.49 ±  4%      +0.5        2.04 ±  6%  perf-profile.calltrace.cycles-pp.arch_stack_walk.stack_trace_save_tsk.__account_scheduler_latency.update_stats_enqueue_sleeper.enqueue_entity
      1.65 ±  4%      +0.6        2.27 ±  4%  perf-profile.calltrace.cycles-pp.stack_trace_save_tsk.__account_scheduler_latency.update_stats_enqueue_sleeper.enqueue_entity.enqueue_task_fair
      0.00            +0.6        0.62 ±  7%  perf-profile.calltrace.cycles-pp.call_function_single_interrupt.tick_nohz_idle_enter.do_idle.cpu_startup_entry.start_secondary
      2.11 ±  5%      +0.6        2.73 ±  4%  perf-profile.calltrace.cycles-pp.update_stats_enqueue_sleeper.enqueue_entity.enqueue_task_fair.activate_task.ttwu_do_activate
      1.91 ±  6%      +0.7        2.62 ±  4%  perf-profile.calltrace.cycles-pp.__account_scheduler_latency.update_stats_enqueue_sleeper.enqueue_entity.enqueue_task_fair.activate_task
      0.00            +0.7        0.72 ±  8%  perf-profile.calltrace.cycles-pp.tick_nohz_idle_enter.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
      5.89 ±  2%      +0.8        6.71        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.futex_wait_setup.futex_wait.do_futex
      7.28 ±  2%      +0.9        8.22        perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex
     10.35 ±  3%      +1.2       11.52 ±  3%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     10.44 ±  3%      +1.2       11.63 ±  4%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
     10.43 ±  3%      +1.2       11.62 ±  3%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
      7.39 ±  4%      +1.2        8.58        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.futex_wake.do_futex.__x64_sys_futex
      0.00            +1.2        1.20 ±  4%  perf-profile.calltrace.cycles-pp.ttwu_queue_wakelist.try_to_wake_up.wake_up_q.futex_wake.do_futex
     12.97            +1.3       14.24        perf-profile.calltrace.cycles-pp.futex_wait_setup.futex_wait.do_futex.__x64_sys_futex.do_syscall_64
      0.00            +1.3        1.30 ± 25%  perf-profile.calltrace.cycles-pp.__sched_text_start.schedule_idle.do_idle.cpu_startup_entry.start_kernel
      0.00            +1.3        1.32 ± 25%  perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_kernel.secondary_startup_64
     12.23 ±  2%      +1.4       13.61        perf-profile.calltrace.cycles-pp.secondary_startup_64
      8.60 ±  3%      +1.4       10.00 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_lock.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
     18.68            +1.7       20.36        perf-profile.calltrace.cycles-pp.futex_wait.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     18.91            +1.7       20.63        perf-profile.calltrace.cycles-pp.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.__lll_lock_wait
     19.56            +1.7       21.30        perf-profile.calltrace.cycles-pp.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe.__lll_lock_wait
     25.94            +1.9       27.88        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__lll_lock_wait
     26.43            +2.0       28.40        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__lll_lock_wait
     37.00            +2.7       39.73        perf-profile.calltrace.cycles-pp.__lll_lock_wait
      0.00            +3.6        3.57 ±  5%  perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.activate_task.ttwu_do_activate.sched_ttwu_pending
      0.00            +4.0        4.04 ±  3%  perf-profile.calltrace.cycles-pp.enqueue_task_fair.activate_task.ttwu_do_activate.sched_ttwu_pending.flush_smp_call_function_queue
      0.00            +4.1        4.09 ±  4%  perf-profile.calltrace.cycles-pp.activate_task.ttwu_do_activate.sched_ttwu_pending.flush_smp_call_function_queue.smp_call_function_single_interrupt
      0.00            +4.1        4.11 ±  4%  perf-profile.calltrace.cycles-pp.ttwu_do_activate.sched_ttwu_pending.flush_smp_call_function_queue.smp_call_function_single_interrupt.call_function_single_interrupt
      0.00            +4.8        4.84        perf-profile.calltrace.cycles-pp.sched_ttwu_pending.flush_smp_call_function_queue.smp_call_function_single_interrupt.call_function_single_interrupt.finish_task_switch
      0.00            +5.1        5.07        perf-profile.calltrace.cycles-pp.flush_smp_call_function_queue.smp_call_function_single_interrupt.call_function_single_interrupt.finish_task_switch.__sched_text_start
      2.52            +5.2        7.70 ±  4%  perf-profile.calltrace.cycles-pp.__sched_text_start.schedule_idle.do_idle.cpu_startup_entry.start_secondary
      2.59            +5.3        7.85 ±  4%  perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
      0.00            +5.4        5.39        perf-profile.calltrace.cycles-pp.smp_call_function_single_interrupt.call_function_single_interrupt.finish_task_switch.__sched_text_start.schedule_idle
      0.00            +6.2        6.25 ±  4%  perf-profile.calltrace.cycles-pp.call_function_single_interrupt.finish_task_switch.__sched_text_start.schedule_idle.do_idle
      0.00            +6.3        6.27        perf-profile.calltrace.cycles-pp.finish_task_switch.__sched_text_start.schedule_idle.do_idle.cpu_startup_entry
      9.00 ±  2%      -6.5        2.52 ±  2%  perf-profile.children.cycles-pp.try_to_wake_up
      9.15 ±  2%      -6.3        2.80 ±  3%  perf-profile.children.cycles-pp.wake_up_q
     22.02            -4.6       17.42        perf-profile.children.cycles-pp.futex_wake
     40.94            -4.3       36.66        perf-profile.children.cycles-pp.__lll_unlock_wake
     41.31            -2.9       38.42        perf-profile.children.cycles-pp.do_futex
     42.74            -2.8       39.91        perf-profile.children.cycles-pp.__x64_sys_futex
      3.29 ±  3%      -2.7        0.63 ±  8%  perf-profile.children.cycles-pp.menu_select
     56.69            -2.6       54.14        perf-profile.children.cycles-pp.do_syscall_64
     57.68            -2.5       55.17        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      3.67 ± 10%      -2.2        1.50 ± 10%  perf-profile.children.cycles-pp.cpuidle_enter
      3.64 ± 10%      -2.1        1.50 ± 10%  perf-profile.children.cycles-pp.cpuidle_enter_state
      2.24 ±  4%      -1.8        0.43 ±  8%  perf-profile.children.cycles-pp.poll_idle
      1.71 ±  7%      -1.4        0.33 ± 12%  perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
      1.35 ±  6%      -1.1        0.24 ± 12%  perf-profile.children.cycles-pp.tick_nohz_next_event
      1.09 ±  8%      -0.9        0.19 ± 14%  perf-profile.children.cycles-pp.get_next_timer_interrupt
      0.76 ±  9%      -0.6        0.15 ± 18%  perf-profile.children.cycles-pp.__next_timer_interrupt
      1.91            -0.5        1.40 ±  3%  perf-profile.children.cycles-pp.pick_next_task_fair
      0.52 ±  6%      -0.4        0.08 ±  5%  perf-profile.children.cycles-pp.select_task_rq_fair
      1.21 ±  2%      -0.4        0.79 ±  3%  perf-profile.children.cycles-pp.set_next_entity
      1.73 ±  2%      -0.4        1.35 ±  3%  perf-profile.children.cycles-pp.update_load_avg
      0.43 ± 13%      -0.3        0.08 ± 17%  perf-profile.children.cycles-pp._find_next_bit
      0.36 ±  7%      -0.3        0.10 ±  7%  perf-profile.children.cycles-pp.tick_nohz_idle_exit
      0.38 ±  3%      -0.2        0.14 ±  8%  perf-profile.children.cycles-pp.ktime_get
      0.26 ± 11%      -0.2        0.05 ± 58%  perf-profile.children.cycles-pp.hrtimer_next_event_without
      0.29 ±  3%      -0.2        0.11 ±  8%  perf-profile.children.cycles-pp.read_tsc
      0.61 ±  2%      -0.2        0.43 ±  3%  perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
      0.40 ± 11%      -0.2        0.22 ±  5%  perf-profile.children.cycles-pp.check_preempt_curr
      0.44 ±  9%      -0.2        0.27 ±  5%  perf-profile.children.cycles-pp.ttwu_do_wakeup
      0.44 ±  4%      -0.1        0.31 ±  3%  perf-profile.children.cycles-pp.sched_clock_cpu
      0.45 ±  4%      -0.1        0.33 ±  3%  perf-profile.children.cycles-pp.update_rq_clock
      0.21 ± 14%      -0.1        0.11 ±  4%  perf-profile.children.cycles-pp.resched_curr
      0.39 ±  5%      -0.1        0.28 ±  5%  perf-profile.children.cycles-pp.sched_clock
      0.38 ±  5%      -0.1        0.27 ±  6%  perf-profile.children.cycles-pp.native_sched_clock
      0.21 ±  6%      -0.1        0.12 ± 10%  perf-profile.children.cycles-pp.__list_del_entry_valid
      0.24 ±  3%      -0.1        0.16 ±  5%  perf-profile.children.cycles-pp.pick_next_entity
      0.15 ±  5%      -0.1        0.09 ± 15%  perf-profile.children.cycles-pp.place_entity
      0.20 ± 12%      -0.0        0.15 ±  4%  perf-profile.children.cycles-pp.switch_mm_irqs_off
      0.41 ±  5%      -0.0        0.37 ±  4%  perf-profile.children.cycles-pp.___perf_sw_event
      0.05 ±  8%      +0.0        0.07 ± 12%  perf-profile.children.cycles-pp.__enqueue_entity
      0.18 ±  6%      +0.0        0.21 ±  3%  perf-profile.children.cycles-pp.put_prev_task_fair
      0.12 ±  8%      +0.0        0.16 ±  7%  perf-profile.children.cycles-pp.__calc_delta
      0.12 ±  3%      +0.0        0.15 ±  9%  perf-profile.children.cycles-pp.switch_fpu_return
      0.04 ± 57%      +0.0        0.08 ± 12%  perf-profile.children.cycles-pp.rb_erase
      0.12 ± 17%      +0.1        0.17 ± 12%  perf-profile.children.cycles-pp.pick_next_task_idle
      0.00            +0.1        0.06 ±  7%  perf-profile.children.cycles-pp.rcu_irq_enter
      0.00            +0.1        0.06        perf-profile.children.cycles-pp.wake_csd_func
      0.00            +0.1        0.06 ±  6%  perf-profile.children.cycles-pp.irq_work_run
      0.00            +0.1        0.06 ±  6%  perf-profile.children.cycles-pp.is_bpf_text_address
      0.00            +0.1        0.06 ± 13%  perf-profile.children.cycles-pp.is_module_text_address
      0.00            +0.1        0.07 ±  7%  perf-profile.children.cycles-pp.in_lock_functions
      0.51 ±  4%      +0.1        0.58 ±  6%  perf-profile.children.cycles-pp.mark_wake_futex
      0.00            +0.1        0.07 ± 17%  perf-profile.children.cycles-pp.put_task_stack
      0.00            +0.1        0.08 ± 11%  perf-profile.children.cycles-pp.ftrace_graph_ret_addr
      0.00            +0.1        0.08 ± 14%  perf-profile.children.cycles-pp.irq_work_run_list
      0.00            +0.1        0.08 ± 21%  perf-profile.children.cycles-pp.get_stack_info
      1.51 ±  2%      +0.1        1.59 ±  3%  perf-profile.children.cycles-pp.hash_futex
      0.00            +0.1        0.09 ± 10%  perf-profile.children.cycles-pp.__default_send_IPI_dest_field
      0.04 ± 57%      +0.1        0.13 ± 10%  perf-profile.children.cycles-pp.__x86_indirect_thunk_rbx
      0.00            +0.1        0.09 ±  7%  perf-profile.children.cycles-pp.native_apic_mem_write
      0.10 ± 12%      +0.1        0.20 ±  9%  perf-profile.children.cycles-pp.__unwind_start
      0.42 ±  4%      +0.1        0.52 ±  3%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
      0.09 ±  9%      +0.1        0.21 ±  3%  perf-profile.children.cycles-pp.in_sched_functions
      0.60            +0.1        0.73 ±  3%  perf-profile.children.cycles-pp.__switch_to
      0.00            +0.1        0.13 ±  7%  perf-profile.children.cycles-pp.tick_irq_enter
      0.08 ± 10%      +0.1        0.23 ±  3%  perf-profile.children.cycles-pp.stack_access_ok
      0.00            +0.2        0.15 ±  3%  perf-profile.children.cycles-pp.interrupt_entry
      0.00            +0.2        0.17 ±  6%  perf-profile.children.cycles-pp._flat_send_IPI_mask
      0.10 ±  8%      +0.2        0.29 ± 10%  perf-profile.children.cycles-pp.orc_find
      0.15 ±  5%      +0.2        0.39 ±  4%  perf-profile.children.cycles-pp.kernel_text_address
      0.00            +0.2        0.25 ±  5%  perf-profile.children.cycles-pp.irq_enter
      0.18 ±  7%      +0.3        0.46 ±  8%  perf-profile.children.cycles-pp.__orc_find
      0.18 ±  7%      +0.3        0.46 ±  4%  perf-profile.children.cycles-pp.__kernel_text_address
      4.27            +0.3        4.57        perf-profile.children.cycles-pp.schedule
      0.23 ±  8%      +0.3        0.56 ±  3%  perf-profile.children.cycles-pp.unwind_get_return_address
      0.07 ± 22%      +0.3        0.41 ±  4%  perf-profile.children.cycles-pp.native_irq_return_iret
      4.70            +0.4        5.05        perf-profile.children.cycles-pp.futex_wait_queue_me
      0.27 ±  5%      +0.4        0.69 ±  2%  perf-profile.children.cycles-pp.stack_trace_consume_entry_nosched
      0.00            +0.4        0.42 ±  3%  perf-profile.children.cycles-pp.generic_exec_single
      0.00            +0.5        0.46 ±  3%  perf-profile.children.cycles-pp.smp_call_function_single_async
      8.04 ±  2%      +0.5        8.54 ±  2%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      0.00            +0.5        0.51 ±  4%  perf-profile.children.cycles-pp.llist_add_batch
      0.35 ±  4%      +0.5        0.90 ±  4%  perf-profile.children.cycles-pp.tick_nohz_idle_enter
      0.88 ±  4%      +1.1        1.93 ±  4%  perf-profile.children.cycles-pp.unwind_next_frame
     10.44 ±  3%      +1.2       11.63 ±  4%  perf-profile.children.cycles-pp.start_secondary
      0.00            +1.2        1.21 ±  4%  perf-profile.children.cycles-pp.ttwu_queue_wakelist
     13.04            +1.3       14.31        perf-profile.children.cycles-pp.futex_wait_setup
     12.23 ±  2%      +1.3       13.53 ±  2%  perf-profile.children.cycles-pp.do_idle
     12.23 ±  2%      +1.4       13.61        perf-profile.children.cycles-pp.secondary_startup_64
     12.23 ±  2%      +1.4       13.61        perf-profile.children.cycles-pp.cpu_startup_entry
     18.69            +1.7       20.38        perf-profile.children.cycles-pp.futex_wait
      4.22 ±  4%      +1.9        6.12 ±  2%  perf-profile.children.cycles-pp.enqueue_task_fair
      4.25 ±  4%      +1.9        6.17 ±  2%  perf-profile.children.cycles-pp.activate_task
      4.27 ±  4%      +1.9        6.21 ±  2%  perf-profile.children.cycles-pp.ttwu_do_activate
      1.52 ±  5%      +2.0        3.49 ±  3%  perf-profile.children.cycles-pp.arch_stack_walk
      1.66 ±  4%      +2.0        3.64 ±  3%  perf-profile.children.cycles-pp.stack_trace_save_tsk
      3.77 ±  4%      +2.1        5.83 ±  3%  perf-profile.children.cycles-pp.enqueue_entity
     13.37 ±  3%      +2.1       15.42        perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     16.48 ±  2%      +2.1       18.57        perf-profile.children.cycles-pp._raw_spin_lock
      2.12 ±  5%      +2.3        4.40 ±  4%  perf-profile.children.cycles-pp.update_stats_enqueue_sleeper
      1.93 ±  5%      +2.3        4.21 ±  4%  perf-profile.children.cycles-pp.__account_scheduler_latency
     37.42            +2.8       40.19        perf-profile.children.cycles-pp.__lll_lock_wait
      3.04            +6.1        9.19        perf-profile.children.cycles-pp.schedule_idle
      7.15            +6.3       13.47        perf-profile.children.cycles-pp.__sched_text_start
      0.00            +6.5        6.47        perf-profile.children.cycles-pp.flush_smp_call_function_queue
      0.10 ± 12%      +6.5        6.58        perf-profile.children.cycles-pp.sched_ttwu_pending
      0.46 ±  3%      +6.7        7.18        perf-profile.children.cycles-pp.finish_task_switch
      0.00            +6.9        6.89        perf-profile.children.cycles-pp.smp_call_function_single_interrupt
      0.00            +7.8        7.82        perf-profile.children.cycles-pp.call_function_single_interrupt
      2.84            -2.4        0.41        perf-profile.self.cycles-pp.try_to_wake_up
      2.11 ±  4%      -1.8        0.27 ± 13%  perf-profile.self.cycles-pp.poll_idle
      1.20 ±  3%      -1.0        0.22 ± 10%  perf-profile.self.cycles-pp.menu_select
      0.42 ± 13%      -0.3        0.08 ± 14%  perf-profile.self.cycles-pp._find_next_bit
      0.50 ±  4%      -0.3        0.21 ±  9%  perf-profile.self.cycles-pp.do_idle
      0.30 ±  9%      -0.2        0.05 ± 60%  perf-profile.self.cycles-pp.__next_timer_interrupt
      0.56 ±  3%      -0.2        0.32 ±  5%  perf-profile.self.cycles-pp.set_next_entity
      0.60 ±  2%      -0.2        0.41 ±  5%  perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
      0.27 ±  4%      -0.2        0.10 ±  8%  perf-profile.self.cycles-pp.read_tsc
      0.67 ±  4%      -0.2        0.50 ±  5%  perf-profile.self.cycles-pp.update_load_avg
      0.45 ±  6%      -0.2        0.29 ±  5%  perf-profile.self.cycles-pp.enqueue_task_fair
      0.29 ±  7%      -0.1        0.14 ±  5%  perf-profile.self.cycles-pp.update_rq_clock
      0.21 ± 15%      -0.1        0.11 ±  4%  perf-profile.self.cycles-pp.resched_curr
      0.21 ±  5%      -0.1        0.11 ±  3%  perf-profile.self.cycles-pp.pick_next_entity
      0.36 ±  4%      -0.1        0.26 ±  6%  perf-profile.self.cycles-pp.native_sched_clock
      0.21 ±  6%      -0.1        0.11 ±  7%  perf-profile.self.cycles-pp.__list_del_entry_valid
      0.11 ±  3%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.cpuidle_enter_state
      0.28 ±  3%      -0.1        0.22 ±  5%  perf-profile.self.cycles-pp.dequeue_task_fair
      0.14 ± 11%      -0.1        0.08 ± 15%  perf-profile.self.cycles-pp.place_entity
      0.37 ±  6%      -0.0        0.33 ±  3%  perf-profile.self.cycles-pp.___perf_sw_event
      0.11 ±  4%      -0.0        0.08 ± 14%  perf-profile.self.cycles-pp.stack_trace_save_tsk
      0.30 ±  4%      -0.0        0.27 ±  6%  perf-profile.self.cycles-pp.__x86_indirect_thunk_rax
      0.16 ±  9%      -0.0        0.13        perf-profile.self.cycles-pp.account_entity_dequeue
      0.08 ±  8%      +0.0        0.10 ±  4%  perf-profile.self.cycles-pp.put_prev_task_fair
      0.07 ± 10%      +0.0        0.10 ±  5%  perf-profile.self.cycles-pp.schedule_idle
      0.18 ±  5%      +0.0        0.21 ±  7%  perf-profile.self.cycles-pp.dequeue_entity
      0.12 ± 12%      +0.0        0.15 ±  5%  perf-profile.self.cycles-pp.__calc_delta
      0.20 ±  6%      +0.0        0.23 ±  4%  perf-profile.self.cycles-pp.futex_wait_queue_me
      0.11 ±  4%      +0.0        0.15 ± 10%  perf-profile.self.cycles-pp.switch_fpu_return
      0.01 ±173%      +0.0        0.06 ± 14%  perf-profile.self.cycles-pp.clear_buddies
      0.25 ±  3%      +0.0        0.30 ±  2%  perf-profile.self.cycles-pp.__unqueue_futex
      0.62 ±  2%      +0.1        0.67 ±  2%  perf-profile.self.cycles-pp.do_futex
      0.00            +0.1        0.06 ± 14%  perf-profile.self.cycles-pp.tick_irq_enter
      0.00            +0.1        0.06 ±  7%  perf-profile.self.cycles-pp.wake_csd_func
      0.11 ±  4%      +0.1        0.17 ±  3%  perf-profile.self.cycles-pp.pick_next_task_fair
      0.04 ± 58%      +0.1        0.10 ±  7%  perf-profile.self.cycles-pp.unwind_get_return_address
      0.01 ±173%      +0.1        0.07 ± 11%  perf-profile.self.cycles-pp.rb_erase
      0.00            +0.1        0.06 ±  6%  perf-profile.self.cycles-pp.ftrace_graph_ret_addr
      0.01 ±173%      +0.1        0.08 ± 10%  perf-profile.self.cycles-pp.tick_nohz_idle_enter
      0.00            +0.1        0.07 ± 22%  perf-profile.self.cycles-pp.put_task_stack
      0.00            +0.1        0.08 ±  8%  perf-profile.self.cycles-pp.__kernel_text_address
      0.00            +0.1        0.08 ±  8%  perf-profile.self.cycles-pp._flat_send_IPI_mask
      1.47 ±  2%      +0.1        1.56 ±  2%  perf-profile.self.cycles-pp.hash_futex
      0.00            +0.1        0.08 ±  5%  perf-profile.self.cycles-pp.native_apic_mem_write
      0.00            +0.1        0.09 ± 10%  perf-profile.self.cycles-pp.__default_send_IPI_dest_field
      0.06 ± 11%      +0.1        0.15 ±  7%  perf-profile.self.cycles-pp.in_sched_functions
      0.40 ±  5%      +0.1        0.49 ±  3%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
      0.01 ±173%      +0.1        0.10 ±  8%  perf-profile.self.cycles-pp.__x86_indirect_thunk_rbx
      0.05 ±  9%      +0.1        0.15 ±  9%  perf-profile.self.cycles-pp.kernel_text_address
      0.17 ±  8%      +0.1        0.28 ±  5%  perf-profile.self.cycles-pp.wake_up_q
      0.04 ±100%      +0.1        0.15 ±  7%  perf-profile.self.cycles-pp.arch_stack_walk
      0.07 ±  5%      +0.1        0.20 ±  4%  perf-profile.self.cycles-pp.stack_access_ok
      0.58            +0.1        0.71 ±  3%  perf-profile.self.cycles-pp.__switch_to
      0.00            +0.1        0.15 ±  5%  perf-profile.self.cycles-pp.call_function_single_interrupt
      0.00            +0.2        0.15 ±  2%  perf-profile.self.cycles-pp.interrupt_entry
      0.10 ±  8%      +0.2        0.27 ±  9%  perf-profile.self.cycles-pp.orc_find
      0.18 ± 18%      +0.2        0.38 ± 13%  perf-profile.self.cycles-pp.__account_scheduler_latency
      0.00            +0.2        0.21 ±  2%  perf-profile.self.cycles-pp.flush_smp_call_function_queue
      2.46 ±  2%      +0.2        2.69 ±  5%  perf-profile.self.cycles-pp.futex_wake
      2.80            +0.2        3.02        perf-profile.self.cycles-pp.__lll_lock_wait
      0.09 ± 17%      +0.2        0.32        perf-profile.self.cycles-pp.sched_ttwu_pending
      3.20 ±  2%      +0.2        3.44 ±  3%  perf-profile.self.cycles-pp.futex_wait_setup
      0.18 ±  7%      +0.3        0.45 ±  9%  perf-profile.self.cycles-pp.__orc_find
      0.17 ±  8%      +0.3        0.45 ±  3%  perf-profile.self.cycles-pp.stack_trace_consume_entry_nosched
      0.07 ± 22%      +0.3        0.41 ±  5%  perf-profile.self.cycles-pp.native_irq_return_iret
      0.00            +0.3        0.34 ± 13%  perf-profile.self.cycles-pp.ttwu_queue_wakelist
      0.51 ±  5%      +0.4        0.93 ±  4%  perf-profile.self.cycles-pp.unwind_next_frame
      7.17 ±  3%      +0.4        7.60 ±  2%  perf-profile.self.cycles-pp.entry_SYSCALL_64
      0.00            +0.5        0.51 ±  4%  perf-profile.self.cycles-pp.llist_add_batch
      0.40 ±  2%      +0.7        1.05 ±  3%  perf-profile.self.cycles-pp.finish_task_switch
     13.28 ±  3%      +2.1       15.33        perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath


                                                                                
                             will-it-scale.per_thread_ops                       
                                                                                
   1.6e+06 +----------------------------------------------------------------+   
           |                                                                |   
  1.58e+06 |.++.+. +    +. +.+. +.+ .+.+ .+.     .+. +.+.++                 |   
           | +    + +  +  +    +   +    +   ++.++   +      +    .++   ++    |   
  1.56e+06 |-+       ++                                     ++.+   + +      |   
           |                                                        +       |   
  1.54e+06 |-+                                                              |   
           |                                                                |   
  1.52e+06 |-+                                                              |   
           |                                                           O  O |   
   1.5e+06 |-O  O  O           O     O      O  O                  O      O  |   
           |  O   O   O   O                     O      O O  O  O O          |   
  1.48e+06 |-+       O  O  O O  O OO   OO O  O      OO              O O     |   
           |                                      O       O  O              |   
  1.46e+06 +----------------------------------------------------------------+   
                                                                                
                                will-it-scale.workload                          
                                                                                
  1.27e+07 +----------------------------------------------------------------+   
  1.26e+07 |.+: +. +    +. : +. +.+ +  + .+. +.  .+.++.+.++      +          |   
           | +    + +  :  +    +   +    +   +  ++          +    : +   ++    |   
  1.25e+07 |-+       + :                                    ++. :  + +      |   
  1.24e+07 |-+        +                                        +    +       |   
           |                                                                |   
  1.23e+07 |-+                                                              |   
  1.22e+07 |-+                                                              |   
  1.21e+07 |-+                                                              |   
           |                                                           O  O |   
   1.2e+07 |-O  O  O      O    O     O      O  O                  O      O  |   
  1.19e+07 |-+O   O   O                         O      O O  O  O O          |   
           |         O  O  O O  O OO   OO O  O      OO              O O     |   
  1.18e+07 |-+                                    O       O  O              |   
  1.17e+07 +----------------------------------------------------------------+   
                                                                                
                                                                                
[+] bisect-good sample
[O] bisect-bad  sample

***************************************************************************************************
lkp-bdw-de1: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 48G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_threads/rootfs/tbox_group/test/test_memory_size/testcase/ucode:
  gcc-9/performance/x86_64-rhel-7.6/development/50%/debian-x86_64-20191114.cgz/lkp-bdw-de1/UNIX/50%/lmbench3/0x7000019

commit: 
  c6e7bd7afa ("sched/core: Optimize ttwu() spinning on p->on_cpu")
  2ebb177175 ("sched/core: Offload wakee task activation if it the wakee is descheduling")

c6e7bd7afaeb3af5 2ebb17717550607bcd85fb8cf7d 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     27854            +5.1%      29272        lmbench3.AF_UNIX.sock.stream.bandwidth.MB/sec
     21.14 ±  2%      -9.2%      19.21        lmbench3.AF_UNIX.sock.stream.latency.us
    171.33 ± 57%     -37.4%     107.18        lmbench3.time.elapsed_time
    171.33 ± 57%     -37.4%     107.18        lmbench3.time.elapsed_time.max
      1099           -12.4%     963.47        lmbench3.time.system_time
    114.18 ± 82%     -51.3%      55.56 ±  3%  lmbench3.time.user_time
  37759348 ±  8%     -37.8%   23471465        lmbench3.time.voluntary_context_switches
     19.59 ±  3%      -2.8%      19.04        boot-time.dhcp
     36767 ±  5%    +227.1%     120272 ±  5%  vmstat.system.in
      8806 ± 16%     -19.3%       7103        slabinfo.kmalloc-32.active_objs
      8806 ± 16%     -19.3%       7103        slabinfo.kmalloc-32.num_objs
  14571053 ±  7%     +72.1%   25080410 ±  3%  cpuidle.C1.time
   3733909 ±  5%     +28.5%    4798750 ±  2%  cpuidle.C1.usage
  7.71e+08 ± 92%     -84.0%  1.236e+08 ± 83%  cpuidle.C1E.time
   2313799 ± 56%    +115.6%    4988567 ±  7%  cpuidle.C1E.usage
  59806456 ±  6%     -71.0%   17318399 ±  2%  cpuidle.POLL.time
  27941426 ± 10%     -76.8%    6496055 ±  2%  cpuidle.POLL.usage
      4512 ±  4%      -7.9%       4156        proc-vmstat.nr_shmem
  96678375 ±  2%      +3.1%   99643960        proc-vmstat.numa_hit
  96678375 ±  2%      +3.1%   99643960        proc-vmstat.numa_local
      3425 ± 47%     -96.0%     136.50 ±  5%  proc-vmstat.pgactivate
 3.712e+08 ±  2%      +2.9%   3.82e+08        proc-vmstat.pgalloc_normal
    573423 ± 20%     -14.3%     491235        proc-vmstat.pgfault
 3.712e+08 ±  2%      +2.9%  3.819e+08        proc-vmstat.pgfree
    111.33 ± 11%     +10.2%     122.63        perf-stat.i.MPKI
   1018422 ± 31%     +83.8%    1871714        perf-stat.i.iTLB-loads
      0.43 ±  5%      -6.3%       0.40        perf-stat.i.ipc
      3764 ± 23%     +18.0%       4442        perf-stat.i.minor-faults
      3764 ± 23%     +18.0%       4442        perf-stat.i.page-faults
     78.37 ± 10%     +16.4%      91.23        perf-stat.overall.MPKI
      2.02            +0.1        2.13        perf-stat.overall.branch-miss-rate%
      2.69 ±  3%      +9.1%       2.93        perf-stat.overall.cpi
     34.57 ±  7%      -7.1%      32.13        perf-stat.overall.cycles-between-cache-misses
     54.23 ±  2%     -12.4       41.88        perf-stat.overall.iTLB-load-miss-rate%
      7084 ±  3%     -10.6%       6331 ±  2%  perf-stat.overall.instructions-per-iTLB-miss
      0.37 ±  3%      -8.4%       0.34        perf-stat.overall.ipc
   1012704 ± 30%     +84.6%    1869854        perf-stat.ps.iTLB-loads
      3739 ± 23%     +17.9%       4410        perf-stat.ps.minor-faults
      3739 ± 23%     +17.9%       4410        perf-stat.ps.page-faults
 1.213e+12 ± 14%     -24.1%  9.208e+11        perf-stat.total.instructions
      0.53 ± 17%     +74.2%       0.92 ± 14%  sched_debug.cfs_rq:/.nr_running.avg
      0.49 ± 16%     -40.7%       0.29 ± 34%  sched_debug.cfs_rq:/.nr_running.stddev
    620.31 ± 30%     +44.9%     898.80 ±  5%  sched_debug.cfs_rq:/.util_avg.avg
     98.08 ± 57%    +386.9%     477.56 ±  6%  sched_debug.cfs_rq:/.util_est_enqueued.avg
    513.58 ± 21%     +88.7%     969.38 ± 17%  sched_debug.cfs_rq:/.util_est_enqueued.max
      0.04 ±173%   +1100.0%       0.50        sched_debug.cfs_rq:/.util_est_enqueued.min
    153.45 ± 29%     +68.8%     259.10 ±  8%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
    475089 ± 27%     -65.4%     164406 ± 16%  sched_debug.cpu.avg_idle.avg
    961683 ±  6%     -40.6%     571662 ±  6%  sched_debug.cpu.avg_idle.max
    226966 ± 17%     -40.5%     135098 ±  7%  sched_debug.cpu.avg_idle.stddev
      0.53 ± 18%     +57.8%       0.83 ± 36%  sched_debug.cpu.clock.stddev
      0.53 ± 18%     +57.8%       0.83 ± 36%  sched_debug.cpu.clock_task.stddev
    789.47 ± 45%    +130.5%       1819 ±  6%  sched_debug.cpu.curr->pid.avg
    776.40 ± 33%     -46.4%     415.80 ± 31%  sched_debug.cpu.curr->pid.stddev
      0.58 ± 13%     +82.4%       1.05 ± 19%  sched_debug.cpu.nr_running.avg
     17.25 ± 17%     -37.7%      10.75 ± 28%  sched_debug.cpu.nr_uninterruptible.max
     41987 ± 26%     -22.9%      32387 ±  4%  softirqs.CPU0.SCHED
     38670 ± 29%     -33.1%      25858 ±  4%  softirqs.CPU1.SCHED
     12645 ±169%     -98.7%     168.75 ± 29%  softirqs.CPU11.NET_RX
     40096 ± 30%     -30.9%      27712 ±  2%  softirqs.CPU11.SCHED
     39769 ± 34%     -30.6%      27619 ±  9%  softirqs.CPU13.SCHED
     29901 ±  5%     -10.7%      26709 ±  3%  softirqs.CPU14.SCHED
     69467 ± 58%     -99.8%     149.75 ±  7%  softirqs.CPU15.NET_RX
     36370 ± 25%     -20.3%      28988 ±  9%  softirqs.CPU15.SCHED
     92873 ± 50%     -53.2%      43426 ±  4%  softirqs.CPU15.TIMER
     39843 ± 31%     -31.7%      27217 ±  3%  softirqs.CPU2.SCHED
     73109 ± 61%     -40.6%      43398 ±  6%  softirqs.CPU3.TIMER
     30721 ± 12%     -43.3%      17409 ± 59%  softirqs.CPU4.RCU
     38887 ± 35%     -33.0%      26069 ±  4%  softirqs.CPU4.SCHED
     39018 ± 34%     -29.2%      27637 ±  6%  softirqs.CPU5.SCHED
     34819 ± 21%     -21.6%      27297 ±  5%  softirqs.CPU7.SCHED
    608506 ± 30%     -26.8%     445287        softirqs.SCHED
     24964 ±172%     -99.7%      75.25 ± 36%  interrupts.37:IR-PCI-MSI.2621444-edge.eth0-TxRx-3
    103205 ± 59%     -99.9%      65.50 ±  2%  interrupts.41:IR-PCI-MSI.2621448-edge.eth0-TxRx-7
      3280 ± 26%    +3e+05%    9751612        interrupts.CAL:Function_call_interrupts
      1014 ± 68%  +59535.1%     604699 ±  3%  interrupts.CPU0.CAL:Function_call_interrupts
     23141 ±  5%     -34.4%      15174 ±  6%  interrupts.CPU0.RES:Rescheduling_interrupts
    234.25 ± 68%  +2.4e+05%     558579 ± 13%  interrupts.CPU1.CAL:Function_call_interrupts
     22907 ±  7%     -39.0%      13965 ± 11%  interrupts.CPU1.RES:Rescheduling_interrupts
    109.50 ±  9%  +5.3e+05%     580959 ± 10%  interrupts.CPU10.CAL:Function_call_interrupts
     24940 ± 13%     -42.4%      14376 ±  9%  interrupts.CPU10.RES:Rescheduling_interrupts
     24964 ±172%     -99.7%      75.25 ± 36%  interrupts.CPU11.37:IR-PCI-MSI.2621444-edge.eth0-TxRx-3
    106.25 ± 17%  +5.5e+05%     583288 ± 17%  interrupts.CPU11.CAL:Function_call_interrupts
     28683 ± 19%     -51.4%      13942 ± 18%  interrupts.CPU11.RES:Rescheduling_interrupts
    116.25 ± 10%  +5.3e+05%     613414        interrupts.CPU12.CAL:Function_call_interrupts
     25330 ±  5%     -33.5%      16848 ± 26%  interrupts.CPU12.RES:Rescheduling_interrupts
    117.75 ± 10%  +5.2e+05%     616142 ±  2%  interrupts.CPU13.CAL:Function_call_interrupts
     23262 ±  7%     -39.1%      14161 ±  9%  interrupts.CPU13.RES:Rescheduling_interrupts
    107.75 ± 22%  +5.8e+05%     620186 ±  3%  interrupts.CPU14.CAL:Function_call_interrupts
     20402 ±  8%     -26.9%      14918 ±  6%  interrupts.CPU14.RES:Rescheduling_interrupts
    103205 ± 59%     -99.9%      65.50 ±  2%  interrupts.CPU15.41:IR-PCI-MSI.2621448-edge.eth0-TxRx-7
    115.25 ±  3%  +5.2e+05%     600664 ±  5%  interrupts.CPU15.CAL:Function_call_interrupts
     46830 ± 52%     -69.3%      14377 ± 15%  interrupts.CPU15.RES:Rescheduling_interrupts
    355.00 ± 73%  +1.8e+05%     626144 ±  4%  interrupts.CPU2.CAL:Function_call_interrupts
     23992 ±  7%     -38.2%      14815 ±  4%  interrupts.CPU2.RES:Rescheduling_interrupts
    108.75 ±  8%  +5.7e+05%     624218        interrupts.CPU3.CAL:Function_call_interrupts
     24539 ±  3%     -38.0%      15218 ±  4%  interrupts.CPU3.RES:Rescheduling_interrupts
    108.75 ± 17%  +5.5e+05%     602705 ±  5%  interrupts.CPU4.CAL:Function_call_interrupts
     23504 ±  3%     -36.3%      14967 ±  8%  interrupts.CPU4.RES:Rescheduling_interrupts
    353.25 ±127%  +1.8e+05%     623711 ±  5%  interrupts.CPU5.CAL:Function_call_interrupts
     24063 ±  9%     -36.8%      15218 ±  5%  interrupts.CPU5.RES:Rescheduling_interrupts
    125.00 ± 52%    +5e+05%     624097        interrupts.CPU6.CAL:Function_call_interrupts
    102.25 ±  6%  +6.1e+05%     627198 ±  3%  interrupts.CPU7.CAL:Function_call_interrupts
     25608 ±  8%     -42.6%      14697 ±  7%  interrupts.CPU7.RES:Rescheduling_interrupts
    105.00 ±  5%  +5.9e+05%     623090 ±  5%  interrupts.CPU8.CAL:Function_call_interrupts
     24989 ±  6%     -34.9%      16266 ± 14%  interrupts.CPU8.RES:Rescheduling_interrupts
    101.50 ±  7%  +6.1e+05%     622512 ±  2%  interrupts.CPU9.CAL:Function_call_interrupts
     24250 ± 15%     -39.9%      14570 ±  3%  interrupts.CPU9.RES:Rescheduling_interrupts
    407214 ±  4%     -41.4%     238607 ±  6%  interrupts.RES:Rescheduling_interrupts
      0.43 ± 57%      +0.2        0.61 ±  3%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
      0.45 ± 57%      +0.2        0.64        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
      0.00            +0.6        0.62 ±  4%  perf-profile.calltrace.cycles-pp.__sched_text_start.schedule_idle.do_idle.cpu_startup_entry.start_secondary
      0.00            +0.6        0.64 ±  5%  perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
      1.85 ±129%      -1.5        0.40 ± 13%  perf-profile.children.cycles-pp.smp_apic_timer_interrupt
      1.07 ±100%      -0.9        0.18 ±  7%  perf-profile.children.cycles-pp.menu_select
      0.72 ±115%      -0.6        0.15 ± 31%  perf-profile.children.cycles-pp.start_kernel
      0.65 ±106%      -0.6        0.09        perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
      0.52 ±111%      -0.5        0.06 ±  6%  perf-profile.children.cycles-pp.tick_nohz_next_event
      0.43 ±109%      -0.4        0.05 ±  8%  perf-profile.children.cycles-pp.get_next_timer_interrupt
      0.44 ±119%      -0.3        0.09 ± 42%  perf-profile.children.cycles-pp.irq_exit
      0.36 ±109%      -0.3        0.08 ± 48%  perf-profile.children.cycles-pp.__softirqentry_text_start
      0.11 ± 49%      -0.1        0.04 ± 58%  perf-profile.children.cycles-pp.sched_clock_cpu
      0.04 ± 58%      +0.0        0.07 ± 14%  perf-profile.children.cycles-pp.tick_nohz_idle_enter
      0.20 ± 57%      +0.1        0.28 ±  2%  perf-profile.children.cycles-pp.__check_heap_object
      0.43 ± 57%      +0.2        0.61 ±  3%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      0.51 ± 57%      +0.2        0.73 ±  2%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.02 ±173%      +0.3        0.31 ± 63%  perf-profile.children.cycles-pp.write
      0.00            +0.3        0.30 ± 64%  perf-profile.children.cycles-pp._fini
      0.00            +0.3        0.30 ± 64%  perf-profile.children.cycles-pp.devkmsg_write.cold
      0.00            +0.3        0.30 ± 64%  perf-profile.children.cycles-pp.devkmsg_emit
      0.31 ± 58%      +0.4        0.68 ±  4%  perf-profile.children.cycles-pp.schedule_idle
      0.17 ±115%      +0.4        0.60 ± 37%  perf-profile.children.cycles-pp.ret_from_fork
      0.00            +0.4        0.43 ±  5%  perf-profile.children.cycles-pp.flush_smp_call_function_queue
      0.17 ±118%      +0.4        0.60 ± 37%  perf-profile.children.cycles-pp.kthread
      0.00            +0.4        0.44 ±  5%  perf-profile.children.cycles-pp.sched_ttwu_pending
      0.01 ±173%      +0.4        0.46 ±  5%  perf-profile.children.cycles-pp.finish_task_switch
      0.14 ±152%      +0.5        0.60 ± 37%  perf-profile.children.cycles-pp.worker_thread
      0.13 ±152%      +0.5        0.60 ± 37%  perf-profile.children.cycles-pp.process_one_work
      0.00            +0.5        0.46 ±  6%  perf-profile.children.cycles-pp.smp_call_function_single_interrupt
      0.12 ±150%      +0.5        0.59 ± 38%  perf-profile.children.cycles-pp.drm_fb_helper_dirty_work
      0.12 ±150%      +0.5        0.60 ± 37%  perf-profile.children.cycles-pp.memcpy_erms
      0.00            +0.5        0.53 ±  5%  perf-profile.children.cycles-pp.call_function_single_interrupt
     30.52 ± 57%     +10.9       41.47        perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
      0.31 ± 88%      -0.2        0.06 ± 11%  perf-profile.self.cycles-pp.menu_select
      0.00            +0.1        0.07 ± 10%  perf-profile.self.cycles-pp.finish_task_switch
      0.18 ± 57%      +0.1        0.25 ±  3%  perf-profile.self.cycles-pp.__alloc_skb
      0.19 ± 57%      +0.1        0.27 ±  3%  perf-profile.self.cycles-pp.__check_heap_object
      0.43 ± 57%      +0.2        0.61 ±  3%  perf-profile.self.cycles-pp.entry_SYSCALL_64
      0.51 ± 57%      +0.2        0.73 ±  2%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.12 ±150%      +0.5        0.59 ± 37%  perf-profile.self.cycles-pp.memcpy_erms
     30.30 ± 57%     +10.9       41.22        perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
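
Note on reading the table above: the %change column appears to be the relative difference between the two per-commit means, while rows whose metric is already a percentage (perf-profile.*, *-rate%) show a plain difference in percentage points. A minimal sketch under that assumption; the helper names pct_change and point_delta are hypothetical and not part of the LKP tooling:

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from the base-commit mean to the new-commit mean, in percent."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference, for rows whose metric is already a percentage."""
    return new - base


# Means copied from the lmbench3 table above.
print(f"{pct_change(27854, 29272):+.1f}%")        # AF_UNIX.sock.stream.bandwidth.MB/sec
print(f"{pct_change(37759348, 23471465):+.1f}%")  # time.voluntary_context_switches
print(f"{point_delta(2.02, 2.13):+.1f}")          # perf-stat.overall.branch-miss-rate%
```

Because the table prints rounded means, recomputing from the displayed values can differ from the listed %change in the last digit (for example, lmbench3.time.system_time gives about -12.3% here against the listed -12.4%).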



***************************************************************************************************
lkp-cfl-e1: 16 threads Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz with 32G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-7.6/debian-x86_64-20191114.cgz/300s/2T/lkp-cfl-e1/shm-pread-seq-mt/vm-scalability/0xca

commit: 
  c6e7bd7afa ("sched/core: Optimize ttwu() spinning on p->on_cpu")
  2ebb177175 ("sched/core: Offload wakee task activation if it the wakee is descheduling")

c6e7bd7afaeb3af5 2ebb17717550607bcd85fb8cf7d 
---------------- --------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
          4:4           59%           7:4     perf-profile.calltrace.cycles-pp.sync_regs.error_entry
          5:4           65%           7:4     perf-profile.calltrace.cycles-pp.error_entry
          0:4            2%           0:4     perf-profile.children.cycles-pp.error_exit
          6:4           42%           8:4     perf-profile.children.cycles-pp.error_entry
          0:4            5%           1:4     perf-profile.self.cycles-pp.error_entry
         %stddev     %change         %stddev
             \          |                \  
      1.33 ±  4%     +11.2%       1.48 ±  3%  vm-scalability.free_time
    851611            +4.6%     890826        vm-scalability.median
  13614520            +4.6%   14242306        vm-scalability.throughput
    200.41            -3.5%     193.31        vm-scalability.time.elapsed_time
    200.41            -3.5%     193.31        vm-scalability.time.elapsed_time.max
      1404            -9.2%       1275        vm-scalability.time.system_time
  45736309            -3.2%   44254910        vm-scalability.time.voluntary_context_switches
     11.69            -1.1%      11.56        boot-time.dhcp
      1345           -75.6%     328.25 ±173%  meminfo.Mlocked
      0.03 ±  8%      -0.0        0.02        mpstat.cpu.all.iowait%
      0.00 ± 47%      +0.0        0.01 ± 33%  mpstat.cpu.all.soft%
     37.00            +6.1%      39.25        vmstat.cpu.us
     38400          +173.5%     105030        vmstat.system.in
    473214            +5.6%     499701 ±  2%  proc-vmstat.nr_active_anon
    336.00           -75.6%      82.00 ±173%  proc-vmstat.nr_mlock
    473214            +5.6%     499701 ±  2%  proc-vmstat.nr_zone_active_anon
  31263826 ± 13%     -79.6%    6387902 ± 27%  cpuidle.C1E.time
  13122053 ± 62%     -96.8%     423713 ± 26%  cpuidle.C3.time
    164213 ± 76%     -98.9%       1852 ±  7%  cpuidle.C3.usage
 3.522e+08 ±  3%     -68.4%  1.114e+08 ±  8%  cpuidle.C6.time
    393571 ±  2%     -69.8%     118841 ±  9%  cpuidle.C6.usage
  15947522 ± 85%   +1925.1%   3.23e+08        cpuidle.C8.time
     17123 ± 85%   +1808.7%     326826        cpuidle.C8.usage
 1.433e+08           -15.2%  1.215e+08        cpuidle.POLL.time
  44000720           -32.7%   29624016        cpuidle.POLL.usage
    413.56 ±  4%     +36.0%     562.25 ± 21%  sched_debug.cfs_rq:/.load_avg.max
     34.44 ± 18%     -37.4%      21.56 ± 12%  sched_debug.cfs_rq:/.load_avg.min
    107.86 ±  7%     +49.8%     161.61 ± 13%  sched_debug.cfs_rq:/.load_avg.stddev
      0.88 ±  3%     -22.1%       0.68 ±  4%  sched_debug.cfs_rq:/.nr_running.avg
      1570 ±  3%     -11.8%       1384 ±  4%  sched_debug.cfs_rq:/.runnable_avg.avg
      3386 ±  8%     -14.9%       2883 ±  8%  sched_debug.cfs_rq:/.runnable_avg.max
    755.62 ± 14%     -31.4%     518.62 ± 20%  sched_debug.cfs_rq:/.runnable_avg.min
    874.96 ±  3%     -22.8%     675.27 ±  3%  sched_debug.cfs_rq:/.util_avg.avg
      1603 ±  7%     -14.2%       1375 ±  7%  sched_debug.cfs_rq:/.util_avg.max
    327.44 ± 23%     -46.9%     174.00 ± 36%  sched_debug.cfs_rq:/.util_avg.min
    631.63 ±  2%     -32.3%     427.44 ±  5%  sched_debug.cfs_rq:/.util_est_enqueued.avg
      1299 ± 10%     -19.9%       1040 ±  3%  sched_debug.cfs_rq:/.util_est_enqueued.max
    298.71 ±  9%     -17.5%     246.55 ±  8%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
     48852 ± 12%    +465.2%     276109 ±  3%  sched_debug.cpu.avg_idle.avg
    299441 ± 23%     +71.3%     513046 ± 12%  sched_debug.cpu.avg_idle.max
      6887 ± 54%   +2540.0%     181823 ±  5%  sched_debug.cpu.avg_idle.min
      2043           -23.5%       1563 ±  8%  sched_debug.cpu.curr->pid.avg
      0.95 ±  5%     -20.5%       0.76 ±  2%  sched_debug.cpu.nr_running.avg
      3142 ±  6%     -29.6%       2212 ±  4%  sched_debug.cpu.ttwu_local.avg
      3847 ±  4%     -26.8%       2818 ±  6%  sched_debug.cpu.ttwu_local.max
      2095 ±  4%     -37.4%       1311 ± 12%  sched_debug.cpu.ttwu_local.min
    452.04 ±  3%     -15.4%     382.27 ±  9%  sched_debug.cpu.ttwu_local.stddev
     23.47            -4.9%      22.33        perf-stat.i.MPKI
 1.466e+10            +3.2%  1.512e+10        perf-stat.i.branch-instructions
      0.27            -0.0        0.26        perf-stat.i.branch-miss-rate%
  35079947            -3.3%   33932860        perf-stat.i.branch-misses
      4.87 ±  2%      +0.5        5.34        perf-stat.i.cache-miss-rate%
  22324938            +7.2%   23942996 ±  2%  perf-stat.i.cache-misses
 1.159e+09            -2.0%  1.135e+09        perf-stat.i.cache-references
      1.14            -4.5%       1.09        perf-stat.i.cpi
 5.678e+10            -1.3%  5.604e+10        perf-stat.i.cpu-cycles
    483.63 ±  7%     -31.2%     332.90 ±  4%  perf-stat.i.cpu-migrations
      3178            -9.1%       2888 ±  2%  perf-stat.i.cycles-between-cache-misses
   8390702            +3.2%    8663324        perf-stat.i.dTLB-load-misses
 1.384e+10            +3.1%  1.427e+10        perf-stat.i.dTLB-loads
   1861136            +2.8%    1912345        perf-stat.i.dTLB-store-misses
 2.847e+09            +2.6%   2.92e+09        perf-stat.i.dTLB-stores
   5273728            -7.3%    4886534        perf-stat.i.iTLB-load-misses
    300994           +11.6%     335774        perf-stat.i.iTLB-loads
 4.801e+10            +3.1%  4.947e+10        perf-stat.i.instructions
     12131 ±  3%     +19.0%      14438        perf-stat.i.instructions-per-iTLB-miss
      0.91            +4.7%       0.95        perf-stat.i.ipc
      3.55            -1.3%       3.50        perf-stat.i.metric.GHz
      0.06 ±  6%     -13.9%       0.05 ±  3%  perf-stat.i.metric.K/sec
      2030            +3.0%       2090        perf-stat.i.metric.M/sec
   1837644            +3.0%    1892278        perf-stat.i.minor-faults
  11223565            +3.4%   11602002        perf-stat.i.node-stores
   1837644            +3.0%    1892278        perf-stat.i.page-faults
     24.14            -4.9%      22.94        perf-stat.overall.MPKI
      0.24            -0.0        0.22        perf-stat.overall.branch-miss-rate%
      1.93            +0.2        2.11 ±  2%  perf-stat.overall.cache-miss-rate%
      1.18            -4.2%       1.13        perf-stat.overall.cpi
      2542            -7.9%       2341 ±  2%  perf-stat.overall.cycles-between-cache-misses
      9103           +11.2%      10127        perf-stat.overall.instructions-per-iTLB-miss
      0.85            +4.4%       0.88        perf-stat.overall.ipc
 1.458e+10            +3.2%  1.504e+10        perf-stat.ps.branch-instructions
  34918363            -3.3%   33760716        perf-stat.ps.branch-misses
  22218390            +7.2%   23822438 ±  2%  perf-stat.ps.cache-misses
 1.153e+09            -2.1%  1.129e+09        perf-stat.ps.cache-references
 5.649e+10            -1.3%  5.574e+10        perf-stat.ps.cpu-cycles
    482.86 ±  7%     -31.3%     331.59 ±  4%  perf-stat.ps.cpu-migrations
   8347828            +3.2%    8617195        perf-stat.ps.dTLB-load-misses
 1.377e+10            +3.1%  1.419e+10        perf-stat.ps.dTLB-loads
   1851614            +2.7%    1902131        perf-stat.ps.dTLB-store-misses
 2.833e+09            +2.5%  2.905e+09        perf-stat.ps.dTLB-stores
   5246720            -7.4%    4860447        perf-stat.ps.iTLB-load-misses
    299445           +11.5%     333986        perf-stat.ps.iTLB-loads
 4.776e+10            +3.0%  4.921e+10        perf-stat.ps.instructions
   1828228            +2.9%    1882160        perf-stat.ps.minor-faults
  11166367            +3.3%   11539989        perf-stat.ps.node-stores
   1828228            +2.9%    1882160        perf-stat.ps.page-faults
    253.50 ± 23%     +39.1%     352.50 ± 18%  interrupts.132:IR-PCI-MSI.2097153-edge.eth1-TxRx-0
      4397 ±  7%  +3.2e+05%   14161297 ±  2%  interrupts.CAL:Function_call_interrupts
    302.25 ± 98%    +3e+05%     908756 ±  3%  interrupts.CPU0.CAL:Function_call_interrupts
    402913            -9.5%     364602        interrupts.CPU0.LOC:Local_timer_interrupts
     48441 ± 21%     -65.9%      16507 ±  5%  interrupts.CPU0.RES:Rescheduling_interrupts
    253.50 ± 23%     +39.1%     352.50 ± 18%  interrupts.CPU1.132:IR-PCI-MSI.2097153-edge.eth1-TxRx-0
    174.50 ±  9%  +5.2e+05%     900085 ±  3%  interrupts.CPU1.CAL:Function_call_interrupts
    403045            -9.5%     364819        interrupts.CPU1.LOC:Local_timer_interrupts
     61990 ±  2%     -69.5%      18915 ±  5%  interrupts.CPU1.RES:Rescheduling_interrupts
    201.75 ±  6%  +4.5e+05%     912063 ±  2%  interrupts.CPU10.CAL:Function_call_interrupts
    403200            -9.5%     364987        interrupts.CPU10.LOC:Local_timer_interrupts
     60879 ± 10%     -67.7%      19675 ± 11%  interrupts.CPU10.RES:Rescheduling_interrupts
    195.00 ±  5%  +4.5e+05%     876511        interrupts.CPU11.CAL:Function_call_interrupts
    403082            -9.3%     365585        interrupts.CPU11.LOC:Local_timer_interrupts
     65942 ±  9%     -69.3%      20250 ±  4%  interrupts.CPU11.RES:Rescheduling_interrupts
     63.00 ± 18%     +59.9%     100.75 ± 20%  interrupts.CPU11.TLB:TLB_shootdowns
    222.75 ±  7%  +3.9e+05%     858038        interrupts.CPU12.CAL:Function_call_interrupts
    403047            -9.5%     364864        interrupts.CPU12.LOC:Local_timer_interrupts
     71543 ±  8%     -68.8%      22318 ±  8%  interrupts.CPU12.RES:Rescheduling_interrupts
    207.25 ±  6%  +4.2e+05%     873798 ±  2%  interrupts.CPU13.CAL:Function_call_interrupts
    402881            -9.5%     364551        interrupts.CPU13.LOC:Local_timer_interrupts
     11353 ±  2%     -39.1%       6910 ± 34%  interrupts.CPU13.NMI:Non-maskable_interrupts
     11353 ±  2%     -39.1%       6910 ± 34%  interrupts.CPU13.PMI:Performance_monitoring_interrupts
     77673 ± 16%     -73.0%      20942 ±  6%  interrupts.CPU13.RES:Rescheduling_interrupts
    481.50 ± 58%  +1.8e+05%     868322 ±  2%  interrupts.CPU14.CAL:Function_call_interrupts
    402975            -9.5%     364713        interrupts.CPU14.LOC:Local_timer_interrupts
     66362 ±  6%     -67.8%      21395 ±  9%  interrupts.CPU14.RES:Rescheduling_interrupts
    207.50 ±  7%  +4.2e+05%     881926        interrupts.CPU15.CAL:Function_call_interrupts
    403128            -9.1%     366525        interrupts.CPU15.LOC:Local_timer_interrupts
     58663 ±  4%     -68.7%      18340 ±  3%  interrupts.CPU15.RES:Rescheduling_interrupts
    409.00 ± 97%  +2.2e+05%     902382 ±  2%  interrupts.CPU2.CAL:Function_call_interrupts
    402903            -9.4%     364940        interrupts.CPU2.LOC:Local_timer_interrupts
     72090 ±  2%     -70.7%      21128 ±  4%  interrupts.CPU2.RES:Rescheduling_interrupts
    740.25 ± 43%  +1.2e+05%     871182 ±  3%  interrupts.CPU3.CAL:Function_call_interrupts
    402973            -9.3%     365372        interrupts.CPU3.LOC:Local_timer_interrupts
     61278 ±  8%     -65.9%      20895 ±  8%  interrupts.CPU3.RES:Rescheduling_interrupts
     73.75 ± 14%     +42.0%     104.75 ± 23%  interrupts.CPU3.TLB:TLB_shootdowns
    199.75 ±  7%  +4.4e+05%     875414 ±  4%  interrupts.CPU4.CAL:Function_call_interrupts
    402768            -9.4%     364982        interrupts.CPU4.LOC:Local_timer_interrupts
     65494 ±  8%     -68.9%      20383 ±  8%  interrupts.CPU4.RES:Rescheduling_interrupts
     67.75 ± 30%     +46.9%      99.50 ±  8%  interrupts.CPU4.TLB:TLB_shootdowns
    188.75 ±  5%  +4.7e+05%     888187        interrupts.CPU5.CAL:Function_call_interrupts
    402721            -9.5%     364443        interrupts.CPU5.LOC:Local_timer_interrupts
     73780 ±  4%     -72.2%      20527 ±  6%  interrupts.CPU5.RES:Rescheduling_interrupts
    199.25 ±  6%  +4.4e+05%     884019        interrupts.CPU6.CAL:Function_call_interrupts
    402790            -9.5%     364446        interrupts.CPU6.LOC:Local_timer_interrupts
     59000 ± 15%     -63.2%      21697 ±  6%  interrupts.CPU6.RES:Rescheduling_interrupts
    232.75 ± 17%  +3.8e+05%     886831 ±  3%  interrupts.CPU7.CAL:Function_call_interrupts
    402926            -9.1%     366310        interrupts.CPU7.LOC:Local_timer_interrupts
     52031 ±  9%     -64.0%      18753 ±  6%  interrupts.CPU7.RES:Rescheduling_interrupts
    223.25 ± 21%    +4e+05%     888773 ±  3%  interrupts.CPU8.CAL:Function_call_interrupts
    402910            -9.5%     364567        interrupts.CPU8.LOC:Local_timer_interrupts
     60836 ± 12%     -65.0%      21286 ± 11%  interrupts.CPU8.RES:Rescheduling_interrupts
    212.25 ± 10%  +4.2e+05%     885002 ±  3%  interrupts.CPU9.CAL:Function_call_interrupts
    402785            -9.4%     364751        interrupts.CPU9.LOC:Local_timer_interrupts
     62642 ±  8%     -68.3%      19875 ± 11%  interrupts.CPU9.RES:Rescheduling_interrupts
   6447051            -9.4%    5840463        interrupts.LOC:Local_timer_interrupts
   1018650           -68.3%     322891 ±  2%  interrupts.RES:Rescheduling_interrupts
      1.02 ± 23%      +0.2        1.21 ±  2%  perf-profile.calltrace.cycles-pp.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault.do_user_addr_fault
      0.45 ± 57%      +0.2        0.64        perf-profile.calltrace.cycles-pp.page_add_file_rmap.alloc_set_pte.filemap_map_pages.do_fault.__handle_mm_fault
      0.92 ± 22%      +0.2        1.10 ±  2%  perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.do_fault.__handle_mm_fault.handle_mm_fault
      1.26 ± 23%      +0.2        1.50        perf-profile.calltrace.cycles-pp.find_get_entry.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault
      0.81 ± 18%      +0.2        1.06 ±  3%  perf-profile.calltrace.cycles-pp.PageHuge.filemap_map_pages.do_fault.__handle_mm_fault.handle_mm_fault
      0.99 ± 22%      +0.3        1.25 ±  2%  perf-profile.calltrace.cycles-pp.xas_find.filemap_map_pages.do_fault.__handle_mm_fault.handle_mm_fault
      0.65 ± 57%      +0.3        0.95 ±  3%  perf-profile.calltrace.cycles-pp.up_read.do_user_addr_fault.page_fault
      1.53 ± 22%      +0.3        1.84        perf-profile.calltrace.cycles-pp.alloc_set_pte.filemap_map_pages.do_fault.__handle_mm_fault.handle_mm_fault
      0.53 ± 57%      +0.3        0.84 ±  2%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
      0.60 ± 57%      +0.3        0.92 ±  3%  perf-profile.calltrace.cycles-pp.xas_load.xas_find.filemap_map_pages.do_fault.__handle_mm_fault
      2.14 ± 22%      +0.5        2.62        perf-profile.calltrace.cycles-pp.__account_scheduler_latency.enqueue_entity.enqueue_task_fair.activate_task.ttwu_do_activate
      0.95 ± 57%      +0.5        1.49 ±  5%  perf-profile.calltrace.cycles-pp.down_read_trylock.do_user_addr_fault.page_fault
      2.45 ± 20%      +0.5        2.99        perf-profile.calltrace.cycles-pp.unlock_page.filemap_map_pages.do_fault.__handle_mm_fault.handle_mm_fault
      0.47 ±100%      +0.6        1.05 ±  6%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_kernel.secondary_startup_64
      0.47 ±100%      +0.6        1.05 ±  6%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_kernel.secondary_startup_64
      0.47 ±100%      +0.6        1.05 ±  6%  perf-profile.calltrace.cycles-pp.start_kernel.secondary_startup_64
      0.00            +0.8        0.77 ±  3%  perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.activate_task.ttwu_do_activate.sched_ttwu_pending
      0.00            +0.8        0.81 ±  3%  perf-profile.calltrace.cycles-pp.enqueue_task_fair.activate_task.ttwu_do_activate.sched_ttwu_pending.flush_smp_call_function_queue
      0.00            +0.8        0.81 ±  3%  perf-profile.calltrace.cycles-pp.ttwu_do_activate.sched_ttwu_pending.flush_smp_call_function_queue.smp_call_function_single_interrupt.call_function_single_interrupt
      0.00            +0.8        0.81 ±  3%  perf-profile.calltrace.cycles-pp.activate_task.ttwu_do_activate.sched_ttwu_pending.flush_smp_call_function_queue.smp_call_function_single_interrupt
      0.00            +0.9        0.88 ±  2%  perf-profile.calltrace.cycles-pp.sched_ttwu_pending.flush_smp_call_function_queue.smp_call_function_single_interrupt.call_function_single_interrupt.finish_task_switch
      0.00            +0.9        0.92 ±  3%  perf-profile.calltrace.cycles-pp.flush_smp_call_function_queue.smp_call_function_single_interrupt.call_function_single_interrupt.finish_task_switch.__sched_text_start
      0.00            +1.0        1.01 ±  2%  perf-profile.calltrace.cycles-pp.smp_call_function_single_interrupt.call_function_single_interrupt.finish_task_switch.__sched_text_start.schedule_idle
      0.00            +1.1        1.12 ±  2%  perf-profile.calltrace.cycles-pp.call_function_single_interrupt.finish_task_switch.__sched_text_start.schedule_idle.do_idle
      0.00            +1.1        1.14 ±  3%  perf-profile.calltrace.cycles-pp.finish_task_switch.__sched_text_start.schedule_idle.do_idle.cpu_startup_entry
      1.11 ± 21%      +1.2        2.30        perf-profile.calltrace.cycles-pp.__sched_text_start.schedule_idle.do_idle.cpu_startup_entry.start_secondary
      1.15 ± 21%      +1.2        2.34 ±  2%  perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     11.57 ± 20%      +2.1       13.63        perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     11.59 ± 20%      +2.1       13.68        perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
     11.59 ± 20%      +2.1       13.67        perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
     12.25 ± 21%      +2.5       14.73        perf-profile.calltrace.cycles-pp.secondary_startup_64
      0.10 ± 22%      +0.0        0.13 ±  6%  perf-profile.children.cycles-pp.irq_exit
      0.08 ± 22%      +0.0        0.12 ±  9%  perf-profile.children.cycles-pp.stack_access_ok
      0.02 ±173%      +0.0        0.07 ± 28%  perf-profile.children.cycles-pp.irq_work_interrupt
      0.02 ±173%      +0.0        0.07 ± 28%  perf-profile.children.cycles-pp.smp_irq_work_interrupt
      0.02 ±173%      +0.0        0.07 ± 28%  perf-profile.children.cycles-pp.printk
      0.22 ± 18%      +0.1        0.27        perf-profile.children.cycles-pp.wake_page_function
      0.03 ±105%      +0.1        0.09 ± 24%  perf-profile.children.cycles-pp.irq_work_run_list
      0.00            +0.1        0.05 ±  8%  perf-profile.children.cycles-pp.__x2apic_send_IPI_dest
      0.26 ± 15%      +0.1        0.31 ±  4%  perf-profile.children.cycles-pp.page_mapping
      0.16 ± 25%      +0.1        0.22 ±  8%  perf-profile.children.cycles-pp.__pagevec_lru_add_fn
      0.00            +0.1        0.06 ± 14%  perf-profile.children.cycles-pp.irq_enter
      0.19 ± 22%      +0.1        0.25 ±  7%  perf-profile.children.cycles-pp.__switch_to_asm
      0.10 ± 24%      +0.1        0.16 ±  9%  perf-profile.children.cycles-pp.in_sched_functions
      0.02 ±173%      +0.1        0.08 ± 27%  perf-profile.children.cycles-pp.irq_work_run
      0.00            +0.1        0.06 ± 17%  perf-profile.children.cycles-pp.in_lock_functions
      0.18 ± 21%      +0.1        0.25 ±  5%  perf-profile.children.cycles-pp.orc_find
      0.27 ± 23%      +0.1        0.34 ±  3%  perf-profile.children.cycles-pp.__switch_to
      0.32 ± 20%      +0.1        0.39 ±  3%  perf-profile.children.cycles-pp.__alloc_pages_nodemask
      0.35 ± 20%      +0.1        0.43 ±  3%  perf-profile.children.cycles-pp.alloc_pages_vma
      0.37 ± 20%      +0.1        0.45 ±  2%  perf-profile.children.cycles-pp.swapgs_restore_regs_and_return_to_usermode
      0.14 ± 20%      +0.1        0.22 ±  5%  perf-profile.children.cycles-pp.native_write_msr
      0.41 ± 21%      +0.1        0.49 ±  4%  perf-profile.children.cycles-pp.shmem_alloc_page
      0.00            +0.1        0.08 ± 10%  perf-profile.children.cycles-pp.llist_add_batch
      0.00            +0.1        0.09 ±  9%  perf-profile.children.cycles-pp.native_apic_msr_eoi_write
      0.12 ± 21%      +0.1        0.22 ±  8%  perf-profile.children.cycles-pp.kernel_text_address
      0.00            +0.1        0.11 ± 12%  perf-profile.children.cycles-pp.generic_exec_single
      0.16 ± 21%      +0.1        0.27 ±  6%  perf-profile.children.cycles-pp.__kernel_text_address
      0.00            +0.1        0.12 ± 11%  perf-profile.children.cycles-pp.smp_call_function_single_async
      0.62 ± 21%      +0.1        0.75 ±  4%  perf-profile.children.cycles-pp.___perf_sw_event
      0.26 ± 24%      +0.1        0.39 ±  4%  perf-profile.children.cycles-pp.__orc_find
      0.20 ± 22%      +0.2        0.35 ±  7%  perf-profile.children.cycles-pp.unwind_get_return_address
      0.08 ± 10%      +0.2        0.24 ±  4%  perf-profile.children.cycles-pp.tick_nohz_idle_enter
      0.77 ± 20%      +0.2        0.95 ±  3%  perf-profile.children.cycles-pp.up_read
      1.03 ± 23%      +0.2        1.21 ±  2%  perf-profile.children.cycles-pp.finish_fault
      0.98 ± 22%      +0.2        1.17        perf-profile.children.cycles-pp.page_add_file_rmap
      0.67 ± 17%      +0.2        0.90        perf-profile.children.cycles-pp.intel_idle
      1.26 ± 23%      +0.2        1.51 ±  2%  perf-profile.children.cycles-pp.find_get_entry
      0.94 ± 22%      +0.3        1.21        perf-profile.children.cycles-pp.xas_load
      1.02 ± 22%      +0.3        1.29 ±  2%  perf-profile.children.cycles-pp.xas_find
      0.00            +0.3        0.27 ±  7%  perf-profile.children.cycles-pp.ttwu_queue_wakelist
      0.99 ± 18%      +0.3        1.29 ±  3%  perf-profile.children.cycles-pp.PageHuge
      1.44 ± 20%      +0.4        1.81        perf-profile.children.cycles-pp.sync_regs
      1.12 ± 23%      +0.4        1.49 ±  5%  perf-profile.children.cycles-pp.down_read_trylock
      0.66 ± 44%      +0.4        1.05 ±  6%  perf-profile.children.cycles-pp.start_kernel
      1.06 ± 22%      +0.4        1.51        perf-profile.children.cycles-pp.unwind_next_frame
      2.02 ± 21%      +0.5        2.49        perf-profile.children.cycles-pp.native_irq_return_iret
      2.51 ± 22%      +0.5        3.03 ±  2%  perf-profile.children.cycles-pp.alloc_set_pte
      2.77 ± 20%      +0.6        3.39        perf-profile.children.cycles-pp.unlock_page
      1.65 ± 22%      +0.7        2.37        perf-profile.children.cycles-pp.arch_stack_walk
      1.80 ± 22%      +0.8        2.55        perf-profile.children.cycles-pp.stack_trace_save_tsk
      2.14 ± 22%      +0.9        3.05        perf-profile.children.cycles-pp.__account_scheduler_latency
      2.78 ± 22%      +1.0        3.75        perf-profile.children.cycles-pp.enqueue_entity
      2.93 ± 21%      +1.0        3.91        perf-profile.children.cycles-pp.activate_task
      2.93 ± 21%      +1.0        3.92        perf-profile.children.cycles-pp.ttwu_do_activate
      2.91 ± 21%      +1.0        3.89        perf-profile.children.cycles-pp.enqueue_task_fair
      1.22 ± 21%      +1.3        2.53 ±  2%  perf-profile.children.cycles-pp.schedule_idle
      0.24 ± 25%      +1.3        1.59 ±  2%  perf-profile.children.cycles-pp.finish_task_switch
      0.00            +1.4        1.43        perf-profile.children.cycles-pp.flush_smp_call_function_queue
      0.00            +1.5        1.50        perf-profile.children.cycles-pp.sched_ttwu_pending
      3.08 ± 22%      +1.6        4.65 ±  2%  perf-profile.children.cycles-pp.__sched_text_start
      0.00            +1.6        1.58        perf-profile.children.cycles-pp.smp_call_function_single_interrupt
      0.00            +1.7        1.74        perf-profile.children.cycles-pp.call_function_single_interrupt
     11.59 ± 20%      +2.1       13.68        perf-profile.children.cycles-pp.start_secondary
     12.25 ± 21%      +2.5       14.71        perf-profile.children.cycles-pp.do_idle
     12.25 ± 21%      +2.5       14.73        perf-profile.children.cycles-pp.secondary_startup_64
     12.25 ± 21%      +2.5       14.73        perf-profile.children.cycles-pp.cpu_startup_entry
      0.60 ± 23%      -0.2        0.35 ±  7%  perf-profile.self.cycles-pp.try_to_wake_up
      0.10 ± 28%      +0.0        0.12 ±  4%  perf-profile.self.cycles-pp.__pagevec_lru_add_fn
      0.11 ± 22%      +0.0        0.15 ±  5%  perf-profile.self.cycles-pp.__do_fault
      0.10 ± 19%      +0.0        0.14 ± 11%  perf-profile.self.cycles-pp.enqueue_entity
      0.04 ± 58%      +0.0        0.08 ± 17%  perf-profile.self.cycles-pp.arch_stack_walk
      0.22 ± 18%      +0.0        0.27        perf-profile.self.cycles-pp.wake_page_function
      0.23 ± 19%      +0.0        0.27 ±  3%  perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
      0.06 ± 58%      +0.0        0.11 ± 10%  perf-profile.self.cycles-pp.stack_access_ok
      0.04 ± 58%      +0.1        0.09 ±  4%  perf-profile.self.cycles-pp.kernel_text_address
      0.00            +0.1        0.05 ±  8%  perf-profile.self.cycles-pp.__kernel_text_address
      0.24 ± 15%      +0.1        0.30 ±  4%  perf-profile.self.cycles-pp.page_mapping
      0.19 ± 22%      +0.1        0.25 ±  7%  perf-profile.self.cycles-pp.__switch_to_asm
      0.18 ± 23%      +0.1        0.24 ±  7%  perf-profile.self.cycles-pp.orc_find
      0.01 ±173%      +0.1        0.08 ± 10%  perf-profile.self.cycles-pp.unwind_get_return_address
      0.26 ± 24%      +0.1        0.33 ±  2%  perf-profile.self.cycles-pp.__switch_to
      0.00            +0.1        0.08 ± 11%  perf-profile.self.cycles-pp.sched_ttwu_pending
      0.14 ± 20%      +0.1        0.22 ±  5%  perf-profile.self.cycles-pp.native_write_msr
      0.00            +0.1        0.08 ± 10%  perf-profile.self.cycles-pp.llist_add_batch
      0.00            +0.1        0.09 ±  9%  perf-profile.self.cycles-pp.native_apic_msr_eoi_write
      0.00            +0.1        0.09 ±  4%  perf-profile.self.cycles-pp.ttwu_queue_wakelist
      0.39 ± 22%      +0.1        0.48 ±  3%  perf-profile.self.cycles-pp.shmem_getpage_gfp
      0.48 ± 21%      +0.1        0.58 ±  4%  perf-profile.self.cycles-pp.___perf_sw_event
      0.34 ± 22%      +0.1        0.45 ±  9%  perf-profile.self.cycles-pp.do_user_addr_fault
      0.53 ± 19%      +0.1        0.65 ±  2%  perf-profile.self.cycles-pp.find_lock_entry
      0.26 ± 25%      +0.1        0.37 ±  8%  perf-profile.self.cycles-pp.__account_scheduler_latency
      0.26 ± 24%      +0.1        0.39 ±  4%  perf-profile.self.cycles-pp.__orc_find
      0.22 ± 18%      +0.2        0.38 ±  4%  perf-profile.self.cycles-pp.finish_task_switch
      0.76 ± 22%      +0.2        0.93        perf-profile.self.cycles-pp.page_add_file_rmap
      0.77 ± 20%      +0.2        0.95 ±  3%  perf-profile.self.cycles-pp.up_read
      0.51 ± 20%      +0.2        0.72 ±  3%  perf-profile.self.cycles-pp.unwind_next_frame
      0.75 ± 22%      +0.2        0.96 ±  2%  perf-profile.self.cycles-pp.xas_load
      0.67 ± 17%      +0.2        0.90        perf-profile.self.cycles-pp.intel_idle
      1.11 ± 22%      +0.3        1.41 ±  4%  perf-profile.self.cycles-pp.__handle_mm_fault
      0.98 ± 18%      +0.3        1.29 ±  3%  perf-profile.self.cycles-pp.PageHuge
      1.43 ± 20%      +0.4        1.79        perf-profile.self.cycles-pp.sync_regs
      1.11 ± 23%      +0.4        1.48 ±  5%  perf-profile.self.cycles-pp.down_read_trylock
      2.02 ± 21%      +0.5        2.49        perf-profile.self.cycles-pp.native_irq_return_iret
      2.72 ± 20%      +0.6        3.33        perf-profile.self.cycles-pp.unlock_page
     18.15 ± 20%      +4.4       22.52        perf-profile.self.cycles-pp.filemap_map_pages
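
Reading the table above: each row gives the value on the parent commit, the change, the value on 2ebb177175, and the metric name, with an optional "± N%" after either value. For counters (perf-stat, interrupts) the change column appears to be a relative percentage. For perf-profile rows it appears to be an absolute difference in percentage points. The Python sketch below is not part of the lkp tooling. The regular expression and the helper names parse_row and percent_change are hypothetical, and the column layout is assumed from the rows shown here. It recomputes the relative change so a row can be sanity-checked by hand.

```python
import re

# Assumed row layout: base value, optional "± N%", change
# ("+3.2%" for counters, "+0.2" for perf-profile rows),
# new value, optional "± N%", metric name.
ROW = re.compile(
    r"^\s*(?P<base>[\d.e+]+)(?:\s*±\s*\d+%)?"
    r"\s+(?P<change>[+-][\d.e+]+%?)"
    r"\s+(?P<new>[\d.e+]+)(?:\s*±\s*\d+%)?"
    r"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line):
    """Split one comparison row into its fields; return None if it does not match."""
    m = ROW.match(line)
    if m is None:
        return None
    return {
        "metric": m["metric"],
        "base": float(m["base"]),
        "new": float(m["new"]),
        "change": m["change"],
        # True when the change column is a relative percentage.
        "is_percent": m["change"].endswith("%"),
    }


def percent_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


if __name__ == "__main__":
    line = ("      4397 ±  7%  +3.2e+05%   14161297 ±  2%  "
            "interrupts.CAL:Function_call_interrupts")
    row = parse_row(line)
    print(row["metric"], percent_change(row["base"], row["new"]))
```

Applied to the interrupts.CAL:Function_call_interrupts row, the recomputed change agrees with the reported +3.2e+05% to within rounding. Rows whose base value is 0.00 (the new sched_ttwu_pending call chains, for example) have no defined relative change, which is consistent with those rows reporting an absolute difference instead.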





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.7.0-rc6-00031-g2ebb177175506" of type "text/plain" (202616 bytes)

View attachment "job-script" of type "text/plain" (7533 bytes)

View attachment "job.yaml" of type "text/plain" (5131 bytes)

View attachment "reproduce" of type "text/plain" (344 bytes)
