Message-ID: <20191006121901.GK17687@shao2-debian>
Date:   Sun, 6 Oct 2019 20:19:01 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Waiman Long <longman@...hat.com>
Cc:     Ingo Molnar <mingo@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Arnd Bergmann <arnd@...db.de>, Borislav Petkov <bp@...en8.de>,
        Davidlohr Bueso <dave@...olabs.net>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Will Deacon <will.deacon@....com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [locking/rwsem] 364f784f04: will-it-scale.per_thread_ops -8.6% regression

Greetings,

FYI, we noticed an 8.6% regression in will-it-scale.per_thread_ops due to commit:


commit: 364f784f048c984721986db90c95ca8350213c91 ("locking/rwsem: Optimize rwsem structure for uncontended lock acquisition")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
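
For context, the patch's stated aim is to rearrange struct rw_semaphore so the fields written on an uncontended acquisition sit together on one cache line. The sketch below only illustrates that layout idea with a hypothetical type; it is not the kernel's actual definition:

    /*
     * Hypothetical layout sketch (toy_rwsem is illustrative, not the
     * kernel's struct rw_semaphore).  The fields dirtied by every
     * down_read()/up_read() are grouped on one 64-byte cache line;
     * the bookkeeping needed only under contention is pushed onto
     * the next line.
     */
    #include <stdatomic.h>

    struct toy_rwsem {
            /* hot: written on every acquire/release */
            _Atomic long count;          /* reader count / writer bit */
            _Atomic long owner;          /* current owner, for spinning */

            /* cold: touched only once waiters exist */
            _Alignas(64) long wait_lock; /* stand-in for a spinlock */
            void *wait_list;             /* stand-in for the waiter queue */
    };

Note that such a reshuffle also changes the size and field offsets of every structure embedding the semaphore (e.g. mmap_sem inside mm_struct), so a regression can surface in code far from the rwsem paths themselves; the profile below shows the extra cycles landing in handle_mm_fault/copy_page rather than in up_read.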

in testcase: will-it-scale
on test machine: 192-thread Intel(R) Xeon(R) CPU @ 2.20GHz with 192G memory
with the following parameters:

	nr_task: 100%
	mode: thread
	test: page_fault2
	cpufreq_governor: performance

test-description: Will It Scale takes a testcase and runs it in 1 through n parallel copies to see whether the testcase scales. It builds both a process-based and a thread-based variant of each test to expose any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
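
For orientation, the page_fault2 worker is essentially a tight fault-generation loop. The stand-in below sketches that shape against plain anonymous memory (an assumption made for brevity; the real testcase lives in the repository above and, per the profile below, faults on a shmem-backed mapping):

    /*
     * Minimal page_fault2-style stand-in (illustrative only; see the
     * test-url above for the real source).  Writing one byte per page
     * forces a minor fault per page, so ops/sec tracks page-fault
     * throughput through handle_mm_fault and mmap_sem.
     */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/mman.h>

    #define MAPLEN (128UL << 20)                /* 128 MiB per pass */

    int main(void)
    {
            long page = sysconf(_SC_PAGESIZE);
            unsigned long faults = 0;

            for (int pass = 0; pass < 16; pass++) {
                    char *p = mmap(NULL, MAPLEN, PROT_READ | PROT_WRITE,
                                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
                    if (p == MAP_FAILED)
                            return 1;
                    for (unsigned long off = 0; off < MAPLEN; off += page)
                            p[off] = 1;         /* one minor fault per page */
                    faults += MAPLEN / page;
                    munmap(p, MAPLEN);
            }
            printf("%lu minor faults\n", faults);
            return 0;
    }

will-it-scale runs one such worker per task (192 here, given nr_task at 100% on this machine) and reports the aggregate per-thread rate.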



If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>


Details are as follows:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # install dependencies; the job file is attached to this email
        bin/lkp run     job.yaml  # run the attached job

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.6/thread/100%/debian-x86_64-2019-05-14.cgz/lkp-csl-2ap1/page_fault2/will-it-scale

commit: 
  a8654596f0 ("locking/rwsem: Enable lock event counting")
  364f784f04 ("locking/rwsem: Optimize rwsem structure for uncontended lock acquisition")

a8654596f0371c26 364f784f048c984721986db90c9 
---------------- --------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
          1:4           -2%           1:4     perf-profile.children.cycles-pp.error_entry
         %stddev     %change         %stddev
             \          |                \  
     11003 ±  2%      -8.6%      10053        will-it-scale.per_thread_ops
     70653 ±  2%     +11.0%      78446 ±  4%  will-it-scale.time.involuntary_context_switches
 6.818e+08 ±  4%     -10.2%  6.124e+08 ±  3%  will-it-scale.time.minor_page_faults
     10405            +7.5%      11184        will-it-scale.time.percent_of_cpu_this_job_got
    241.21 ±  6%     -19.2%     194.93 ±  3%  will-it-scale.time.user_time
   5070481 ±  3%      -5.2%    4807787 ±  3%  will-it-scale.time.voluntary_context_switches
   2112773 ±  2%      -8.6%    1930300        will-it-scale.workload
      6197            +0.7%       6239        boot-time.idle
 1.922e+08 ±  5%     -10.7%  1.716e+08 ±  2%  numa-numastat.node1.local_node
 1.922e+08 ±  5%     -10.7%  1.716e+08 ±  2%  numa-numastat.node1.numa_hit
  98587680 ±  5%     -10.8%   87968117 ±  2%  numa-vmstat.node1.numa_hit
  98470865 ±  5%     -10.8%   87856550 ±  2%  numa-vmstat.node1.numa_local
     45.50            -4.0       41.51        mpstat.cpu.all.idle%
      0.00 ± 15%      +0.0        0.00 ±  7%  mpstat.cpu.all.soft%
      0.41 ±  5%      -0.1        0.34        mpstat.cpu.all.usr%
     45.25            -8.8%      41.25        vmstat.cpu.id
     53.25            +8.0%      57.50        vmstat.cpu.sy
     32467            -3.3%      31402        vmstat.system.cs
      2321           +34.0%       3112        slabinfo.task_struct.active_objs
      2335           -66.7%     778.00        slabinfo.task_struct.active_slabs
      2335           +33.4%       3114        slabinfo.task_struct.num_objs
      2335           -66.7%     778.00        slabinfo.task_struct.num_slabs
      1611            +7.3%       1729        turbostat.Avg_MHz
     43.82 ±  2%      -8.2%      40.23        turbostat.CPU%c1
      0.45 ± 49%     -59.8%       0.18 ±  6%  turbostat.Pkg%pc2
    358.05            +1.8%     364.46        turbostat.PkgWatt
     67050            -5.3%      63472        proc-vmstat.nr_slab_unreclaimable
 6.908e+08 ±  4%     -10.1%  6.212e+08 ±  2%  proc-vmstat.numa_hit
 6.907e+08 ±  4%     -10.1%  6.211e+08 ±  2%  proc-vmstat.numa_local
 6.909e+08 ±  4%     -10.1%  6.214e+08 ±  2%  proc-vmstat.pgalloc_normal
 6.829e+08 ±  4%     -10.2%  6.135e+08 ±  3%  proc-vmstat.pgfault
 6.907e+08 ±  4%     -10.1%  6.212e+08 ±  3%  proc-vmstat.pgfree
     36167 ±  4%      -7.9%      33325        softirqs.CPU0.SCHED
     21808 ±  5%      -8.2%      20028        softirqs.CPU101.RCU
     22463 ±  3%     -13.7%      19393 ±  3%  softirqs.CPU105.RCU
     21973 ±  4%      -8.6%      20082 ±  4%  softirqs.CPU106.RCU
     22115 ±  3%     -10.4%      19808 ±  2%  softirqs.CPU107.RCU
     32888 ±  3%      -8.8%      29994 ±  2%  softirqs.CPU11.SCHED
     32399 ±  3%      -9.9%      29180 ±  2%  softirqs.CPU110.SCHED
     23350 ±  8%     -10.8%      20817 ±  2%  softirqs.CPU2.RCU
     20532 ±  6%     +12.3%      23062 ±  4%  softirqs.CPU51.RCU
     32903 ±  3%      -8.8%      30006 ±  2%  softirqs.CPU89.SCHED
     22179 ±  5%      -8.5%      20296 ±  4%  softirqs.CPU97.RCU
 2.357e+09 ±  2%      -8.5%  2.157e+09        perf-stat.i.branch-instructions
     68.70            +2.7       71.37        perf-stat.i.cache-miss-rate%
 2.068e+08           -11.2%  1.837e+08 ±  4%  perf-stat.i.cache-misses
 2.999e+08           -14.5%  2.564e+08 ±  5%  perf-stat.i.cache-references
     32849            -3.2%      31781        perf-stat.i.context-switches
     25.79 ±  3%     +16.6%      30.09 ±  2%  perf-stat.i.cpi
 3.061e+11            +7.2%  3.281e+11        perf-stat.i.cpu-cycles
      1472 ±  2%     +21.5%       1789 ±  5%  perf-stat.i.cycles-between-cache-misses
 3.164e+09 ±  2%      -8.0%  2.912e+09        perf-stat.i.dTLB-loads
      1.14            -0.0        1.12        perf-stat.i.dTLB-store-miss-rate%
  18451516 ±  2%     -10.2%   16572608        perf-stat.i.dTLB-store-misses
 1.582e+09            -8.3%   1.45e+09        perf-stat.i.dTLB-stores
  1.18e+10 ±  2%      -8.3%  1.082e+10        perf-stat.i.instructions
      2303 ±  3%      -7.5%       2131        perf-stat.i.instructions-per-iTLB-miss
      0.04 ±  3%      -9.0%       0.04 ±  4%  perf-stat.i.ipc
   2090997 ±  2%      -8.8%    1906802        perf-stat.i.minor-faults
  62234770 ±  2%     -16.8%   51765588 ±  2%  perf-stat.i.node-loads
     25.30 ±  4%      -6.6       18.65 ±  2%  perf-stat.i.node-store-miss-rate%
   3416019 ±  2%     -28.7%    2434638 ±  2%  perf-stat.i.node-store-misses
  10406782 ±  2%      +7.8%   11220238        perf-stat.i.node-stores
   2090978 ±  2%      -8.8%    1906807        perf-stat.i.page-faults
     68.96            +2.8       71.71        perf-stat.overall.cache-miss-rate%
     25.96 ±  3%     +16.8%      30.32 ±  2%  perf-stat.overall.cpi
      1480 ±  2%     +20.8%       1789 ±  3%  perf-stat.overall.cycles-between-cache-misses
      1.15            -0.0        1.13        perf-stat.overall.dTLB-store-miss-rate%
      2300 ±  3%      -7.3%       2131        perf-stat.overall.instructions-per-iTLB-miss
      0.04 ±  3%     -14.5%       0.03 ±  2%  perf-stat.overall.ipc
     24.72 ±  3%      -6.9       17.83 ±  2%  perf-stat.overall.node-store-miss-rate%
  2.35e+09 ±  2%      -8.5%  2.151e+09        perf-stat.ps.branch-instructions
 2.062e+08           -11.2%  1.831e+08 ±  4%  perf-stat.ps.cache-misses
  2.99e+08           -14.5%  2.556e+08 ±  5%  perf-stat.ps.cache-references
     32745            -3.2%      31682        perf-stat.ps.context-switches
 3.052e+11            +7.2%  3.271e+11        perf-stat.ps.cpu-cycles
 3.155e+09 ±  2%      -8.0%  2.903e+09        perf-stat.ps.dTLB-loads
  18398373 ±  2%     -10.2%   16523779        perf-stat.ps.dTLB-store-misses
 1.577e+09            -8.3%  1.446e+09        perf-stat.ps.dTLB-stores
 1.177e+10 ±  2%      -8.3%  1.079e+10        perf-stat.ps.instructions
   2084539 ±  2%      -8.8%    1900793        perf-stat.ps.minor-faults
  62045336 ±  2%     -16.8%   51607027 ±  2%  perf-stat.ps.node-loads
   3405719 ±  2%     -28.7%    2427181 ±  2%  perf-stat.ps.node-store-misses
  10375665 ±  2%      +7.8%   11186150        perf-stat.ps.node-stores
   2084578 ±  2%      -8.8%    1900864        perf-stat.ps.page-faults
 3.811e+12 ±  4%     -10.1%  3.427e+12 ±  3%  perf-stat.total.instructions
    470.25 ±  3%     -14.7%     401.00 ±  8%  interrupts.CPU126.RES:Rescheduling_interrupts
    454.50 ±  4%     -15.3%     384.75 ± 10%  interrupts.CPU127.RES:Rescheduling_interrupts
    480.00 ±  6%     -17.1%     398.00 ± 10%  interrupts.CPU136.RES:Rescheduling_interrupts
    474.50 ±  8%     -17.4%     391.75 ±  5%  interrupts.CPU138.RES:Rescheduling_interrupts
    493.50 ±  6%     -14.8%     420.25 ±  6%  interrupts.CPU143.RES:Rescheduling_interrupts
    554.00           -16.1%     464.75 ±  3%  interrupts.CPU144.RES:Rescheduling_interrupts
    476.25 ±  8%     -16.0%     400.00 ±  4%  interrupts.CPU148.RES:Rescheduling_interrupts
    461.75 ±  5%     -14.5%     395.00 ± 11%  interrupts.CPU149.RES:Rescheduling_interrupts
    448.75 ±  4%     -12.5%     392.50 ± 10%  interrupts.CPU150.RES:Rescheduling_interrupts
    497.50 ±  9%     -16.2%     417.00 ±  7%  interrupts.CPU151.RES:Rescheduling_interrupts
    476.25 ±  5%     -14.8%     406.00 ± 12%  interrupts.CPU153.RES:Rescheduling_interrupts
    502.00 ±  5%     -18.8%     407.75 ±  5%  interrupts.CPU154.RES:Rescheduling_interrupts
    486.75 ±  3%     -14.5%     416.25 ±  8%  interrupts.CPU156.RES:Rescheduling_interrupts
    468.75 ±  6%     -14.5%     400.75 ±  8%  interrupts.CPU161.RES:Rescheduling_interrupts
    465.25 ±  7%     -13.1%     404.50 ±  5%  interrupts.CPU165.RES:Rescheduling_interrupts
    477.75 ±  8%     -13.6%     413.00 ±  6%  interrupts.CPU166.RES:Rescheduling_interrupts
    593.00 ± 16%     -25.8%     440.25 ±  8%  interrupts.CPU167.RES:Rescheduling_interrupts
    569.25 ± 10%     -14.1%     489.25 ±  4%  interrupts.CPU168.RES:Rescheduling_interrupts
    496.75 ±  4%     -14.5%     424.50 ±  9%  interrupts.CPU175.RES:Rescheduling_interrupts
    489.50 ± 10%     -20.4%     389.75 ±  4%  interrupts.CPU176.RES:Rescheduling_interrupts
    499.50 ± 10%     -24.6%     376.75 ± 13%  interrupts.CPU177.RES:Rescheduling_interrupts
    509.75 ± 11%     -19.4%     410.75 ±  4%  interrupts.CPU179.RES:Rescheduling_interrupts
    476.50 ±  8%     -17.2%     394.50 ± 11%  interrupts.CPU180.RES:Rescheduling_interrupts
    489.75 ± 10%     -21.7%     383.25 ± 11%  interrupts.CPU182.RES:Rescheduling_interrupts
    502.00 ± 11%     -16.7%     418.25 ± 10%  interrupts.CPU191.RES:Rescheduling_interrupts
    464.00           -12.4%     406.25 ±  6%  interrupts.CPU28.RES:Rescheduling_interrupts
    461.50 ±  5%     -11.3%     409.50 ±  5%  interrupts.CPU45.RES:Rescheduling_interrupts
    564.00 ±  3%     -15.7%     475.50 ±  8%  interrupts.CPU48.RES:Rescheduling_interrupts
    491.00 ±  7%     -14.6%     419.50 ±  4%  interrupts.CPU49.RES:Rescheduling_interrupts
    465.75 ±  6%     -14.8%     396.75 ±  8%  interrupts.CPU53.RES:Rescheduling_interrupts
    498.75 ±  7%     -21.0%     394.00 ±  6%  interrupts.CPU54.RES:Rescheduling_interrupts
    474.00 ±  3%     -12.7%     414.00 ±  8%  interrupts.CPU57.RES:Rescheduling_interrupts
    478.00 ±  7%     -19.9%     382.75 ±  5%  interrupts.CPU59.RES:Rescheduling_interrupts
    463.50 ±  5%     -16.1%     388.75 ± 10%  interrupts.CPU61.RES:Rescheduling_interrupts
    468.00 ±  9%     -20.2%     373.25 ±  9%  interrupts.CPU62.RES:Rescheduling_interrupts
    465.75 ±  4%     -15.5%     393.50 ±  9%  interrupts.CPU64.RES:Rescheduling_interrupts
      3346 ± 33%     +44.2%       4826        interrupts.CPU65.NMI:Non-maskable_interrupts
      3346 ± 33%     +44.2%       4826        interrupts.CPU65.PMI:Performance_monitoring_interrupts
    460.25 ±  5%     -13.5%     398.00 ±  3%  interrupts.CPU65.RES:Rescheduling_interrupts
      3339 ± 33%     +45.0%       4843        interrupts.CPU66.NMI:Non-maskable_interrupts
      3339 ± 33%     +45.0%       4843        interrupts.CPU66.PMI:Performance_monitoring_interrupts
    592.50 ± 31%     -30.4%     412.50 ±  6%  interrupts.CPU81.RES:Rescheduling_interrupts
    483.50 ±  9%     -16.3%     404.50 ±  9%  interrupts.CPU94.RES:Rescheduling_interrupts
     92682 ±  6%      -7.0%      86177 ±  5%  interrupts.RES:Rescheduling_interrupts
     82350          -100.0%       0.00        sched_debug.cfs_rq:/.exec_clock.avg
     83827          -100.0%       0.00        sched_debug.cfs_rq:/.exec_clock.max
     79902          -100.0%       0.00        sched_debug.cfs_rq:/.exec_clock.min
    595.00 ±  9%    -100.0%       0.00        sched_debug.cfs_rq:/.exec_clock.stddev
      2512 ± 13%     -38.9%       1535 ± 26%  sched_debug.cfs_rq:/.load.avg
    233913 ± 32%     -74.7%      59149 ±126%  sched_debug.cfs_rq:/.load.max
     18849 ± 28%     -65.2%       6566 ± 79%  sched_debug.cfs_rq:/.load.stddev
   9703608 ±  3%     +14.9%   11145876        sched_debug.cfs_rq:/.min_vruntime.avg
   9803403 ±  3%     +14.5%   11221751        sched_debug.cfs_rq:/.min_vruntime.max
   9410664 ±  3%     +15.4%   10861526        sched_debug.cfs_rq:/.min_vruntime.min
     67038 ±  5%     -24.0%      50961 ± 10%  sched_debug.cfs_rq:/.min_vruntime.stddev
      0.49 ± 12%    -100.0%       0.00        sched_debug.cfs_rq:/.nr_spread_over.avg
      8.71 ± 19%    -100.0%       0.00        sched_debug.cfs_rq:/.nr_spread_over.max
      1.07 ±  9%    -100.0%       0.00        sched_debug.cfs_rq:/.nr_spread_over.stddev
      1.83 ± 18%     -47.5%       0.96 ± 38%  sched_debug.cfs_rq:/.runnable_load_avg.avg
    224.29 ± 32%     -76.4%      53.04 ±136%  sched_debug.cfs_rq:/.runnable_load_avg.max
     17.14 ± 30%     -70.5%       5.06 ±102%  sched_debug.cfs_rq:/.runnable_load_avg.stddev
      2483 ± 12%     -39.1%       1512 ± 26%  sched_debug.cfs_rq:/.runnable_weight.avg
    231331 ± 32%     -75.3%      57065 ±131%  sched_debug.cfs_rq:/.runnable_weight.max
     18727 ± 28%     -65.5%       6464 ± 80%  sched_debug.cfs_rq:/.runnable_weight.stddev
     67046 ±  5%     -24.1%      50879 ± 11%  sched_debug.cfs_rq:/.spread0.stddev
     99583 ±  6%     +18.9%     118389 ±  3%  sched_debug.cpu.avg_idle.stddev
      1.92 ± 18%     -47.1%       1.02 ± 34%  sched_debug.cpu.cpu_load[0].avg
    224.62 ± 33%     -76.0%      53.88 ±133%  sched_debug.cpu.cpu_load[0].max
     17.17 ± 30%     -70.5%       5.07 ±100%  sched_debug.cpu.cpu_load[0].stddev
      2.10 ± 16%     -43.1%       1.20 ± 27%  sched_debug.cpu.cpu_load[1].avg
    224.00 ± 32%     -76.1%      53.54 ±134%  sched_debug.cpu.cpu_load[1].max
     17.04 ± 30%     -71.0%       4.94 ±103%  sched_debug.cpu.cpu_load[1].stddev
      2.29 ± 14%     -41.0%       1.35 ± 25%  sched_debug.cpu.cpu_load[2].avg
    223.12 ± 32%     -75.9%      53.75 ±134%  sched_debug.cpu.cpu_load[2].max
     17.01 ± 30%     -71.0%       4.94 ±103%  sched_debug.cpu.cpu_load[2].stddev
      2.39 ± 12%     -40.4%       1.43 ± 24%  sched_debug.cpu.cpu_load[3].avg
    223.88 ± 31%     -75.0%      55.92 ±128%  sched_debug.cpu.cpu_load[3].max
     17.05 ± 29%     -70.1%       5.11 ± 98%  sched_debug.cpu.cpu_load[3].stddev
      2.42 ± 11%     -37.3%       1.52 ± 24%  sched_debug.cpu.cpu_load[4].avg
    231.38 ± 28%     -66.6%      77.29 ± 91%  sched_debug.cpu.cpu_load[4].max
     17.50 ± 27%     -63.2%       6.45 ± 77%  sched_debug.cpu.cpu_load[4].stddev
      2464 ± 13%     -40.5%       1467 ± 25%  sched_debug.cpu.load.avg
    233628 ± 32%     -74.8%      58842 ±127%  sched_debug.cpu.load.max
     18747 ± 28%     -65.7%       6421 ± 80%  sched_debug.cpu.load.stddev
      1862 ±  7%     +10.3%       2055 ±  5%  sched_debug.cpu.nr_switches.stddev
     25907          -100.0%       0.00        sched_debug.cpu.sched_count.avg
     35456 ±  2%    -100.0%       0.00        sched_debug.cpu.sched_count.max
     24844          -100.0%       0.00        sched_debug.cpu.sched_count.min
      1410 ± 10%    -100.0%       0.00        sched_debug.cpu.sched_count.stddev
     12738          -100.0%       0.00        sched_debug.cpu.sched_goidle.avg
     17314          -100.0%       0.00        sched_debug.cpu.sched_goidle.max
     12190          -100.0%       0.00        sched_debug.cpu.sched_goidle.min
    670.77 ±  6%    -100.0%       0.00        sched_debug.cpu.sched_goidle.stddev
     12930          -100.0%       0.00        sched_debug.cpu.ttwu_count.avg
     19266 ±  4%    -100.0%       0.00        sched_debug.cpu.ttwu_count.max
      7000 ±  2%    -100.0%       0.00        sched_debug.cpu.ttwu_count.min
      2676 ±  7%    -100.0%       0.00        sched_debug.cpu.ttwu_count.stddev
    324.30          -100.0%       0.00        sched_debug.cpu.ttwu_local.avg
      1152 ±  8%    -100.0%       0.00        sched_debug.cpu.ttwu_local.max
    220.62          -100.0%       0.00        sched_debug.cpu.ttwu_local.min
     97.33 ±  7%    -100.0%       0.00        sched_debug.cpu.ttwu_local.stddev
      0.01           -75.0%       0.00 ±173%  sched_debug.rt_rq:/.rt_nr_migratory.stddev
      0.01           -75.0%       0.00 ±173%  sched_debug.rt_rq:/.rt_nr_running.stddev
     94.86            -3.3       91.51        perf-profile.calltrace.cycles-pp.page_fault.testcase
     93.91            -3.0       90.86        perf-profile.calltrace.cycles-pp.__do_page_fault.do_page_fault.page_fault.testcase
     94.06            -3.0       91.02        perf-profile.calltrace.cycles-pp.do_page_fault.page_fault.testcase
     53.17            -2.6       50.59        perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
      4.69 ±  4%      -1.3        3.34 ±  3%  perf-profile.calltrace.cycles-pp.put_page.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
      9.96 ±  2%      -1.0        8.99        perf-profile.calltrace.cycles-pp.up_read.__do_page_fault.do_page_fault.page_fault.testcase
      8.49            -0.9        7.57        perf-profile.calltrace.cycles-pp.finish_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
      4.38 ±  3%      -0.7        3.71 ±  2%  perf-profile.calltrace.cycles-pp.__lru_cache_add.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault
      2.99 ±  4%      -0.6        2.38 ±  2%  perf-profile.calltrace.cycles-pp.secondary_startup_64
      2.98 ±  4%      -0.6        2.36 ±  2%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
      2.98 ±  4%      -0.6        2.37 ±  2%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
      2.98 ±  4%      -0.6        2.37 ±  2%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
      2.53 ±  5%      -0.5        2.02 ±  3%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
      3.22 ±  4%      -0.5        2.75 ±  3%  perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.__lru_cache_add.alloc_set_pte.finish_fault.__handle_mm_fault
      2.04 ±  6%      -0.4        1.60 ±  3%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
      2.08 ±  6%      -0.4        1.64 ±  4%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.pagevec_lru_move_fn.__lru_cache_add.alloc_set_pte.finish_fault
      1.96 ±  6%      -0.4        1.54 ±  4%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.pagevec_lru_move_fn.__lru_cache_add.alloc_set_pte
      1.45            -0.3        1.16        perf-profile.calltrace.cycles-pp.unlock_page.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
      0.67 ±  3%      -0.1        0.57        perf-profile.calltrace.cycles-pp.lru_cache_add_active_or_unevictable.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault
      0.83            -0.1        0.76 ±  2%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.munmap
      0.83            -0.1        0.76 ±  2%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
      0.83            -0.1        0.76 ±  2%  perf-profile.calltrace.cycles-pp.munmap
      0.83            -0.1        0.76 ±  2%  perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
      0.83            -0.1        0.76 ±  2%  perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
      0.82            -0.1        0.75 ±  3%  perf-profile.calltrace.cycles-pp.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.78            -0.1        0.73 ±  2%  perf-profile.calltrace.cycles-pp.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
      0.74            -0.1        0.69 ±  2%  perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.__do_munmap.__vm_munmap
      0.75            -0.1        0.70 ±  3%  perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
      0.87            -0.0        0.82        perf-profile.calltrace.cycles-pp.__pagevec_lru_add_fn.pagevec_lru_move_fn.__lru_cache_add.alloc_set_pte.finish_fault
      4.00            +0.2        4.17        perf-profile.calltrace.cycles-pp.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault.__handle_mm_fault
      4.03            +0.2        4.21        perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.__handle_mm_fault.handle_mm_fault
     95.80            +0.5       96.28        perf-profile.calltrace.cycles-pp.testcase
     14.11            +1.7       15.76        perf-profile.calltrace.cycles-pp.copy_page.copy_user_highpage.__handle_mm_fault.handle_mm_fault.__do_page_fault
     14.42            +1.7       16.16        perf-profile.calltrace.cycles-pp.copy_user_highpage.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
      0.25 ±173%      +4.0        4.24 ±  8%  perf-profile.calltrace.cycles-pp.handle_mm_fault.testcase
     94.62            -3.1       91.53        perf-profile.children.cycles-pp.page_fault
     93.93            -3.0       90.88        perf-profile.children.cycles-pp.__do_page_fault
     94.08            -3.0       91.03        perf-profile.children.cycles-pp.do_page_fault
     53.30            -2.6       50.69        perf-profile.children.cycles-pp.__handle_mm_fault
      4.71 ±  4%      -1.3        3.37 ±  3%  perf-profile.children.cycles-pp.put_page
     10.07 ±  2%      -1.0        9.11        perf-profile.children.cycles-pp.up_read
      8.49            -0.9        7.58        perf-profile.children.cycles-pp.finish_fault
      4.40 ±  3%      -0.7        3.71 ±  2%  perf-profile.children.cycles-pp.__lru_cache_add
      2.99 ±  4%      -0.6        2.38 ±  2%  perf-profile.children.cycles-pp.secondary_startup_64
      2.99 ±  4%      -0.6        2.38 ±  2%  perf-profile.children.cycles-pp.cpu_startup_entry
      2.99 ±  4%      -0.6        2.38 ±  2%  perf-profile.children.cycles-pp.do_idle
      2.98 ±  4%      -0.6        2.37 ±  2%  perf-profile.children.cycles-pp.start_secondary
      2.58 ±  4%      -0.5        2.07 ±  3%  perf-profile.children.cycles-pp.cpuidle_enter_state
      3.23 ±  4%      -0.5        2.75 ±  3%  perf-profile.children.cycles-pp.pagevec_lru_move_fn
      2.17 ±  6%      -0.5        1.71 ±  4%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
      2.05 ±  6%      -0.4        1.60 ±  3%  perf-profile.children.cycles-pp.intel_idle
      1.46            -0.3        1.18        perf-profile.children.cycles-pp.unlock_page
      1.73 ±  3%      -0.3        1.47 ±  6%  perf-profile.children.cycles-pp.smp_apic_timer_interrupt
      1.46 ±  3%      -0.2        1.23 ±  8%  perf-profile.children.cycles-pp.hrtimer_interrupt
      1.12 ±  2%      -0.2        0.91 ±  7%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.90            -0.2        0.69 ±  9%  perf-profile.children.cycles-pp.tick_sched_timer
      0.85 ±  2%      -0.2        0.64 ±  9%  perf-profile.children.cycles-pp.update_process_times
      0.86 ±  2%      -0.2        0.65 ±  9%  perf-profile.children.cycles-pp.tick_sched_handle
      0.73 ±  4%      -0.2        0.53 ±  7%  perf-profile.children.cycles-pp.scheduler_tick
      0.67 ±  5%      -0.2        0.48 ±  7%  perf-profile.children.cycles-pp.task_tick_fair
      0.57 ±  7%      -0.1        0.45        perf-profile.children.cycles-pp.native_irq_return_iret
      0.68 ±  2%      -0.1        0.58 ±  2%  perf-profile.children.cycles-pp.lru_cache_add_active_or_unevictable
      0.28 ±  8%      -0.1        0.20 ±  4%  perf-profile.children.cycles-pp.update_cfs_group
      0.92            -0.1        0.85 ±  3%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.92            -0.1        0.85 ±  3%  perf-profile.children.cycles-pp.do_syscall_64
      0.82            -0.1        0.75 ±  3%  perf-profile.children.cycles-pp.__do_munmap
      0.83            -0.1        0.76 ±  2%  perf-profile.children.cycles-pp.munmap
      0.83            -0.1        0.76 ±  2%  perf-profile.children.cycles-pp.__vm_munmap
      0.83            -0.1        0.76 ±  2%  perf-profile.children.cycles-pp.__x64_sys_munmap
      0.78            -0.1        0.73 ±  2%  perf-profile.children.cycles-pp.unmap_region
      0.75            -0.0        0.70 ±  3%  perf-profile.children.cycles-pp.unmap_vmas
      0.75            -0.0        0.70 ±  3%  perf-profile.children.cycles-pp.unmap_page_range
      0.26 ±  6%      -0.0        0.22 ±  8%  perf-profile.children.cycles-pp.___might_sleep
      0.93            -0.0        0.89        perf-profile.children.cycles-pp.__pagevec_lru_add_fn
      0.18 ±  6%      -0.0        0.15 ±  2%  perf-profile.children.cycles-pp.menu_select
      0.23 ±  5%      -0.0        0.20 ±  3%  perf-profile.children.cycles-pp.__sched_text_start
      0.23 ±  3%      -0.0        0.20 ±  4%  perf-profile.children.cycles-pp.irq_exit
      0.38 ±  3%      -0.0        0.35        perf-profile.children.cycles-pp.tlb_flush_mmu_free
      0.18 ±  2%      -0.0        0.15 ±  5%  perf-profile.children.cycles-pp.__softirqentry_text_start
      0.11 ±  6%      -0.0        0.09 ±  7%  perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
      0.25 ±  3%      -0.0        0.23 ±  3%  perf-profile.children.cycles-pp.free_unref_page_list
      0.10 ±  5%      -0.0        0.08 ±  5%  perf-profile.children.cycles-pp.schedule_idle
      0.17 ±  2%      +0.0        0.21 ±  6%  perf-profile.children.cycles-pp.page_add_new_anon_rmap
      0.21 ±  5%      +0.0        0.26 ±  9%  perf-profile.children.cycles-pp.page_mapping
      0.00            +0.1        0.05 ±  9%  perf-profile.children.cycles-pp._cond_resched
      4.01            +0.2        4.18        perf-profile.children.cycles-pp.find_lock_entry
      4.04            +0.2        4.22        perf-profile.children.cycles-pp.shmem_getpage_gfp
      0.03 ±173%      +0.3        0.37        perf-profile.children.cycles-pp.mem_cgroup_from_task
     96.04            +0.7       96.73        perf-profile.children.cycles-pp.testcase
     14.42            +1.8       16.17        perf-profile.children.cycles-pp.copy_user_highpage
     14.36            +1.8       16.11        perf-profile.children.cycles-pp.copy_page
     54.36            +3.6       57.95        perf-profile.children.cycles-pp.handle_mm_fault
     11.18            -1.7        9.46        perf-profile.self.cycles-pp.__handle_mm_fault
      4.64 ±  4%      -1.3        3.32 ±  3%  perf-profile.self.cycles-pp.put_page
      9.93 ±  2%      -0.9        9.00        perf-profile.self.cycles-pp.up_read
      2.04 ±  6%      -0.4        1.60 ±  3%  perf-profile.self.cycles-pp.intel_idle
      1.44            -0.3        1.17        perf-profile.self.cycles-pp.unlock_page
      1.16            -0.2        0.95        perf-profile.self.cycles-pp.__lru_cache_add
      0.57 ±  7%      -0.1        0.45        perf-profile.self.cycles-pp.native_irq_return_iret
      0.67 ±  2%      -0.1        0.57 ±  2%  perf-profile.self.cycles-pp.lru_cache_add_active_or_unevictable
      0.28 ±  8%      -0.1        0.20 ±  4%  perf-profile.self.cycles-pp.update_cfs_group
      0.29 ±  3%      -0.1        0.20 ±  9%  perf-profile.self.cycles-pp.task_tick_fair
      0.82 ±  2%      -0.1        0.76        perf-profile.self.cycles-pp.__pagevec_lru_add_fn
      0.60 ±  4%      -0.1        0.54 ±  3%  perf-profile.self.cycles-pp.testcase
      0.26 ±  5%      -0.0        0.22 ±  6%  perf-profile.self.cycles-pp.___might_sleep
      0.18 ±  2%      -0.0        0.15 ±  4%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
      0.08 ± 15%      -0.0        0.06        perf-profile.self.cycles-pp.switch_mm_irqs_off
      0.12 ±  4%      -0.0        0.11 ±  3%  perf-profile.self.cycles-pp.free_pcppages_bulk
      0.18 ±  2%      +0.0        0.22 ±  5%  perf-profile.self.cycles-pp.__mod_node_page_state
      0.60            +0.1        0.71        perf-profile.self.cycles-pp.get_page_from_freelist
      0.86 ±  3%      +0.1        0.98 ±  7%  perf-profile.self.cycles-pp.find_lock_entry
      0.00            +0.1        0.13 ±  6%  perf-profile.self.cycles-pp.mem_cgroup_from_task
     14.13            +1.8       15.90        perf-profile.self.cycles-pp.copy_page
      0.99 ± 52%      +6.0        7.01 ±  4%  perf-profile.self.cycles-pp.handle_mm_fault


                                                                                
                            will-it-scale.per_thread_ops                        
                                                                                
  12000 +-+-----------------------------------------------------------------+   
        | +..                                                               |   
        |+             .+..           +    +                                |   
  11500 +-+  +. .+..+.+    +.+.       ::   ::                               |   
        |      +               +..+  :  : : :   +.     +                    |   
        |                          + :  : :  : :  +.. : :                   |   
  11000 +-+                         +    +   : :      : :                   |   
        |                                     +      +   +                  |   
  10500 +-+                                                                 |   
        |                                       O        O    O             |   
        |      O O  O      O               O  O   O             O         O |   
  10000 +-+  O          O      O    O O              O               O      O   
        O O           O      O           O             O    O      O   O    |   
        |                         O                                         |   
   9500 +-+-----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.1.0-rc4-00070-g364f784f0" of type "text/plain" (188810 bytes)

View attachment "job-script" of type "text/plain" (7389 bytes)

View attachment "job.yaml" of type "text/plain" (5077 bytes)

View attachment "reproduce" of type "text/plain" (315 bytes)
