Date:   Sun, 8 Dec 2019 23:39:49 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     David Howells <dhowells@...hat.com>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        lkp@...ts.01.org
Subject: [pipe] 3c0edea9b2:  lmbench3.PIPE.bandwidth.MB/sec -17.0% regression

Greetings,

FYI, we noticed a -17.0% regression of lmbench3.PIPE.bandwidth.MB/sec due to commit:


commit: 3c0edea9b29f9be6c093f236f762202b30ac9431 ("pipe: Remove sync on wake_ups")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: lmbench3
on test machine: 16 threads Intel(R) Xeon(R) CPU D-1541 @ 2.10GHz with 48G memory
with the following parameters:

	test_memory_size: 50%
	nr_threads: 100%
	mode: development
	test: PIPE
	cpufreq_governor: performance
	ucode: 0x7000019

test-url: http://www.bitmover.com/lmbench/
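
For readers unfamiliar with the PIPE test, the following is a rough Python sketch of what a pipe bandwidth measurement looks like: a writer pushes a fixed amount of data through a pipe while a reader drains it, and throughput is bytes over elapsed time. This is not lmbench's code. The function name `pipe_bandwidth`, the transfer size, and the chunk size are assumptions made for illustration, and it uses two threads in one process rather than two processes, so its wakeup and scheduling behaviour will differ from the real benchmark.

```python
import os
import threading
import time


def pipe_bandwidth(total_bytes=64 * 1024 * 1024, chunk=64 * 1024):
    """Push total_bytes through a pipe; return (bytes_received, MB_per_sec).

    Hypothetical helper for illustration only, not lmbench's bw_pipe.
    """
    r, w = os.pipe()
    payload = b"\0" * chunk

    def writer():
        sent = 0
        try:
            while sent < total_bytes:
                # os.write may write fewer bytes than requested, so count them.
                sent += os.write(w, payload[: min(chunk, total_bytes - sent)])
        finally:
            # Closing the write end lets the reader see end-of-file.
            os.close(w)

    t = threading.Thread(target=writer)
    start = time.perf_counter()
    t.start()

    received = 0
    while True:
        data = os.read(r, chunk)
        if not data:  # empty read means the writer closed its end
            break
        received += len(data)

    elapsed = time.perf_counter() - start
    t.join()
    os.close(r)
    return received, received / elapsed / 1e6


if __name__ == "__main__":
    nbytes, mbps = pipe_bandwidth()
    print(f"moved {nbytes} bytes at {mbps:.0f} MB/sec")
```

Every full pipe blocks the writer and every empty pipe blocks the reader, so how the other side is woken up plausibly affects a number like the one this sketch prints.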



If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>


Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_threads/rootfs/tbox_group/test/test_memory_size/testcase/ucode:
  gcc-7/performance/x86_64-rhel-7.6/development/100%/debian-x86_64-2019-11-14.cgz/lkp-bdw-de1/PIPE/50%/lmbench3/0x7000019

commit: 
  cefa80ced5 ("pipe: Increase the writer-wakeup threshold to reduce context-switch count")
  3c0edea9b2 ("pipe: Remove sync on wake_ups")

cefa80ced57a2917 3c0edea9b29f9be6c093f236f76 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     21204           -17.0%      17604        lmbench3.PIPE.bandwidth.MB/sec
     14.06           +91.9%      26.99        lmbench3.PIPE.latency.us
    134.47           +25.9%     169.28 ±  7%  lmbench3.time.elapsed_time
    134.47           +25.9%     169.28 ±  7%  lmbench3.time.elapsed_time.max
   3334392 ±  9%     -78.2%     725930 ±  4%  lmbench3.time.involuntary_context_switches
      1230 ±  2%     +25.4%       1542        lmbench3.time.system_time
  66167960 ±  2%     -24.0%   50257054        lmbench3.time.voluntary_context_switches
      5.83            -1.0        4.80        mpstat.cpu.all.usr%
    539839 ±  2%     -17.4%     445879 ±  6%  vmstat.system.cs
     39.29            -2.5%      38.31        boot-time.boot
    563.99            -2.7%     548.62        boot-time.idle
      6437            -1.7%       6330        proc-vmstat.nr_mapped
    683646            +5.9%     724211 ±  2%  proc-vmstat.pgfault
   1330273 ± 37%   +1281.6%   18379318        turbostat.C1
      0.74 ± 65%      +2.6        3.37 ±  6%  turbostat.C1%
   4392819 ±  5%     +31.9%    5792490 ±  5%  turbostat.IRQ
  16257279 ± 65%    +462.7%   91474668        cpuidle.C1.time
   1331447 ± 37%   +1280.5%   18380948        cpuidle.C1.usage
   5997091 ± 32%    +266.2%   21964255        cpuidle.POLL.time
   3142700 ±  5%    +121.5%    6962552        cpuidle.POLL.usage
      0.00          +9e+11%       8988 ±111%  sched_debug.cfs_rq:/.MIN_vruntime.avg
      0.00        +7.7e+12%      76535 ± 86%  sched_debug.cfs_rq:/.MIN_vruntime.max
      0.00 ± 15%  +1.2e+28%      22897 ± 92%  sched_debug.cfs_rq:/.MIN_vruntime.stddev
     17.75 ± 16%     +66.7%      29.58 ± 20%  sched_debug.cfs_rq:/.load_avg.min
    155.25 ±  8%     -22.5%     120.28 ±  7%  sched_debug.cfs_rq:/.load_avg.stddev
      0.00          +9e+11%       8988 ±111%  sched_debug.cfs_rq:/.max_vruntime.avg
      0.00        +7.7e+12%      76535 ± 86%  sched_debug.cfs_rq:/.max_vruntime.max
      0.00 ± 15%  +1.2e+28%      22897 ± 92%  sched_debug.cfs_rq:/.max_vruntime.stddev
     63.45 ± 22%     -41.5%      37.15 ± 32%  sched_debug.cfs_rq:/.runnable_load_avg.stddev
    247.25 ± 20%     +93.6%     478.77 ± 16%  sched_debug.cfs_rq:/.util_avg.min
    380.01 ±  9%     -27.2%     276.67 ±  7%  sched_debug.cfs_rq:/.util_avg.stddev
    342.89 ±  3%     -15.9%     288.48 ±  6%  sched_debug.cfs_rq:/.util_est_enqueued.avg
    997.25 ±  3%     -23.2%     766.29 ± 15%  sched_debug.cfs_rq:/.util_est_enqueued.max
    294.50 ±  4%     -30.6%     204.35 ± 14%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
    534028 ±  9%     -37.7%     332865 ± 17%  sched_debug.cpu.avg_idle.avg
    896297 ±  5%     -29.2%     634560 ±  8%  sched_debug.cpu.avg_idle.max
    260841 ±  4%     -28.5%     186459 ±  8%  sched_debug.cpu.avg_idle.stddev
      1597 ±  2%     +31.5%       2099 ±  5%  sched_debug.cpu.curr->pid.avg
      0.95 ± 11%     +39.9%       1.33 ± 11%  sched_debug.cpu.nr_running.avg
      0.59 ± 11%     +20.8%       0.71 ±  9%  sched_debug.cpu.nr_running.stddev
    525011 ±  8%     +16.5%     611478 ±  2%  sched_debug.cpu.nr_switches.stddev
     36.67 ± 12%     +51.6%      55.58 ± 18%  sched_debug.cpu.nr_uninterruptible.max
    -37.75           +68.8%     -63.73        sched_debug.cpu.nr_uninterruptible.min
     20.71 ± 16%     +52.8%      31.65 ±  9%  sched_debug.cpu.nr_uninterruptible.stddev
    524055 ±  8%     +16.4%     610231 ±  2%  sched_debug.cpu.sched_count.stddev
    179082 ± 15%    +431.0%     950904 ±  4%  sched_debug.cpu.sched_goidle.avg
    941875 ±  8%     +90.8%    1797084        sched_debug.cpu.sched_goidle.max
     57580 ± 47%   +1330.6%     823745 ±  6%  sched_debug.cpu.sched_goidle.min
    261290 ±  8%     +16.8%     305219 ±  2%  sched_debug.cpu.sched_goidle.stddev
   2351907 ±  2%     -22.8%    1815202 ±  5%  sched_debug.cpu.ttwu_count.avg
   3121048 ±  4%     -14.7%    2661346 ±  2%  sched_debug.cpu.ttwu_count.max
   2204420 ±  2%     -23.6%    1683662 ±  6%  sched_debug.cpu.ttwu_count.min
    263284 ±  8%     +15.9%     305082 ±  3%  sched_debug.cpu.ttwu_count.stddev
   2145030 ±  3%     -94.7%     113807 ±  2%  sched_debug.cpu.ttwu_local.avg
   2170062 ±  3%     -94.2%     125644 ± 10%  sched_debug.cpu.ttwu_local.max
   2117317 ±  3%     -94.8%     109613 ±  3%  sched_debug.cpu.ttwu_local.min
     13155 ± 17%     -70.2%       3923 ± 89%  sched_debug.cpu.ttwu_local.stddev
      8071 ±  4%     +15.3%       9304 ± 11%  softirqs.CPU0.RCU
     13782 ± 12%    +105.3%      28301 ±  6%  softirqs.CPU0.SCHED
     57292 ±  5%     +20.8%      69210 ±  6%  softirqs.CPU0.TIMER
     10409 ± 12%    +146.9%      25705 ±  5%  softirqs.CPU1.SCHED
      8100 ±  2%     +18.5%       9598 ± 10%  softirqs.CPU10.RCU
     10535 ±  9%    +131.8%      24423 ±  7%  softirqs.CPU10.SCHED
     63176 ±  7%     +12.3%      70916 ±  4%  softirqs.CPU10.TIMER
     10881 ±  9%    +139.9%      26107 ±  8%  softirqs.CPU11.SCHED
     59833 ±  4%     +19.4%      71424 ± 14%  softirqs.CPU11.TIMER
     10829 ±  2%    +129.5%      24855 ±  8%  softirqs.CPU12.SCHED
     56830 ±  4%     +21.9%      69297 ± 10%  softirqs.CPU12.TIMER
      8338 ±  7%     +15.6%       9636 ± 13%  softirqs.CPU13.RCU
      9965 ± 19%    +164.7%      26377 ±  8%  softirqs.CPU13.SCHED
      7913 ±  5%     +18.8%       9400 ± 13%  softirqs.CPU14.RCU
     10301 ± 13%    +137.8%      24499 ± 13%  softirqs.CPU14.SCHED
     58108 ±  4%     +23.3%      71658 ±  3%  softirqs.CPU14.TIMER
     10402 ±  9%    +141.5%      25126 ±  7%  softirqs.CPU15.SCHED
     60348 ±  2%     +17.2%      70757 ± 10%  softirqs.CPU15.TIMER
     12252 ±  8%    +116.1%      26482 ±  5%  softirqs.CPU2.SCHED
     60066 ±  4%     +21.8%      73163 ± 12%  softirqs.CPU2.TIMER
      8195 ±  4%     +14.2%       9358 ±  7%  softirqs.CPU3.RCU
     11267 ± 17%    +123.3%      25161 ±  3%  softirqs.CPU3.SCHED
     57990 ±  5%     +25.0%      72490 ± 13%  softirqs.CPU3.TIMER
     11242 ± 12%    +134.0%      26309 ±  9%  softirqs.CPU4.SCHED
     62003 ±  7%     +18.1%      73208        softirqs.CPU4.TIMER
     10623 ± 21%    +150.0%      26564 ±  8%  softirqs.CPU5.SCHED
     64978 ±  9%     +16.4%      75636 ±  9%  softirqs.CPU5.TIMER
      9820 ± 20%    +159.9%      25525 ±  7%  softirqs.CPU6.SCHED
     60525 ± 13%     +25.3%      75836 ±  8%  softirqs.CPU6.TIMER
      8310 ±  9%     +15.4%       9587 ±  7%  softirqs.CPU7.RCU
     10202 ±  6%    +155.3%      26044 ± 11%  softirqs.CPU7.SCHED
     61012           +20.3%      73383 ±  8%  softirqs.CPU7.TIMER
      7810 ±  6%     +28.2%      10012 ± 12%  softirqs.CPU8.RCU
     11163 ±  3%    +135.9%      26339 ± 11%  softirqs.CPU8.SCHED
     57386 ±  5%     +22.0%      70029 ±  7%  softirqs.CPU8.TIMER
     11643 ± 15%    +121.4%      25780 ±  3%  softirqs.CPU9.SCHED
    138325 ±  2%     +13.1%     156403 ± 10%  softirqs.RCU
    175325 ±  2%    +135.9%     413606 ±  6%  softirqs.SCHED
    967685 ±  3%     +18.5%    1146565 ±  7%  softirqs.TIMER
     21179 ±  4%     +15.9%      24541 ±  6%  interrupts.CAL:Function_call_interrupts
      1346 ±  4%     +16.0%       1561 ± 10%  interrupts.CPU0.CAL:Function_call_interrupts
    255151 ±  7%     +33.5%     340712 ±  7%  interrupts.CPU0.LOC:Local_timer_interrupts
      1597 ± 17%    +536.7%      10169 ±  3%  interrupts.CPU0.RES:Rescheduling_interrupts
    262988 ±  3%     +29.4%     340284 ±  7%  interrupts.CPU1.LOC:Local_timer_interrupts
      1255 ± 15%    +656.0%       9492        interrupts.CPU1.RES:Rescheduling_interrupts
    259958 ±  5%     +31.0%     340654 ±  7%  interrupts.CPU10.LOC:Local_timer_interrupts
      1448 ± 21%    +526.9%       9079        interrupts.CPU10.RES:Rescheduling_interrupts
    264562 ±  2%     +28.7%     340366 ±  7%  interrupts.CPU11.LOC:Local_timer_interrupts
      1176 ±  9%    +682.6%       9207 ±  2%  interrupts.CPU11.RES:Rescheduling_interrupts
      1202 ± 17%     +31.8%       1584 ±  3%  interrupts.CPU12.CAL:Function_call_interrupts
    256269 ±  7%     +33.0%     340874 ±  7%  interrupts.CPU12.LOC:Local_timer_interrupts
      1240 ± 11%    +644.6%       9232        interrupts.CPU12.RES:Rescheduling_interrupts
    258695 ±  6%     +31.8%     340891 ±  7%  interrupts.CPU13.LOC:Local_timer_interrupts
      1329 ± 34%    +605.9%       9383        interrupts.CPU13.RES:Rescheduling_interrupts
      1161 ± 16%     +39.3%       1618 ±  7%  interrupts.CPU14.CAL:Function_call_interrupts
    259395 ±  5%     +31.3%     340636 ±  7%  interrupts.CPU14.LOC:Local_timer_interrupts
      1313 ± 26%    +603.5%       9240        interrupts.CPU14.RES:Rescheduling_interrupts
    256597 ±  7%     +32.6%     340354 ±  7%  interrupts.CPU15.LOC:Local_timer_interrupts
      1293 ± 10%    +615.1%       9250 ±  2%  interrupts.CPU15.RES:Rescheduling_interrupts
    259187 ±  5%     +31.2%     340034 ±  7%  interrupts.CPU2.LOC:Local_timer_interrupts
      1482 ± 18%    +566.1%       9873 ±  3%  interrupts.CPU2.RES:Rescheduling_interrupts
    264129 ±  2%     +28.8%     340116 ±  7%  interrupts.CPU3.LOC:Local_timer_interrupts
      1387 ± 19%    +584.8%       9503        interrupts.CPU3.RES:Rescheduling_interrupts
      1249 ± 19%     +19.5%       1493 ±  3%  interrupts.CPU4.CAL:Function_call_interrupts
    256073 ±  7%     +32.8%     340112 ±  7%  interrupts.CPU4.LOC:Local_timer_interrupts
      1372 ± 18%    +594.0%       9521 ±  4%  interrupts.CPU4.RES:Rescheduling_interrupts
    258411 ±  6%     +31.6%     340115 ±  7%  interrupts.CPU5.LOC:Local_timer_interrupts
      1267 ± 17%    +655.9%       9582 ±  2%  interrupts.CPU5.RES:Rescheduling_interrupts
    258716 ±  5%     +31.3%     339719 ±  7%  interrupts.CPU6.LOC:Local_timer_interrupts
      1220 ±  7%    +666.4%       9350 ±  2%  interrupts.CPU6.RES:Rescheduling_interrupts
      1192 ± 15%     +33.0%       1586 ±  8%  interrupts.CPU7.CAL:Function_call_interrupts
    255847 ±  7%     +32.8%     339836 ±  7%  interrupts.CPU7.LOC:Local_timer_interrupts
      1217 ± 13%    +687.4%       9588 ±  2%  interrupts.CPU7.RES:Rescheduling_interrupts
      1369 ±  5%     +14.4%       1566 ±  6%  interrupts.CPU8.CAL:Function_call_interrupts
    256492 ±  7%     +32.9%     340775 ±  7%  interrupts.CPU8.LOC:Local_timer_interrupts
      1222 ± 11%    +688.9%       9640 ±  7%  interrupts.CPU8.RES:Rescheduling_interrupts
    263589 ±  3%     +29.1%     340310 ±  7%  interrupts.CPU9.LOC:Local_timer_interrupts
      1486 ± 22%    +518.1%       9189 ±  2%  interrupts.CPU9.RES:Rescheduling_interrupts
   4146065 ±  5%     +31.3%    5445793 ±  7%  interrupts.LOC:Local_timer_interrupts
     21310 ± 11%    +610.0%     151304        interrupts.RES:Rescheduling_interrupts
     41.41            -8.2%      38.01 ±  3%  perf-stat.i.MPKI
 3.181e+09            -6.2%  2.984e+09 ±  6%  perf-stat.i.branch-instructions
      1.81 ±  2%      -0.1        1.69        perf-stat.i.branch-miss-rate%
  41240020           -15.2%   34981946 ±  8%  perf-stat.i.branch-misses
 5.891e+08 ±  2%     -10.6%  5.264e+08 ±  6%  perf-stat.i.cache-misses
 5.891e+08 ±  2%     -10.6%  5.264e+08 ±  6%  perf-stat.i.cache-references
    549435 ±  2%     -17.1%     455328 ±  7%  perf-stat.i.context-switches
      1.76            +9.8%       1.93        perf-stat.i.cpi
      8164 ± 30%   +1194.2%     105659 ±  7%  perf-stat.i.cpu-migrations
      0.10 ±  6%      +0.2        0.27 ±  3%  perf-stat.i.dTLB-load-miss-rate%
   3249270 ±  8%    +223.7%   10518062 ±  6%  perf-stat.i.dTLB-load-misses
  4.53e+09           -11.1%  4.025e+09 ±  5%  perf-stat.i.dTLB-loads
      0.05 ±  6%      +0.0        0.09 ±  3%  perf-stat.i.dTLB-store-miss-rate%
   1269803 ±  4%     +34.4%    1706370 ±  7%  perf-stat.i.dTLB-store-misses
 2.468e+09           -22.9%  1.904e+09 ±  6%  perf-stat.i.dTLB-stores
     51.76 ±  6%      +6.1       57.84 ±  2%  perf-stat.i.iTLB-load-miss-rate%
   2394143 ± 12%     +28.6%    3079996 ±  5%  perf-stat.i.iTLB-load-misses
   2553375 ±  2%     +46.1%    3730425 ±  6%  perf-stat.i.iTLB-loads
 1.464e+10            -8.7%  1.337e+10 ±  5%  perf-stat.i.instructions
      0.62            -9.2%       0.56        perf-stat.i.ipc
      4962           -15.3%       4204 ±  5%  perf-stat.i.minor-faults
      4962           -15.3%       4204 ±  5%  perf-stat.i.page-faults
     40.24            -2.1%      39.38        perf-stat.overall.MPKI
      1.30 ±  2%      -0.1        1.17 ±  2%  perf-stat.overall.branch-miss-rate%
      1.73           +11.8%       1.93        perf-stat.overall.cpi
     43.00           +14.2%      49.10        perf-stat.overall.cycles-between-cache-misses
      0.07 ±  9%      +0.2        0.26 ±  2%  perf-stat.overall.dTLB-load-miss-rate%
      0.05 ±  4%      +0.0        0.09        perf-stat.overall.dTLB-store-miss-rate%
     48.21 ±  5%      -3.0       45.24        perf-stat.overall.iTLB-load-miss-rate%
      6211 ± 12%     -30.1%       4340        perf-stat.overall.instructions-per-iTLB-miss
      0.58           -10.5%       0.52        perf-stat.overall.ipc
 3.157e+09            -6.0%  2.966e+09 ±  6%  perf-stat.ps.branch-instructions
  40940076           -15.1%   34774814 ±  8%  perf-stat.ps.branch-misses
 5.846e+08 ±  2%     -10.5%  5.232e+08 ±  5%  perf-stat.ps.cache-misses
 5.846e+08 ±  2%     -10.5%  5.232e+08 ±  5%  perf-stat.ps.cache-references
    545185 ±  2%     -17.0%     452472 ±  7%  perf-stat.ps.context-switches
      8100 ± 30%   +1196.1%     104990 ±  7%  perf-stat.ps.cpu-migrations
   3224891 ±  8%    +224.1%   10452308 ±  6%  perf-stat.ps.dTLB-load-misses
 4.495e+09           -11.0%      4e+09 ±  5%  perf-stat.ps.dTLB-loads
   1260065 ±  4%     +34.6%    1695722 ±  7%  perf-stat.ps.dTLB-store-misses
 2.449e+09           -22.7%  1.892e+09 ±  6%  perf-stat.ps.dTLB-stores
   2375668 ± 12%     +28.8%    3060737 ±  5%  perf-stat.ps.iTLB-load-misses
   2533621 ±  2%     +46.3%    3706931 ±  6%  perf-stat.ps.iTLB-loads
 1.453e+10            -8.5%  1.328e+10 ±  5%  perf-stat.ps.instructions
      4930           -15.2%       4181 ±  5%  perf-stat.ps.minor-faults
      4931           -15.2%       4181 ±  5%  perf-stat.ps.page-faults
 1.964e+12 ±  2%     +14.3%  2.244e+12        perf-stat.total.instructions
     10.67 ± 18%     -10.7        0.00        perf-profile.calltrace.cycles-pp.activate_task.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common
     10.60 ± 18%     -10.6        0.00        perf-profile.calltrace.cycles-pp.enqueue_task_fair.activate_task.ttwu_do_activate.try_to_wake_up.autoremove_wake_function
     21.89 ±  6%      -9.8       12.08 ± 17%  perf-profile.calltrace.cycles-pp.copy_page_to_iter.pipe_read.new_sync_read.vfs_read.ksys_read
     21.03 ±  4%      -9.4       11.63 ± 33%  perf-profile.calltrace.cycles-pp.copy_page_from_iter.pipe_write.new_sync_write.vfs_write.ksys_write
      8.95 ± 18%      -9.0        0.00        perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.activate_task.ttwu_do_activate.try_to_wake_up
     18.52 ±  5%      -8.8        9.70 ± 32%  perf-profile.calltrace.cycles-pp.copyin.copy_page_from_iter.pipe_write.new_sync_write.vfs_write
     18.21 ±  5%      -8.6        9.58 ± 32%  perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin.copy_page_from_iter.pipe_write.new_sync_write
     19.21 ±  7%      -8.4       10.80 ± 16%  perf-profile.calltrace.cycles-pp.copyout.copy_page_to_iter.pipe_read.new_sync_read.vfs_read
     18.95 ±  7%      -8.3       10.68 ± 16%  perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyout.copy_page_to_iter.pipe_read.new_sync_read
      7.84 ± 18%      -7.8        0.00        perf-profile.calltrace.cycles-pp.__account_scheduler_latency.enqueue_entity.enqueue_task_fair.activate_task.ttwu_do_activate
      7.68 ± 17%      -7.7        0.00        perf-profile.calltrace.cycles-pp.__wake_up_common.pipe_read.new_sync_read.vfs_read.ksys_read
      7.60 ± 17%      -7.6        0.00        perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.pipe_read.new_sync_read.vfs_read
      7.51 ± 17%      -7.5        0.00        perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.pipe_read.new_sync_read
      6.83 ± 17%      -6.8        0.00        perf-profile.calltrace.cycles-pp.__wake_up_common.pipe_write.new_sync_write.vfs_write.ksys_write
      6.51 ± 17%      -6.5        0.00        perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.pipe_write.new_sync_write.vfs_write
      6.43 ± 17%      -6.4        0.00        perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.pipe_write.new_sync_write
      6.27 ± 18%      -6.3        0.00        perf-profile.calltrace.cycles-pp.stack_trace_save_tsk.__account_scheduler_latency.enqueue_entity.enqueue_task_fair.activate_task
      5.90 ± 18%      -5.9        0.00        perf-profile.calltrace.cycles-pp.arch_stack_walk.stack_trace_save_tsk.__account_scheduler_latency.enqueue_entity.enqueue_task_fair
      5.77 ± 18%      -5.8        0.00        perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.pipe_read
      4.95 ± 18%      -4.9        0.00        perf-profile.calltrace.cycles-pp.ttwu_do_activate.try_to_wake_up.autoremove_wake_function.__wake_up_common.pipe_write
      0.91 ±  4%      -0.5        0.41 ± 58%  perf-profile.calltrace.cycles-pp.alloc_pages_current.pipe_write.new_sync_write.vfs_write.ksys_write
      0.73 ± 11%      +0.4        1.17 ±  5%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.pipe_read.new_sync_read.vfs_read.ksys_read
      0.35 ±173%      +3.3        3.62 ± 45%  perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page.pipe_read.new_sync_read.vfs_read
      0.00           +14.4       14.43 ±154%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
      0.00           +15.1       15.06 ±154%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary
      0.00           +15.1       15.13 ±154%  perf-profile.calltrace.cycles-pp.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
      2.68 ±111%     +15.2       17.90 ± 37%  perf-profile.calltrace.cycles-pp.mutex_spin_on_owner.__mutex_lock.pipe_write.new_sync_write.vfs_write
      2.72 ±111%     +15.5       18.19 ± 37%  perf-profile.calltrace.cycles-pp.__mutex_lock.pipe_write.new_sync_write.vfs_write.ksys_write
      0.00           +15.7       15.75 ±153%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
      0.00           +15.8       15.75 ±153%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
      0.00           +15.8       15.75 ±153%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
      0.00           +16.2       16.17 ±153%  perf-profile.calltrace.cycles-pp.secondary_startup_64
      2.86 ±113%     +16.3       19.18 ± 24%  perf-profile.calltrace.cycles-pp.mutex_spin_on_owner.__mutex_lock.pipe_read.new_sync_read.vfs_read
      2.90 ±113%     +16.5       19.44 ± 24%  perf-profile.calltrace.cycles-pp.__mutex_lock.pipe_read.new_sync_read.vfs_read.ksys_read
     37.47 ±  6%     -17.1       20.39 ± 23%  perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
     14.56 ± 17%     -14.1        0.45 ± 51%  perf-profile.children.cycles-pp.__wake_up_common
     14.43 ± 15%     -14.0        0.41 ± 50%  perf-profile.children.cycles-pp.try_to_wake_up
     14.12 ± 17%     -13.8        0.33 ± 57%  perf-profile.children.cycles-pp.autoremove_wake_function
     11.05 ± 16%     -10.8        0.27 ± 57%  perf-profile.children.cycles-pp.ttwu_do_activate
     11.02 ± 16%     -10.7        0.27 ± 57%  perf-profile.children.cycles-pp.activate_task
     10.95 ± 16%     -10.7        0.27 ± 57%  perf-profile.children.cycles-pp.enqueue_task_fair
     22.01 ±  6%      -9.9       12.13 ± 17%  perf-profile.children.cycles-pp.copy_page_to_iter
     21.16 ±  4%      -9.5       11.69 ± 33%  perf-profile.children.cycles-pp.copy_page_from_iter
      9.27 ± 16%      -9.0        0.22 ± 57%  perf-profile.children.cycles-pp.enqueue_entity
     18.59 ±  5%      -8.9        9.73 ± 32%  perf-profile.children.cycles-pp.copyin
     19.27 ±  7%      -8.4       10.84 ± 16%  perf-profile.children.cycles-pp.copyout
      8.10 ± 16%      -7.9        0.19 ± 57%  perf-profile.children.cycles-pp.__account_scheduler_latency
      7.89 ± 15%      -7.3        0.61 ± 47%  perf-profile.children.cycles-pp.pipe_wait
      7.24 ± 14%      -6.9        0.37 ± 48%  perf-profile.children.cycles-pp.__schedule
      7.12 ± 15%      -6.8        0.29 ± 59%  perf-profile.children.cycles-pp.schedule
      6.48 ± 16%      -6.3        0.16 ± 57%  perf-profile.children.cycles-pp.stack_trace_save_tsk
      6.21 ± 16%      -6.0        0.15 ± 57%  perf-profile.children.cycles-pp.arch_stack_walk
      3.81 ± 16%      -3.7        0.09 ± 59%  perf-profile.children.cycles-pp.unwind_next_frame
      2.79 ± 15%      -2.7        0.11 ± 59%  perf-profile.children.cycles-pp.dequeue_task_fair
      2.13 ± 14%      -2.1        0.08 ± 57%  perf-profile.children.cycles-pp.select_task_rq_fair
      1.69 ± 14%      -1.6        0.06 ± 59%  perf-profile.children.cycles-pp.select_idle_sibling
      1.48 ± 15%      -1.4        0.07 ± 58%  perf-profile.children.cycles-pp.update_curr
      1.19 ± 16%      -1.2        0.03 ±100%  perf-profile.children.cycles-pp.reweight_entity
      1.25 ± 12%      -1.2        0.10 ± 59%  perf-profile.children.cycles-pp.pick_next_task_fair
      1.20 ± 12%      -1.1        0.08 ± 58%  perf-profile.children.cycles-pp.update_load_avg
      1.11 ± 16%      -1.1        0.06 ± 63%  perf-profile.children.cycles-pp.dequeue_entity
      1.09 ± 19%      -1.0        0.13 ± 19%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
      0.99 ± 12%      -0.9        0.05 ± 58%  perf-profile.children.cycles-pp.switch_mm_irqs_off
      1.05 ±  6%      -0.6        0.41 ± 38%  perf-profile.children.cycles-pp.security_file_permission
      1.67 ±  3%      -0.6        1.09 ± 46%  perf-profile.children.cycles-pp.__might_fault
      1.81            -0.6        1.24 ± 20%  perf-profile.children.cycles-pp.___might_sleep
      0.74 ±  8%      -0.5        0.26 ± 46%  perf-profile.children.cycles-pp.free_unref_page_prepare
      0.95 ±  4%      -0.5        0.49 ± 32%  perf-profile.children.cycles-pp.alloc_pages_current
      1.38 ±  3%      -0.5        0.91 ± 41%  perf-profile.children.cycles-pp.__might_sleep
      0.79 ±  3%      -0.4        0.35 ± 21%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.79 ±  5%      -0.4        0.36 ± 21%  perf-profile.children.cycles-pp._cond_resched
      0.66 ±  4%      -0.4        0.24 ± 39%  perf-profile.children.cycles-pp.selinux_file_permission
      0.51 ± 11%      -0.4        0.10 ± 57%  perf-profile.children.cycles-pp.generic_pipe_buf_confirm
      0.47 ± 45%      -0.4        0.07 ± 58%  perf-profile.children.cycles-pp.__mutex_unlock_slowpath
      0.46 ± 45%      -0.4        0.07 ± 59%  perf-profile.children.cycles-pp.wake_up_q
      0.80 ±  7%      -0.4        0.40 ± 32%  perf-profile.children.cycles-pp.free_unref_page_commit
      0.68 ±  8%      -0.4        0.31 ± 20%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      0.40 ±  8%      -0.3        0.10 ± 79%  perf-profile.children.cycles-pp.native_write_msr
      0.43 ±  8%      -0.3        0.16 ± 19%  perf-profile.children.cycles-pp.__fdget_pos
      0.41 ±  7%      -0.3        0.15 ± 20%  perf-profile.children.cycles-pp.__fget_light
      0.45 ±  9%      -0.2        0.20 ± 25%  perf-profile.children.cycles-pp.__put_page
      0.42 ±  7%      -0.2        0.18 ± 13%  perf-profile.children.cycles-pp.__x86_indirect_thunk_rax
      0.36 ±  3%      -0.2        0.17 ± 28%  perf-profile.children.cycles-pp.rcu_all_qs
      0.28 ± 12%      -0.2        0.10 ± 58%  perf-profile.children.cycles-pp.file_has_perm
      0.40 ±  9%      -0.2        0.24 ± 14%  perf-profile.children.cycles-pp.mutex_lock
      0.23 ±  9%      -0.1        0.08 ± 58%  perf-profile.children.cycles-pp.fsnotify
      0.21 ± 14%      -0.1        0.08 ± 57%  perf-profile.children.cycles-pp.avc_has_perm
      0.26 ± 10%      -0.1        0.13 ± 57%  perf-profile.children.cycles-pp.__inc_numa_state
      0.15 ± 11%      -0.1        0.03 ±100%  perf-profile.children.cycles-pp.__fsnotify_parent
      0.21 ±  8%      -0.1        0.09 ± 57%  perf-profile.children.cycles-pp.get_task_policy
      0.20 ±  6%      -0.1        0.09 ± 27%  perf-profile.children.cycles-pp.__page_cache_release
      0.17 ± 15%      -0.1        0.06 ± 58%  perf-profile.children.cycles-pp.mem_cgroup_uncharge
      0.21 ±  2%      -0.1        0.12 ± 28%  perf-profile.children.cycles-pp.touch_atime
      0.15 ±  9%      -0.1        0.06 ± 58%  perf-profile.children.cycles-pp.free_pcp_prepare
      0.13 ± 20%      -0.1        0.05 ± 58%  perf-profile.children.cycles-pp.prep_new_page
      0.12 ±  8%      -0.1        0.06 ± 58%  perf-profile.children.cycles-pp.policy_nodemask
      0.13 ±  9%      -0.1        0.07 ± 58%  perf-profile.children.cycles-pp.atime_needs_update
      0.11 ±  9%      -0.1        0.05 ± 58%  perf-profile.children.cycles-pp.file_update_time
      0.10 ±  7%      -0.1        0.04 ± 58%  perf-profile.children.cycles-pp.__inode_security_revalidate
      0.10 ± 15%      -0.1        0.04 ± 58%  perf-profile.children.cycles-pp.should_fail_alloc_page
      0.09 ±  8%      -0.1        0.04 ± 58%  perf-profile.children.cycles-pp.policy_node
      0.09 ±  7%      -0.0        0.05 ± 58%  perf-profile.children.cycles-pp.current_time
      0.14 ±  5%      -0.0        0.12 ±  9%  perf-profile.children.cycles-pp.__wake_up_common_lock
      0.07 ± 17%      +0.1        0.13 ± 45%  perf-profile.children.cycles-pp.scheduler_tick
      0.00            +0.1        0.08 ± 26%  perf-profile.children.cycles-pp.osq_lock
      0.03 ±173%      +0.2        0.20 ± 40%  perf-profile.children.cycles-pp.__mod_zone_page_state
      0.00            +0.3        0.27 ±130%  perf-profile.children.cycles-pp.irq_exit
      0.00            +0.4        0.42 ±125%  perf-profile.children.cycles-pp.start_kernel
      1.56 ±  6%      +0.5        2.08 ± 12%  perf-profile.children.cycles-pp._raw_spin_lock_irq
      0.34 ±  8%      +0.7        1.09 ±108%  perf-profile.children.cycles-pp.smp_apic_timer_interrupt
      0.49 ±110%      +3.2        3.65 ± 44%  perf-profile.children.cycles-pp.free_pcppages_bulk
      0.08 ±133%     +14.4       14.51 ±153%  perf-profile.children.cycles-pp.intel_idle
      0.09 ±132%     +15.4       15.54 ±153%  perf-profile.children.cycles-pp.cpuidle_enter_state
      0.09 ±132%     +15.5       15.54 ±153%  perf-profile.children.cycles-pp.cpuidle_enter
      0.10 ±134%     +15.7       15.75 ±153%  perf-profile.children.cycles-pp.start_secondary
      0.11 ±132%     +16.1       16.17 ±153%  perf-profile.children.cycles-pp.secondary_startup_64
      0.11 ±132%     +16.1       16.17 ±153%  perf-profile.children.cycles-pp.cpu_startup_entry
      0.11 ±132%     +16.1       16.18 ±153%  perf-profile.children.cycles-pp.do_idle
      5.57 ±112%     +31.8       37.39 ± 31%  perf-profile.children.cycles-pp.mutex_spin_on_owner
      5.83 ±110%     +32.1       37.97 ± 30%  perf-profile.children.cycles-pp.__mutex_lock
     37.23 ±  6%     -17.0       20.18 ± 24%  perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
      1.59 ± 15%      -1.5        0.04 ± 58%  perf-profile.self.cycles-pp.unwind_next_frame
      2.09 ±  6%      -1.1        0.99 ± 18%  perf-profile.self.cycles-pp.do_syscall_64
      3.12            -0.9        2.21 ± 43%  perf-profile.self.cycles-pp.get_page_from_freelist
      1.27 ±  5%      -0.7        0.53 ± 38%  perf-profile.self.cycles-pp.pipe_write
      0.74 ± 17%      -0.7        0.03 ±100%  perf-profile.self.cycles-pp.update_curr
      1.21 ± 10%      -0.7        0.53 ± 26%  perf-profile.self.cycles-pp.pipe_read
      0.69 ± 14%      -0.7        0.03 ±100%  perf-profile.self.cycles-pp.__schedule
      1.59 ±  3%      -0.6        1.01 ± 40%  perf-profile.self.cycles-pp.__alloc_pages_nodemask
      1.77            -0.6        1.21 ± 20%  perf-profile.self.cycles-pp.___might_sleep
      1.09 ±  6%      -0.5        0.58 ± 30%  perf-profile.self.cycles-pp.copy_page_to_iter
      0.71 ±  8%      -0.5        0.22 ± 44%  perf-profile.self.cycles-pp.free_unref_page_prepare
      0.61 ± 12%      -0.5        0.12 ± 18%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
      0.79 ±  3%      -0.4        0.35 ± 21%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.99 ±  4%      -0.4        0.59 ± 39%  perf-profile.self.cycles-pp.copy_page_from_iter
      0.46 ± 11%      -0.4        0.08 ± 57%  perf-profile.self.cycles-pp.generic_pipe_buf_confirm
      1.22 ±  3%      -0.4        0.85 ± 41%  perf-profile.self.cycles-pp.__might_sleep
      0.68 ±  8%      -0.4        0.31 ± 20%  perf-profile.self.cycles-pp.entry_SYSCALL_64
      0.55 ±  6%      -0.4        0.20 ± 37%  perf-profile.self.cycles-pp.selinux_file_permission
      0.69 ±  8%      -0.3        0.36 ± 23%  perf-profile.self.cycles-pp.free_unref_page
      0.43 ±  7%      -0.3        0.11 ± 57%  perf-profile.self.cycles-pp.__wake_up_common
      0.47 ±  5%      -0.3        0.17 ± 40%  perf-profile.self.cycles-pp.__might_fault
      0.39 ±  9%      -0.3        0.10 ± 79%  perf-profile.self.cycles-pp.native_write_msr
      0.55 ±  7%      -0.3        0.28 ± 28%  perf-profile.self.cycles-pp.free_unref_page_commit
      0.53 ±  8%      -0.3        0.26 ± 40%  perf-profile.self.cycles-pp.alloc_pages_current
      0.40 ±  8%      -0.3        0.15 ± 19%  perf-profile.self.cycles-pp.__fget_light
      0.27 ± 12%      -0.2        0.04 ± 59%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.37 ±  8%      -0.2        0.17 ± 10%  perf-profile.self.cycles-pp.__x86_indirect_thunk_rax
      0.32 ±  3%      -0.2        0.15 ± 30%  perf-profile.self.cycles-pp._cond_resched
      0.22 ±  6%      -0.1        0.08 ± 62%  perf-profile.self.cycles-pp.copyin
      0.17 ±  7%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.new_sync_write
      0.18 ±  9%      -0.1        0.04 ± 58%  perf-profile.self.cycles-pp.new_sync_read
      0.27 ±  5%      -0.1        0.13 ± 36%  perf-profile.self.cycles-pp.rcu_all_qs
      0.21 ±  8%      -0.1        0.08 ± 58%  perf-profile.self.cycles-pp.fsnotify
      0.21 ± 14%      -0.1        0.08 ± 57%  perf-profile.self.cycles-pp.avc_has_perm
      0.14 ±  8%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.__fsnotify_parent
      0.23 ± 10%      -0.1        0.11 ± 57%  perf-profile.self.cycles-pp.__inc_numa_state
      0.20 ±  9%      -0.1        0.09 ± 58%  perf-profile.self.cycles-pp.get_task_policy
      0.18 ±  8%      -0.1        0.09 ± 25%  perf-profile.self.cycles-pp.__page_cache_release
      0.18 ±  4%      -0.1        0.09 ± 26%  perf-profile.self.cycles-pp.copyout
      0.13 ± 13%      -0.1        0.05 ± 58%  perf-profile.self.cycles-pp.mem_cgroup_uncharge
      0.14 ±  6%      -0.1        0.06 ± 58%  perf-profile.self.cycles-pp.free_pcp_prepare
      0.12 ± 21%      -0.1        0.05 ± 58%  perf-profile.self.cycles-pp.prep_new_page
      0.12 ± 14%      -0.1        0.06 ± 14%  perf-profile.self.cycles-pp.__put_page
      0.09 ±  9%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.policy_node
      0.11 ±  7%      -0.1        0.05 ± 58%  perf-profile.self.cycles-pp.policy_nodemask
      0.10 ±  9%      -0.1        0.04 ± 58%  perf-profile.self.cycles-pp.atime_needs_update
      0.09 ± 12%      -0.0        0.04 ± 58%  perf-profile.self.cycles-pp.security_file_permission
      0.00            +0.1        0.08 ± 26%  perf-profile.self.cycles-pp.osq_lock
      0.03 ±173%      +0.2        0.19 ± 42%  perf-profile.self.cycles-pp.__mod_zone_page_state
      0.06 ±128%      +0.3        0.38 ± 25%  perf-profile.self.cycles-pp.__mutex_lock
      1.50 ±  7%      +0.6        2.06 ± 12%  perf-profile.self.cycles-pp._raw_spin_lock_irq
      0.18 ±128%      +1.1        1.29 ± 36%  perf-profile.self.cycles-pp.free_pcppages_bulk
      0.08 ±133%     +14.4       14.50 ±153%  perf-profile.self.cycles-pp.intel_idle
      5.54 ±112%     +31.7       37.20 ± 31%  perf-profile.self.cycles-pp.mutex_spin_on_owner
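
For readers who want to post-process the perf-profile rows above, here is a
minimal sketch of a parser. It assumes the usual lkp comparison layout: the
value for the parent commit with an optional "± N%" stddev, the absolute
change, the value for the tested commit with an optional "± N%" stddev, then
the metric name. The names ROW, parse_row and delta_is_consistent are
hypothetical helpers, not part of lkp-tests.

```python
import re

# Assumed row layout: base [± sd%]  delta  new [± sd%]  metric-name
ROW = re.compile(
    r"^\s*(?P<base>\d+\.\d+)"
    r"(?:\s*±\s*(?P<base_sd>\d+)%)?"
    r"\s+(?P<delta>[+-]\d+\.\d+)"
    r"\s+(?P<new>\d+\.\d+)"
    r"(?:\s*±\s*(?P<new_sd>\d+)%)?"
    r"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line):
    """Return a dict for one comparison row, or None if the line does not match."""
    m = ROW.match(line)
    if m is None:
        return None
    sd = lambda s: int(s) if s is not None else None
    return {
        "metric": m.group("metric"),
        "base": float(m.group("base")),
        "base_sd": sd(m.group("base_sd")),
        "delta": float(m.group("delta")),
        "new": float(m.group("new")),
        "new_sd": sd(m.group("new_sd")),
    }


def delta_is_consistent(row, tol=0.1):
    """Check that delta is close to new - base.

    The printed values are rounded, so only agreement within a small
    tolerance is expected.
    """
    return abs((row["new"] - row["base"]) - row["delta"]) <= tol + 1e-9


if __name__ == "__main__":
    sample = [
        "      1.59 ± 15%      -1.5        0.04 ± 58%  perf-profile.self.cycles-pp.unwind_next_frame",
        "      5.54 ±112%     +31.7       37.20 ± 31%  perf-profile.self.cycles-pp.mutex_spin_on_owner",
    ]
    rows = [r for r in map(parse_row, sample) if r is not None]
    # Largest increases first; these are the symbols that gained cycles.
    for r in sorted(rows, key=lambda r: r["delta"], reverse=True):
        print(r["metric"], r["delta"], delta_is_consistent(r))
```

Sorting by the change column puts mutex_spin_on_owner and intel_idle at the
top for this report, which matches the order in which the table lists the
largest increases last.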


                                                                                
                       lmbench3.time.voluntary_context_switches                 
                                                                                
  8.5e+07 +-+---------------------------------------------------------------+   
          |          +                                                      |   
    8e+07 +-+        :                                                      |   
  7.5e+07 +-+       : : +         +                                         |   
          |+        : :+ :        :+        .+ .+ .+    .+  +               |   
    7e+07 +-++.  .++  +  +.+   +.+  ++.  .++  +  +  +.++  :+ :      .++.++. |   
  6.5e+07 +-+  ++           +.+        ++                 +  +.+   +       +|   
          |                                                     +. :        |   
    6e+07 +-+                                                     +         |   
  5.5e+07 +-+                                                               |   
          O  O O   O O         O        O O  O                              |   
    5e+07 +-O   O O     O  O  O   O  O O   O  O OO OO OO OO                 |   
  4.5e+07 +-+         O  O  O    O  O                                       |   
          |                                                                 |   
    4e+07 +-+---------------------------------------------------------------+   
                                                                                
                            lmbench3.PIPE.latency.us                            
                                                                                
  28 +-+--------------------------------------------------------------------+   
     O OO OO O OO O OO OO O OO OO O OO O OO OO O OO OO O O                  |   
  26 +-+                                                                    |   
  24 +-+                                                                    |   
     |                                                                      |   
  22 +-+                                                                    |   
     |                                                                      |   
  20 +-+                                                                    |   
     |                                                                      |   
  18 +-+                                                                    |   
  16 +-+                                                                    |   
     |                                                                      |   
  14 +-++.++.+.++.+.++.++.+.++.++.+.++.+.++.++.+.++.++.+.++.++.+.++.+.++.++.|   
     |                                                                      |   
  12 +-+--------------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.4.0-rc2-00012-g3c0edea9b29f9" of type "text/plain" (200609 bytes)

View attachment "job-script" of type "text/plain" (7928 bytes)

View attachment "job.yaml" of type "text/plain" (5658 bytes)

View attachment "reproduce" of type "text/plain" (566 bytes)
