Message-ID: <20191204095830.GZ18573@shao2-debian>
Date:   Wed, 4 Dec 2019 17:58:30 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Jens Axboe <axboe@...nel.dk>
Cc:     Christoph Hellwig <hch@....de>,
        LKML <linux-kernel@...r.kernel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        lkp@...ts.01.org
Subject: [block] 344e9ffcbd:  fsmark.files_per_sec 15.9% improvement

Greetings,

FYI, we noticed a 15.9% improvement in fsmark.files_per_sec due to commit:


commit: 344e9ffcbd1898e1dc04085564a6e05c30ea8199 ("block: add queue_is_mq() helper")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: fsmark
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
with the following parameters:

	iterations: 1x
	nr_threads: 32t
	disk: 1SSD
	fs: xfs
	filesize: 9B
	test_size: 400M
	sync_method: fsyncBeforeClose
	nr_directories: 16d
	nr_files_per_directory: 256fpd
	cpufreq_governor: performance
	ucode: 0x500002b

test-description: fsmark is a file system benchmark that tests synchronous write workloads, such as those of mail servers.
test-url: https://sourceforge.net/projects/fsmark/

In addition, the commit also has a significant impact on the following test:

+------------------+----------------------------------------------------------------------+
| testcase: change | fileio: iostat.sda.wkB/s 6.1% improvement                            |
| test machine     | 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory |
| test parameters  | cpufreq_governor=performance                                         |
|                  | disk=1HDD                                                            |
|                  | filenum=1024f                                                        |
|                  | fs=xfs                                                               |
|                  | iomode=sync                                                          |
|                  | nr_threads=100%                                                      |
|                  | period=600s                                                          |
|                  | rwmode=rndwr                                                         |
|                  | size=64G                                                             |
|                  | ucode=0xb00002e                                                      |
+------------------+----------------------------------------------------------------------+




Details are as follows:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/disk/filesize/fs/iterations/kconfig/nr_directories/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase/ucode:
  gcc-7/performance/1SSD/9B/xfs/1x/x86_64-rhel-7.6/16d/256fpd/32t/debian-x86_64-2019-09-23.cgz/fsyncBeforeClose/lkp-csl-2sp7/400M/fsmark/0x500002b

commit: 
  dabcefab45 ("nvme: provide optimized poll function for separate poll queues")
  344e9ffcbd ("block: add queue_is_mq() helper")

dabcefab45d36ecb 344e9ffcbd1898e1dc04085564a 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     16845           +15.9%      19530        fsmark.files_per_sec
    122073            -8.5%     111641 ±  2%  fsmark.time.involuntary_context_switches
    231.67 ±  2%      +7.5%     249.00 ±  2%  fsmark.time.percent_of_cpu_this_job_got
     15.09 ±  2%      -5.1%      14.33 ±  2%  fsmark.time.system_time
    542964            -2.6%     528818        fsmark.time.voluntary_context_switches
    429516 ±  3%     -16.7%     357836 ±  6%  meminfo.DirectMap4k
      2.85 ± 11%      -0.5        2.36 ±  8%  mpstat.cpu.all.iowait%
     12.99 ± 15%     +27.9%      16.61 ± 11%  turbostat.CPU%c6
    128324 ± 18%     +15.2%     147810 ± 13%  numa-numastat.node0.local_node
    159384 ± 15%     +12.3%     178965 ± 11%  numa-numastat.node0.numa_hit
      2.41 ± 11%     -17.0%       2.00 ±  7%  iostat.cpu.iowait
      2.41            -6.0%       2.27 ±  8%  iostat.sdb.avgqu-sz
      0.65 ±  2%     +85.2%       1.21 ± 82%  iostat.sdb.w_await.max
     15843 ±141%    -100.0%       1.25 ±131%  softirqs.CPU0.BLOCK
      9608 ±  7%     -10.4%       8613 ±  9%  softirqs.CPU30.TIMER
     10112 ±  5%     -13.0%       8801 ±  4%  softirqs.CPU9.TIMER
      3204 ± 11%     +15.4%       3697 ± 12%  slabinfo.eventpoll_pwq.active_objs
      3204 ± 11%     +15.4%       3697 ± 12%  slabinfo.eventpoll_pwq.num_objs
      8830 ±  3%     +16.8%      10314 ±  3%  slabinfo.kmalloc-1k.active_objs
      8932 ±  3%     +16.7%      10423 ±  3%  slabinfo.kmalloc-1k.num_objs
     67172            -1.4%      66257        proc-vmstat.nr_active_anon
     67077            -1.4%      66118        proc-vmstat.nr_anon_pages
     36011 ±  2%      -6.3%      33759 ±  4%  proc-vmstat.nr_inactive_file
     16948            -2.0%      16603        proc-vmstat.nr_kernel_stack
      1271            -5.1%       1207        proc-vmstat.nr_page_table_pages
     67172            -1.4%      66257        proc-vmstat.nr_zone_active_anon
     36011 ±  2%      -6.3%      33759 ±  4%  proc-vmstat.nr_zone_inactive_file
   1102105            +1.2%    1115271        proc-vmstat.pgpgout
     47857 ± 23%     -50.0%      23941 ± 52%  numa-vmstat.node0.nr_active_anon
     47776 ± 23%     -50.1%      23834 ± 52%  numa-vmstat.node0.nr_anon_pages
     32174 ±  5%     +28.2%      41236 ±  4%  numa-vmstat.node0.nr_dirtied
     33014 ±  5%     +27.5%      42078 ±  4%  numa-vmstat.node0.nr_written
     47857 ± 23%     -50.0%      23941 ± 52%  numa-vmstat.node0.nr_zone_active_anon
     19118 ± 55%    +121.4%      42322 ± 30%  numa-vmstat.node1.nr_active_anon
     19132 ± 55%    +121.0%      42290 ± 30%  numa-vmstat.node1.nr_anon_pages
     11003 ± 20%     -51.7%       5320 ± 34%  numa-vmstat.node1.nr_inactive_file
     19118 ± 55%    +121.4%      42322 ± 30%  numa-vmstat.node1.nr_zone_active_anon
     11003 ± 20%     -51.7%       5320 ± 34%  numa-vmstat.node1.nr_zone_inactive_file
    192100 ± 22%     -50.1%      95812 ± 52%  numa-meminfo.node0.Active
    192100 ± 22%     -50.1%      95766 ± 52%  numa-meminfo.node0.Active(anon)
    116053 ± 13%     -81.1%      21888 ± 87%  numa-meminfo.node0.AnonHugePages
    191689 ± 22%     -50.3%      95338 ± 52%  numa-meminfo.node0.AnonPages
    101256 ± 14%     +12.4%     113765 ± 11%  numa-meminfo.node0.Inactive(file)
     76790 ± 55%    +120.6%     169420 ± 30%  numa-meminfo.node1.Active
     76607 ± 55%    +121.0%     169282 ± 30%  numa-meminfo.node1.Active(anon)
     20934 ± 61%    +461.7%     117580 ± 19%  numa-meminfo.node1.AnonHugePages
     76670 ± 55%    +120.6%     169137 ± 30%  numa-meminfo.node1.AnonPages
     42821 ± 24%     -50.3%      21274 ± 34%  numa-meminfo.node1.Inactive(file)
    975968 ±  3%     +16.2%    1133711 ±  9%  numa-meminfo.node1.MemUsed
      1969 ±  9%     -16.2%       1650 ±  9%  sched_debug.cfs_rq:/.load.avg
      5724 ± 18%     -32.9%       3841 ± 20%  sched_debug.cfs_rq:/.min_vruntime.stddev
     13.90 ± 36%     +51.9%      21.12 ± 34%  sched_debug.cfs_rq:/.removed.load_avg.avg
    115.12 ± 18%     +23.7%     142.38 ± 17%  sched_debug.cfs_rq:/.removed.load_avg.stddev
    641.03 ± 37%     +52.7%     978.57 ± 34%  sched_debug.cfs_rq:/.removed.runnable_sum.avg
      5305 ± 18%     +24.3%       6596 ± 17%  sched_debug.cfs_rq:/.removed.runnable_sum.stddev
      5724 ± 18%     -33.1%       3826 ± 21%  sched_debug.cfs_rq:/.spread0.stddev
      2.02 ± 22%     -27.7%       1.46 ±  7%  sched_debug.cpu.cpu_load[1].avg
      2.57 ± 22%     -35.4%       1.66 ±  4%  sched_debug.cpu.cpu_load[2].avg
      7.94 ± 70%     -58.3%       3.31        sched_debug.cpu.cpu_load[2].stddev
      2.74 ± 17%     -33.4%       1.83 ±  7%  sched_debug.cpu.cpu_load[3].avg
      7.38 ± 49%     -54.4%       3.36 ±  6%  sched_debug.cpu.cpu_load[3].stddev
      2.38 ± 15%     -33.6%       1.58 ± 10%  sched_debug.cpu.cpu_load[4].avg
      9.89 ± 70%      -7.8        2.08 ±173%  perf-profile.calltrace.cycles-pp.path_openat.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
      7.33 ± 79%      -7.3        0.00        perf-profile.calltrace.cycles-pp.may_open.path_openat.do_filp_open.do_sys_open.do_syscall_64
      7.33 ± 79%      -7.3        0.00        perf-profile.calltrace.cycles-pp.security_inode_permission.may_open.path_openat.do_filp_open.do_sys_open
      7.33 ± 79%      -7.3        0.00        perf-profile.calltrace.cycles-pp.selinux_inode_permission.security_inode_permission.may_open.path_openat.do_filp_open
      7.33 ± 79%      -5.2        2.08 ±173%  perf-profile.calltrace.cycles-pp.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
      7.33 ± 79%      -5.2        2.08 ±173%  perf-profile.calltrace.cycles-pp.do_filp_open.do_sys_open.do_syscall_64.entry_SYSCALL_64_after_hwframe
      9.89 ± 70%      -7.8        2.08 ±173%  perf-profile.children.cycles-pp.do_sys_open
      9.89 ± 70%      -7.8        2.08 ±173%  perf-profile.children.cycles-pp.do_filp_open
      9.89 ± 70%      -7.8        2.08 ±173%  perf-profile.children.cycles-pp.path_openat
      7.33 ± 79%      -7.3        0.00        perf-profile.children.cycles-pp.__free_pages_ok
      7.33 ± 79%      -7.3        0.00        perf-profile.children.cycles-pp.free_one_page
      7.33 ± 79%      -7.3        0.00        perf-profile.children.cycles-pp.may_open
      5.90 ± 72%      -5.9        0.00        perf-profile.children.cycles-pp.__sched_text_start
      5.90 ± 72%      -5.9        0.00        perf-profile.children.cycles-pp.schedule
      7.33 ± 79%      -5.2        2.08 ±173%  perf-profile.children.cycles-pp.security_inode_permission
      7.33 ± 79%      -5.2        2.08 ±173%  perf-profile.children.cycles-pp.selinux_inode_permission
  17642600           -20.2%   14085540 ±  9%  perf-stat.i.cache-misses
      4907 ±  2%     -30.6%       3405 ± 13%  perf-stat.i.cpu-migrations
    952.61           +29.3%       1231 ± 14%  perf-stat.i.cycles-between-cache-misses
      7034           -12.4%       6163 ±  9%  perf-stat.i.minor-faults
   6367482           -22.7%    4919325 ±  9%  perf-stat.i.node-load-misses
     84.40            -6.6       77.77 ±  4%  perf-stat.i.node-store-miss-rate%
   2878736           -26.2%    2124511 ±  9%  perf-stat.i.node-store-misses
      7034           -12.4%       6163 ±  9%  perf-stat.i.page-faults
     15.35            -3.6       11.71 ± 12%  perf-stat.overall.cache-miss-rate%
    948.47           +29.6%       1229 ± 13%  perf-stat.overall.cycles-between-cache-misses
      0.01 ±  4%      -0.0        0.01 ± 12%  perf-stat.overall.dTLB-store-miss-rate%
     84.41            -6.4       78.03 ±  4%  perf-stat.overall.node-store-miss-rate%
  13526739 ±  2%     -21.7%   10587792 ± 14%  perf-stat.ps.cache-misses
      3761 ±  2%     -32.2%       2551 ± 14%  perf-stat.ps.cpu-migrations
      5395 ±  3%     -14.4%       4617 ± 11%  perf-stat.ps.minor-faults
   4882260 ±  2%     -24.4%    3691428 ± 12%  perf-stat.ps.node-load-misses
   2207749 ±  3%     -27.7%    1595213 ± 13%  perf-stat.ps.node-store-misses
      5395 ±  3%     -14.4%       4617 ± 11%  perf-stat.ps.page-faults
    586.00 ± 15%     +55.2%     909.25 ± 36%  interrupts.CPU14.CAL:Function_call_interrupts
    105.67 ± 19%    +407.3%     536.00 ±130%  interrupts.CPU19.RES:Rescheduling_interrupts
     88.00 ± 14%     -44.3%      49.00 ± 35%  interrupts.CPU25.RES:Rescheduling_interrupts
     88.33 ± 10%     -39.2%      53.75 ± 22%  interrupts.CPU28.RES:Rescheduling_interrupts
     98.67 ± 41%     -63.0%      36.50 ± 18%  interrupts.CPU31.RES:Rescheduling_interrupts
      1583 ±  8%     -25.2%       1184 ± 18%  interrupts.CPU33.CAL:Function_call_interrupts
     91.67 ± 12%     -59.9%      36.75 ± 20%  interrupts.CPU33.RES:Rescheduling_interrupts
     86.00 ± 16%     -52.9%      40.50 ± 32%  interrupts.CPU34.RES:Rescheduling_interrupts
     94.00 ± 13%     -50.3%      46.75 ± 34%  interrupts.CPU35.RES:Rescheduling_interrupts
     78.33 ± 22%     -46.4%      42.00 ± 38%  interrupts.CPU39.RES:Rescheduling_interrupts
     86.00 ± 15%     -59.0%      35.25 ± 47%  interrupts.CPU40.RES:Rescheduling_interrupts
      1510 ±  8%     -33.1%       1010 ± 26%  interrupts.CPU41.CAL:Function_call_interrupts
     91.00 ± 28%     -52.7%      43.00 ± 23%  interrupts.CPU44.RES:Rescheduling_interrupts
     81.00 ± 23%     -47.5%      42.50 ± 36%  interrupts.CPU74.RES:Rescheduling_interrupts
     75.67 ± 30%     -51.8%      36.50 ± 20%  interrupts.CPU76.RES:Rescheduling_interrupts
     86.33 ± 22%     -57.1%      37.00 ± 23%  interrupts.CPU77.RES:Rescheduling_interrupts
     83.67 ± 17%     -50.7%      41.25 ± 29%  interrupts.CPU78.RES:Rescheduling_interrupts
     92.67 ± 18%     -52.8%      43.75 ± 34%  interrupts.CPU79.RES:Rescheduling_interrupts
     89.33 ± 12%     -43.2%      50.75 ± 27%  interrupts.CPU82.RES:Rescheduling_interrupts
     98.00 ± 22%     -52.6%      46.50 ± 19%  interrupts.CPU83.RES:Rescheduling_interrupts
     97.33 ± 26%     -59.7%      39.25 ± 63%  interrupts.CPU84.RES:Rescheduling_interrupts
    320.00 ±104%     -87.1%      41.25 ± 21%  interrupts.CPU86.RES:Rescheduling_interrupts
     88.33 ± 22%     -45.1%      48.50 ± 43%  interrupts.CPU88.RES:Rescheduling_interrupts
     92.67 ± 15%     -56.8%      40.00 ± 44%  interrupts.CPU89.RES:Rescheduling_interrupts
     88.67 ± 15%     -57.1%      38.00 ± 33%  interrupts.CPU90.RES:Rescheduling_interrupts
     93.67 ± 17%     -54.6%      42.50 ± 35%  interrupts.CPU95.RES:Rescheduling_interrupts


                                                                                
                                fsmark.files_per_sec                            
                                                                                
  25000 +-+-----------------------------------------------------------------+   
        |                                                                   |   
        |                                                                   |   
  20000 O-O O O O O   O O O   O O O O O O O O O   O O O O O                 |   
        |.+.       .+.   .+.    +.       .+.+.+. .+. .+.+   +.+.+.+.+       |   
        |   +   +.+   +.+   +   : +.+   +       +   +   :   :       :   +.+.|   
  15000 +-+ :   :           :   :   :   :               :   :       :   :   |   
        |   :   :           :  :    :   :                : :         :  :   |   
  10000 +-+  : :             : :     : :                 : :         : :    |   
        |    : :             : :     : :                 : :         : :    |   
        |    : :             : :     : :                 : :         : :    |   
   5000 +-+  : :             : :     : :                 : :         : :    |   
        |     :               :       :                   :           :     |   
        |     :               :       :                   :           :     |   
      0 +-+---------O-------O-------------------O---------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample

***************************************************************************************************
lkp-bdw-ep3c: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 64G memory
=========================================================================================
compiler/cpufreq_governor/disk/filenum/fs/iomode/kconfig/nr_threads/period/rootfs/rwmode/size/tbox_group/testcase/ucode:
  gcc-7/performance/1HDD/1024f/xfs/sync/x86_64-rhel-7.2/100%/600s/debian-x86_64-2018-04-03.cgz/rndwr/64G/lkp-bdw-ep3c/fileio/0xb00002e

commit: 
  dabcefab45 ("nvme: provide optimized poll function for separate poll queues")
  344e9ffcbd ("block: add queue_is_mq() helper")

dabcefab45d36ecb 344e9ffcbd1898e1dc04085564a 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      1.82 ±  2%     -25.5%       1.35        fileio.request_latency_avg_ms
    146.73            +7.3%     157.47        fileio.requests_per_sec
     11.91 ±  2%      -2.4        9.46 ±  6%  fileio.thread_events_stddev%
    515696 ±  3%      +9.4%     564189 ±  4%  fileio.time.involuntary_context_switches
     23.50 ±  2%    +128.7%      53.75 ±  2%  fileio.time.percent_of_cpu_this_job_got
    142.59 ±  3%    +123.8%     319.15 ±  5%  fileio.time.system_time
      1092 ±  4%      -8.3%       1002 ±  3%  meminfo.Dirty
    272.25 ±  4%      -8.5%     249.00 ±  3%  proc-vmstat.nr_dirty
 2.508e+10 ±  7%     +11.3%  2.792e+10 ±  3%  cpuidle.C3.time
 1.291e+08 ±171%     -99.0%    1315279 ±  7%  cpuidle.POLL.time
  18566208 ±169%     -98.2%     326061 ±  8%  cpuidle.POLL.usage
     69.75 ± 14%     +25.4%      87.50 ±  5%  turbostat.Avg_MHz
      4.05 ±  9%      +1.1        5.11 ±  3%  turbostat.Busy%
     46.69 ±  7%      +6.5       53.23 ±  6%  turbostat.C3%
     40.10 ±  5%     -20.3%      31.96        iostat.cpu.idle
     59.55 ±  3%     +13.1%      67.37        iostat.cpu.iowait
      1495            +2.7%       1536        iostat.sda.w/s
      2660            +6.1%       2822        iostat.sda.wkB/s
     40.00 ±  5%      -8.2       31.85        mpstat.cpu.idle%
     59.65 ±  3%      +7.8       67.48        mpstat.cpu.iowait%
      0.29 ±  5%      +0.3        0.61 ±  3%  mpstat.cpu.sys%
      0.02 ± 11%      +0.0        0.03 ±  7%  mpstat.cpu.usr%
     39.75 ±  5%     -21.4%      31.25        vmstat.cpu.id
     59.25 ±  3%     +12.7%      66.75        vmstat.cpu.wa
      2640            +6.1%       2801        vmstat.io.bo
      7595            +5.8%       8037        vmstat.system.cs
 3.002e+08 ±  8%     +21.2%  3.638e+08 ±  3%  perf-stat.cache-misses
 3.865e+12 ± 16%     +26.9%  4.903e+12        perf-stat.cpu-cycles
     88656 ±  8%     +68.8%     149686 ±  9%  perf-stat.cpu-migrations
 1.493e+08 ±  8%     +19.4%  1.782e+08 ±  2%  perf-stat.node-load-misses
     44.19 ±  5%      +7.2       51.37 ±  4%  perf-stat.node-store-miss-rate%
  39430465 ± 11%     +48.0%   58375753 ±  8%  perf-stat.node-store-misses
  49645857 ±  7%     +10.9%   55057560 ±  2%  perf-stat.node-stores
     59410 ±  5%     +17.5%      69831 ±  6%  softirqs.CPU10.RCU
     57650 ±  2%     +13.0%      65170 ±  5%  softirqs.CPU13.RCU
     58102 ± 17%     +24.0%      72018 ±  6%  softirqs.CPU14.RCU
     42002 ±  6%      +6.2%      44625 ±  5%  softirqs.CPU16.RCU
     56696           +16.7%      66176 ±  7%  softirqs.CPU2.RCU
     61052 ±  5%     +16.4%      71041 ±  6%  softirqs.CPU3.RCU
     47619 ± 13%     +15.1%      54819 ±  5%  softirqs.CPU39.RCU
     57484 ±  7%     +17.0%      67267 ±  9%  softirqs.CPU46.RCU
     61275 ±  4%     +15.3%      70680 ±  6%  softirqs.CPU47.RCU
     61345 ±  9%     +14.2%      70038 ±  9%  softirqs.CPU53.RCU
     57258 ±  8%     +11.6%      63882 ±  9%  softirqs.CPU57.RCU
     69231 ±  6%     -17.5%      57085 ±  2%  softirqs.CPU65.SCHED
     56920 ±  6%     +12.6%      64075 ±  7%  softirqs.CPU7.RCU
     58671 ± 12%     +18.0%      69255 ±  4%  softirqs.CPU86.RCU
      1121 ±  6%     +84.8%       2073 ±  2%  sched_debug.cfs_rq:/.exec_clock.avg
    572.89 ± 31%    +168.5%       1538 ±  8%  sched_debug.cfs_rq:/.exec_clock.min
     36052 ±126%     -77.6%       8067 ±  9%  sched_debug.cfs_rq:/.load.avg
   2659824 ±146%     -86.4%     361376        sched_debug.cfs_rq:/.load.max
    298229 ±139%     -83.4%      49377 ±  4%  sched_debug.cfs_rq:/.load.stddev
      7682 ±  7%     +27.0%       9759 ± 11%  sched_debug.cfs_rq:/.min_vruntime.avg
      2265 ± 36%     +52.3%       3449 ± 30%  sched_debug.cfs_rq:/.min_vruntime.min
      5.65 ±  8%     -19.8%       4.53 ±  8%  sched_debug.cfs_rq:/.runnable_load_avg.avg
    391.08 ±  4%     -10.2%     351.28        sched_debug.cfs_rq:/.runnable_load_avg.max
     44.46 ±  4%     -14.4%      38.06 ±  3%  sched_debug.cfs_rq:/.runnable_load_avg.stddev
     36043 ±126%     -77.9%       7981 ± 10%  sched_debug.cfs_rq:/.runnable_weight.avg
   2659107 ±146%     -86.5%     359609 ±  2%  sched_debug.cfs_rq:/.runnable_weight.max
    298182 ±139%     -83.5%      49072 ±  5%  sched_debug.cfs_rq:/.runnable_weight.stddev
    387.93 ±  4%      -9.4%     351.32        sched_debug.cpu.cpu_load[0].max
     84205 ± 51%     -90.3%       8137 ± 11%  sched_debug.cpu.load.avg
   6983828 ± 54%     -94.5%     384493 ± 11%  sched_debug.cpu.load.max
    751762 ± 53%     -93.2%      51250 ± 10%  sched_debug.cpu.load.stddev
     13045 ± 30%     +48.5%      19371 ±  9%  sched_debug.cpu.nr_switches.min
     30.30 ± 20%    +298.9%     120.86 ± 17%  sched_debug.cpu.nr_uninterruptible.max
    -49.51         +2144.0%      -1110        sched_debug.cpu.nr_uninterruptible.min
     10.55 ± 15%   +1050.5%     121.42 ± 15%  sched_debug.cpu.nr_uninterruptible.stddev
    126088 ± 11%     -19.4%     101651 ±  7%  sched_debug.cpu.sched_count.max
     12664 ± 31%     +49.8%      18977 ±  8%  sched_debug.cpu.sched_count.min
     14752 ± 13%     -27.6%      10686 ± 13%  sched_debug.cpu.sched_count.stddev
      3236 ± 17%     +39.3%       4509 ±  7%  sched_debug.cpu.ttwu_count.min
      2314 ± 20%     +69.2%       3916 ±  3%  sched_debug.cpu.ttwu_local.min
      4.25 ± 30%      -2.6        1.60 ± 30%  perf-profile.calltrace.cycles-pp.__hrtimer_get_next_event.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state
      1.76 ± 30%      -1.2        0.54 ± 62%  perf-profile.calltrace.cycles-pp.__hrtimer_next_event_base.__hrtimer_get_next_event.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
      0.00            +1.0        0.97 ±  9%  perf-profile.calltrace.cycles-pp.submit_bio.submit_bio_wait.blkdev_issue_flush.xfs_file_fsync.do_fsync
      0.00            +1.0        0.97 ±  9%  perf-profile.calltrace.cycles-pp.generic_make_request.submit_bio.submit_bio_wait.blkdev_issue_flush.xfs_file_fsync
      0.00            +1.0        0.97 ±  9%  perf-profile.calltrace.cycles-pp.blk_mq_make_request.generic_make_request.submit_bio.submit_bio_wait.blkdev_issue_flush
      0.00            +1.0        0.98 ±  9%  perf-profile.calltrace.cycles-pp.blkdev_issue_flush.xfs_file_fsync.do_fsync.__x64_sys_fsync.do_syscall_64
      0.00            +1.0        0.98 ±  9%  perf-profile.calltrace.cycles-pp.submit_bio_wait.blkdev_issue_flush.xfs_file_fsync.do_fsync.__x64_sys_fsync
      0.00            +1.0        1.04 ± 10%  perf-profile.calltrace.cycles-pp.__x64_sys_fsync.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +1.0        1.04 ± 10%  perf-profile.calltrace.cycles-pp.do_fsync.__x64_sys_fsync.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +1.0        1.04 ± 10%  perf-profile.calltrace.cycles-pp.xfs_file_fsync.do_fsync.__x64_sys_fsync.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +1.1        1.06 ± 10%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.00            +1.1        1.06 ± 10%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
     35.75 ± 20%     +17.2       52.96 ±  6%  perf-profile.calltrace.cycles-pp.ktime_get.clockevents_program_event.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
     51.24 ± 44%     +18.7       69.89 ±  5%  perf-profile.calltrace.cycles-pp.clockevents_program_event.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state
     42.59 ± 25%     +26.0       68.58 ±  8%  perf-profile.calltrace.cycles-pp.read_tsc.ktime_get.clockevents_program_event.hrtimer_interrupt.smp_apic_timer_interrupt
      5.60 ± 46%      -2.7        2.86 ± 13%  perf-profile.children.cycles-pp.__hrtimer_next_event_base
      4.34 ± 31%      -2.6        1.70 ± 29%  perf-profile.children.cycles-pp.__hrtimer_get_next_event
      0.49 ± 62%      -0.3        0.20 ± 14%  perf-profile.children.cycles-pp.rcu_nmi_enter
      0.15 ± 12%      -0.0        0.11 ± 15%  perf-profile.children.cycles-pp.__set_pte_vaddr
      0.23 ±  8%      -0.0        0.20 ±  7%  perf-profile.children.cycles-pp.set_pte_vaddr
      0.01 ±173%      +0.1        0.08 ± 33%  perf-profile.children.cycles-pp.__sched_text_start
      0.01 ±173%      +0.1        0.08 ± 33%  perf-profile.children.cycles-pp.finish_task_switch
      0.32 ± 12%      +0.1        0.47 ± 13%  perf-profile.children.cycles-pp.rcu_nmi_exit
      0.95 ±  9%      +0.2        1.11 ±  5%  perf-profile.children.cycles-pp.native_apic_msr_write
      0.13 ± 36%      +0.3        0.47 ± 14%  perf-profile.children.cycles-pp.blk_insert_flush
      0.10 ± 39%      +0.4        0.51 ± 10%  perf-profile.children.cycles-pp.kblockd_mod_delayed_work_on
      0.10 ± 39%      +0.4        0.51 ± 10%  perf-profile.children.cycles-pp.mod_delayed_work_on
      0.10 ± 39%      +0.4        0.52 ± 11%  perf-profile.children.cycles-pp.blk_mq_run_hw_queue
      0.08 ± 67%      +0.4        0.49 ±  9%  perf-profile.children.cycles-pp.try_to_grab_pending
      0.24 ± 32%      +0.7        0.98 ±  9%  perf-profile.children.cycles-pp.blkdev_issue_flush
      0.24 ± 32%      +0.7        0.98 ±  9%  perf-profile.children.cycles-pp.submit_bio_wait
      0.24 ± 34%      +0.8        0.99 ±  9%  perf-profile.children.cycles-pp.submit_bio
      0.24 ± 34%      +0.8        0.99 ±  9%  perf-profile.children.cycles-pp.generic_make_request
      0.24 ± 35%      +0.8        0.99 ± 10%  perf-profile.children.cycles-pp.blk_mq_make_request
      0.26 ± 31%      +0.8        1.04 ± 10%  perf-profile.children.cycles-pp.__x64_sys_fsync
      0.26 ± 31%      +0.8        1.04 ± 10%  perf-profile.children.cycles-pp.do_fsync
      0.26 ± 31%      +0.8        1.04 ± 10%  perf-profile.children.cycles-pp.xfs_file_fsync
      0.30 ± 30%      +0.8        1.11 ± 10%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.30 ± 30%      +0.8        1.11 ± 10%  perf-profile.children.cycles-pp.do_syscall_64
     27.17 ± 17%     +13.6       40.79 ±  7%  perf-profile.children.cycles-pp.read_tsc
     36.11 ± 11%     +14.6       50.68 ±  6%  perf-profile.children.cycles-pp.ktime_get
      0.94 ± 47%      -0.4        0.50 ± 10%  perf-profile.self.cycles-pp.__hrtimer_next_event_base
      0.49 ± 62%      -0.3        0.20 ± 14%  perf-profile.self.cycles-pp.rcu_nmi_enter
      0.40 ± 33%      -0.2        0.17 ± 23%  perf-profile.self.cycles-pp.__hrtimer_get_next_event
      0.41 ± 57%      -0.2        0.18 ± 40%  perf-profile.self.cycles-pp.__hrtimer_run_queues
      0.15 ± 12%      -0.0        0.11 ± 15%  perf-profile.self.cycles-pp.__set_pte_vaddr
      0.23 ±  8%      -0.0        0.20 ±  7%  perf-profile.self.cycles-pp.set_pte_vaddr
      0.32 ± 12%      +0.1        0.47 ± 13%  perf-profile.self.cycles-pp.rcu_nmi_exit
      0.95 ±  9%      +0.2        1.11 ±  5%  perf-profile.self.cycles-pp.native_apic_msr_write
      5.95 ± 22%      +2.0        7.97 ±  7%  perf-profile.self.cycles-pp.read_tsc
    577915            +5.0%     606855 ±  2%  interrupts.CAL:Function_call_interrupts
    241.50 ± 34%    +146.9%     596.25 ± 16%  interrupts.CPU0.TLB:TLB_shootdowns
    258.50 ± 45%    +149.3%     644.50 ± 15%  interrupts.CPU1.TLB:TLB_shootdowns
    217.75 ± 39%    +158.2%     562.25 ±  9%  interrupts.CPU10.TLB:TLB_shootdowns
    444.25 ± 15%     +21.8%     541.00 ± 14%  interrupts.CPU11.RES:Rescheduling_interrupts
    221.50 ± 35%    +237.2%     747.00 ± 19%  interrupts.CPU11.TLB:TLB_shootdowns
    250.75 ± 36%    +121.1%     554.50 ± 11%  interrupts.CPU12.TLB:TLB_shootdowns
    211.00 ± 36%    +177.1%     584.75 ± 11%  interrupts.CPU13.TLB:TLB_shootdowns
    229.75 ± 28%    +153.6%     582.75 ± 16%  interrupts.CPU14.TLB:TLB_shootdowns
    406.75 ±  6%     +67.9%     683.00 ± 39%  interrupts.CPU15.RES:Rescheduling_interrupts
    257.50 ± 25%    +144.4%     629.25 ± 26%  interrupts.CPU15.TLB:TLB_shootdowns
    251.75 ± 25%    +133.3%     587.25 ± 14%  interrupts.CPU16.TLB:TLB_shootdowns
    259.75 ± 17%    +131.5%     601.25 ±  9%  interrupts.CPU17.TLB:TLB_shootdowns
    235.00 ± 19%    +186.1%     672.25 ± 10%  interrupts.CPU18.TLB:TLB_shootdowns
    284.75 ± 35%    +103.9%     580.50 ± 18%  interrupts.CPU19.TLB:TLB_shootdowns
    251.50 ± 24%    +157.3%     647.00 ± 27%  interrupts.CPU2.TLB:TLB_shootdowns
    212.25 ± 31%    +189.3%     614.00 ±  8%  interrupts.CPU20.TLB:TLB_shootdowns
    164.50 ± 37%    +259.6%     591.50 ± 23%  interrupts.CPU22.TLB:TLB_shootdowns
      6456 ± 11%     +16.6%       7528 ±  3%  interrupts.CPU23.CAL:Function_call_interrupts
    174.25 ± 35%    +236.0%     585.50 ± 15%  interrupts.CPU23.TLB:TLB_shootdowns
    182.50 ± 58%    +218.2%     580.75 ± 15%  interrupts.CPU24.TLB:TLB_shootdowns
    193.50 ± 10%    +170.4%     523.25 ± 19%  interrupts.CPU25.TLB:TLB_shootdowns
    115.00 ± 29%    +279.3%     436.25 ± 16%  interrupts.CPU26.TLB:TLB_shootdowns
      3137 ±  6%    +135.7%       7395 ± 21%  interrupts.CPU27.NMI:Non-maskable_interrupts
      3137 ±  6%    +135.7%       7395 ± 21%  interrupts.CPU27.PMI:Performance_monitoring_interrupts
    128.25 ± 32%    +353.4%     581.50 ± 23%  interrupts.CPU27.TLB:TLB_shootdowns
    114.25 ± 21%    +308.3%     466.50 ± 18%  interrupts.CPU28.TLB:TLB_shootdowns
    153.50 ± 38%    +226.1%     500.50 ± 22%  interrupts.CPU29.TLB:TLB_shootdowns
    487.50 ± 11%     +38.1%     673.00 ±  7%  interrupts.CPU3.RES:Rescheduling_interrupts
    269.25 ± 42%    +123.5%     601.75 ± 14%  interrupts.CPU3.TLB:TLB_shootdowns
    179.75 ± 47%    +223.9%     582.25 ± 33%  interrupts.CPU30.TLB:TLB_shootdowns
    246.00 ± 37%    +111.0%     519.00 ± 18%  interrupts.CPU31.TLB:TLB_shootdowns
      6956            +6.2%       7388 ±  3%  interrupts.CPU32.CAL:Function_call_interrupts
    179.00 ± 28%    +165.4%     475.00 ± 21%  interrupts.CPU32.TLB:TLB_shootdowns
    283.50 ± 12%     +93.4%     548.25 ± 28%  interrupts.CPU33.TLB:TLB_shootdowns
    153.50 ± 17%    +210.9%     477.25 ± 20%  interrupts.CPU35.TLB:TLB_shootdowns
      3439 ± 44%     +52.1%       5230 ± 25%  interrupts.CPU36.NMI:Non-maskable_interrupts
      3439 ± 44%     +52.1%       5230 ± 25%  interrupts.CPU36.PMI:Performance_monitoring_interrupts
    276.75 ± 14%     +31.9%     365.00 ±  7%  interrupts.CPU36.RES:Rescheduling_interrupts
    190.50 ± 36%    +151.7%     479.50 ± 24%  interrupts.CPU37.TLB:TLB_shootdowns
    268.25 ±  6%     +33.6%     358.50 ± 12%  interrupts.CPU38.RES:Rescheduling_interrupts
    153.75 ± 50%    +278.4%     581.75 ± 28%  interrupts.CPU38.TLB:TLB_shootdowns
    155.25 ± 43%    +256.0%     552.75 ± 14%  interrupts.CPU39.TLB:TLB_shootdowns
    527.75 ± 12%     +25.7%     663.50 ± 17%  interrupts.CPU4.RES:Rescheduling_interrupts
    165.50 ± 39%    +201.1%     498.25 ± 16%  interrupts.CPU40.TLB:TLB_shootdowns
    183.00 ± 61%    +194.1%     538.25 ± 22%  interrupts.CPU41.TLB:TLB_shootdowns
    266.75 ± 33%     +70.1%     453.75 ±  6%  interrupts.CPU42.TLB:TLB_shootdowns
    416.25 ± 13%     +24.7%     519.25 ±  7%  interrupts.CPU43.RES:Rescheduling_interrupts
    130.00 ± 20%    +386.7%     632.75 ± 18%  interrupts.CPU43.TLB:TLB_shootdowns
    499.00 ± 12%     +45.5%     726.25 ± 24%  interrupts.CPU44.RES:Rescheduling_interrupts
    284.75 ± 23%    +137.3%     675.75 ± 26%  interrupts.CPU44.TLB:TLB_shootdowns
    448.50 ±  8%     +25.5%     562.75 ±  5%  interrupts.CPU45.RES:Rescheduling_interrupts
    276.00 ± 39%    +148.3%     685.25 ± 35%  interrupts.CPU45.TLB:TLB_shootdowns
    357.00 ± 49%     +94.5%     694.25 ± 17%  interrupts.CPU46.TLB:TLB_shootdowns
    352.00 ± 21%    +116.4%     761.75 ± 18%  interrupts.CPU47.TLB:TLB_shootdowns
    288.50 ± 40%    +143.0%     701.00 ± 16%  interrupts.CPU48.TLB:TLB_shootdowns
    402.50 ± 11%     +29.9%     523.00 ± 12%  interrupts.CPU49.RES:Rescheduling_interrupts
    244.50 ± 20%    +130.7%     564.00 ± 13%  interrupts.CPU49.TLB:TLB_shootdowns
    195.50 ± 43%    +210.5%     607.00 ± 15%  interrupts.CPU5.TLB:TLB_shootdowns
    266.00 ± 33%    +173.6%     727.75 ± 11%  interrupts.CPU50.TLB:TLB_shootdowns
    434.50 ±  9%     +20.3%     522.75 ±  9%  interrupts.CPU51.RES:Rescheduling_interrupts
    241.25 ± 49%    +208.1%     743.25 ± 25%  interrupts.CPU51.TLB:TLB_shootdowns
    230.50 ± 39%    +205.4%     704.00 ± 23%  interrupts.CPU52.TLB:TLB_shootdowns
    228.50 ± 36%    +236.8%     769.50 ± 19%  interrupts.CPU53.TLB:TLB_shootdowns
    245.75 ± 23%    +131.6%     569.25 ± 27%  interrupts.CPU54.TLB:TLB_shootdowns
    294.75 ± 23%    +157.8%     760.00 ± 22%  interrupts.CPU55.TLB:TLB_shootdowns
    265.00 ± 21%    +160.3%     689.75 ± 15%  interrupts.CPU56.TLB:TLB_shootdowns
    399.25 ± 13%     +27.8%     510.25 ± 10%  interrupts.CPU57.RES:Rescheduling_interrupts
    235.25 ± 34%    +179.6%     657.75 ± 12%  interrupts.CPU57.TLB:TLB_shootdowns
    428.00 ±  8%     +23.8%     529.75 ±  7%  interrupts.CPU58.RES:Rescheduling_interrupts
    334.00 ± 31%     +79.6%     600.00 ± 16%  interrupts.CPU58.TLB:TLB_shootdowns
    381.75 ± 12%     +37.3%     524.25 ± 10%  interrupts.CPU59.RES:Rescheduling_interrupts
    258.75 ± 19%    +164.6%     684.75 ± 16%  interrupts.CPU59.TLB:TLB_shootdowns
    262.75 ± 28%    +147.7%     650.75 ± 24%  interrupts.CPU6.TLB:TLB_shootdowns
    261.25 ± 30%    +167.1%     697.75 ± 21%  interrupts.CPU60.TLB:TLB_shootdowns
    243.50 ± 35%    +179.7%     681.00 ±  8%  interrupts.CPU61.TLB:TLB_shootdowns
    421.25 ±  5%     +20.9%     509.50 ±  3%  interrupts.CPU62.RES:Rescheduling_interrupts
    229.00 ± 25%    +207.6%     704.50 ± 23%  interrupts.CPU62.TLB:TLB_shootdowns
    246.50 ± 46%    +172.4%     671.50 ±  8%  interrupts.CPU63.TLB:TLB_shootdowns
    236.00 ± 28%    +182.0%     665.50 ± 17%  interrupts.CPU64.TLB:TLB_shootdowns
    317.25 ± 25%    +109.1%     663.50 ± 12%  interrupts.CPU65.TLB:TLB_shootdowns
    203.50 ± 30%    +154.9%     518.75 ± 42%  interrupts.CPU66.TLB:TLB_shootdowns
    133.50 ± 53%    +234.3%     446.25 ± 30%  interrupts.CPU68.TLB:TLB_shootdowns
    182.25 ± 17%    +202.5%     551.25 ± 18%  interrupts.CPU69.TLB:TLB_shootdowns
    290.50 ± 20%    +138.0%     691.25 ± 21%  interrupts.CPU7.TLB:TLB_shootdowns
    175.25 ± 34%    +137.1%     415.50 ± 33%  interrupts.CPU70.TLB:TLB_shootdowns
    203.50 ± 30%    +133.4%     475.00 ±  6%  interrupts.CPU71.TLB:TLB_shootdowns
    126.25 ± 67%    +303.4%     509.25 ± 24%  interrupts.CPU72.TLB:TLB_shootdowns
    165.00 ± 16%    +186.8%     473.25 ± 27%  interrupts.CPU73.TLB:TLB_shootdowns
    177.00 ± 51%    +232.9%     589.25 ± 20%  interrupts.CPU74.TLB:TLB_shootdowns
    188.75 ± 47%    +155.2%     481.75 ± 25%  interrupts.CPU75.TLB:TLB_shootdowns
    181.75 ± 42%    +278.3%     687.50 ± 18%  interrupts.CPU76.TLB:TLB_shootdowns
    218.50 ± 18%    +146.3%     538.25 ± 10%  interrupts.CPU77.TLB:TLB_shootdowns
    186.50 ± 57%    +201.5%     562.25 ± 20%  interrupts.CPU78.TLB:TLB_shootdowns
    232.00 ± 26%    +163.0%     610.25 ± 27%  interrupts.CPU8.TLB:TLB_shootdowns
    181.75 ± 46%    +143.6%     442.75 ± 37%  interrupts.CPU80.TLB:TLB_shootdowns
    183.75 ± 46%    +177.6%     510.00 ± 28%  interrupts.CPU81.TLB:TLB_shootdowns
    131.75 ± 49%    +302.3%     530.00 ± 25%  interrupts.CPU82.TLB:TLB_shootdowns
    239.50 ± 25%     +85.2%     443.50 ± 23%  interrupts.CPU83.TLB:TLB_shootdowns
    145.75 ± 37%    +315.3%     605.25 ± 18%  interrupts.CPU84.TLB:TLB_shootdowns
    136.50 ± 25%    +313.6%     564.50 ± 24%  interrupts.CPU85.TLB:TLB_shootdowns
    267.00 ± 16%     +42.2%     379.75 ±  8%  interrupts.CPU86.RES:Rescheduling_interrupts
    255.25 ± 23%     +82.2%     465.00 ± 33%  interrupts.CPU86.TLB:TLB_shootdowns
    259.00 ± 15%     +46.5%     379.50 ±  3%  interrupts.CPU87.RES:Rescheduling_interrupts
    189.50 ± 32%    +209.2%     586.00 ± 12%  interrupts.CPU87.TLB:TLB_shootdowns
    443.25           +30.9%     580.00 ± 13%  interrupts.CPU9.RES:Rescheduling_interrupts
    251.25 ± 21%    +108.2%     523.00 ± 26%  interrupts.CPU9.TLB:TLB_shootdowns
     19144 ± 22%    +165.4%      50805 ± 15%  interrupts.TLB:TLB_shootdowns
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-4.20.0-rc1-00216-g344e9ffcbd189" of type "text/plain" (186385 bytes)

View attachment "job-script" of type "text/plain" (8158 bytes)

View attachment "job.yaml" of type "text/plain" (5804 bytes)

View attachment "reproduce" of type "text/plain" (1057 bytes)
