Date:   Tue, 15 Jan 2019 11:24:44 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        Ulf Hansson <ulf.hansson@...aro.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>, lkp@...org
Subject: [LKP] [PM]  8234f6734c:  will-it-scale.per_process_ops -3.6%
 regression

Greetings,

FYI, we noticed a -3.6% regression of will-it-scale.per_process_ops due to commit:


commit: 8234f6734c5d74ac794e5517437f51c57d65f865 ("PM-runtime: Switch autosuspend over to using hrtimers")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: will-it-scale
on test machine: 104-thread Skylake with 192G memory
with the following parameters:

	nr_task: 100%
	mode: process
	test: poll2
	cpufreq_governor: performance

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based variant of each test in order to expose any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale



Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.2/process/100%/debian-x86_64-2018-04-03.cgz/lkp-skl-fpga01/poll2/will-it-scale

commit: 
  v4.20-rc7
  8234f6734c ("PM-runtime: Switch autosuspend over to using hrtimers")

       v4.20-rc7 8234f6734c5d74ac794e551743 
---------------- -------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
           :2           50%           1:4     dmesg.WARNING:at#for_ip_interrupt_entry/0x
         %stddev     %change         %stddev
             \          |                \  
    240408            -3.6%     231711        will-it-scale.per_process_ops
  25002520            -3.6%   24097991        will-it-scale.workload
    351914            -1.7%     345882        interrupts.CAL:Function_call_interrupts
      1.77 ± 45%      -1.1        0.64        mpstat.cpu.idle%
    106164 ± 24%     -23.2%      81494 ± 28%  numa-meminfo.node0.AnonHugePages
    326430 ±  8%     -11.3%     289513        softirqs.SCHED
      1294            -2.0%       1268        vmstat.system.cs
      3178           +48.4%       4716 ± 16%  slabinfo.eventpoll_pwq.active_objs
      3178           +48.4%       4716 ± 16%  slabinfo.eventpoll_pwq.num_objs
    336.32          -100.0%       0.00        uptime.boot
      3192          -100.0%       0.00        uptime.idle
 3.456e+08 ± 76%     -89.9%   34913819 ± 62%  cpuidle.C1E.time
    747832 ± 72%     -87.5%      93171 ± 45%  cpuidle.C1E.usage
     16209 ± 26%     -38.2%      10021 ± 44%  cpuidle.POLL.time
      6352 ± 32%     -39.5%       3843 ± 48%  cpuidle.POLL.usage
    885259 ±  2%     -13.8%     763434 ±  7%  numa-vmstat.node0.numa_hit
    865117 ±  2%     -13.9%     744992 ±  7%  numa-vmstat.node0.numa_local
    405085 ±  7%     +38.0%     558905 ±  9%  numa-vmstat.node1.numa_hit
    254056 ± 11%     +59.7%     405824 ± 13%  numa-vmstat.node1.numa_local
    738158 ± 73%     -88.5%      85078 ± 47%  turbostat.C1E
      1.07 ± 76%      -1.0        0.11 ± 62%  turbostat.C1E%
      1.58 ± 49%     -65.4%       0.55 ±  6%  turbostat.CPU%c1
      0.15 ± 13%     -35.0%       0.10 ± 38%  turbostat.CPU%c6
    153.97 ± 16%     -54.7       99.31        turbostat.PKG_%
     64141            +1.5%      65072        proc-vmstat.nr_anon_pages
     19541            -7.0%      18178 ±  8%  proc-vmstat.nr_shmem
     18296            +1.1%      18506        proc-vmstat.nr_slab_reclaimable
    713938            -2.3%     697489        proc-vmstat.numa_hit
    693688            -2.4%     677228        proc-vmstat.numa_local
    772220            -1.9%     757334        proc-vmstat.pgalloc_normal
    798565            -1.8%     784042        proc-vmstat.pgfault
    732336            -2.7%     712661        proc-vmstat.pgfree
     20.33 ±  4%      -7.0%      18.92        sched_debug.cfs_rq:/.runnable_load_avg.max
    160603           -44.5%      89108 ± 38%  sched_debug.cfs_rq:/.spread0.avg
    250694           -29.3%     177358 ± 18%  sched_debug.cfs_rq:/.spread0.max
      1109 ±  4%      -7.0%       1031        sched_debug.cfs_rq:/.util_avg.max
     20.33 ±  4%      -7.2%      18.88        sched_debug.cpu.cpu_load[0].max
    -10.00           +35.0%     -13.50        sched_debug.cpu.nr_uninterruptible.min
      3.56 ± 10%     +44.2%       5.14 ± 18%  sched_debug.cpu.nr_uninterruptible.stddev
     87.10 ± 24%     -34.0%      57.44 ± 37%  sched_debug.cpu.sched_goidle.avg
    239.48           -25.6%     178.07 ± 18%  sched_debug.cpu.sched_goidle.stddev
    332.67 ±  7%     -25.5%     247.83 ± 13%  sched_debug.cpu.ttwu_count.min
    231.67 ±  8%     -15.4%     195.96 ± 12%  sched_debug.cpu.ttwu_local.min
     95.47           -95.5        0.00        perf-profile.calltrace.cycles-pp.poll
     90.26           -90.3        0.00        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.poll
     90.08           -90.1        0.00        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.poll
     89.84           -89.8        0.00        perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.poll
     88.04           -88.0        0.00        perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe.poll
      2.66            -0.1        2.54        perf-profile.calltrace.cycles-pp._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.90            -0.1        1.81        perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string._copy_from_user.do_sys_poll.__x64_sys_poll.do_syscall_64
      2.56            +0.1        2.64        perf-profile.calltrace.cycles-pp.__fdget.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +2.3        2.29        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
      0.00            +2.3        2.34        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
     17.45            +3.8       21.24        perf-profile.calltrace.cycles-pp.__fget_light.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00           +92.7       92.66        perf-profile.calltrace.cycles-pp.do_sys_poll.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00           +94.5       94.51        perf-profile.calltrace.cycles-pp.__x64_sys_poll.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00           +94.8       94.75        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00           +94.9       94.92        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
     96.03           -96.0        0.00        perf-profile.children.cycles-pp.poll
     90.29           -90.3        0.00        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     90.11           -90.1        0.00        perf-profile.children.cycles-pp.do_syscall_64
     89.87           -89.9        0.00        perf-profile.children.cycles-pp.__x64_sys_poll
     89.39           -89.4        0.00        perf-profile.children.cycles-pp.do_sys_poll
     16.19           -16.2        0.00        perf-profile.children.cycles-pp.__fget_light
     68.59           -68.6        0.00        perf-profile.self.cycles-pp.do_sys_poll
     14.84           -14.8        0.00        perf-profile.self.cycles-pp.__fget_light
 1.759e+13          -100.0%       0.00        perf-stat.branch-instructions
      0.28            -0.3        0.00        perf-stat.branch-miss-rate%
 4.904e+10          -100.0%       0.00        perf-stat.branch-misses
      6.79 ±  3%      -6.8        0.00        perf-stat.cache-miss-rate%
 1.071e+08 ±  4%    -100.0%       0.00        perf-stat.cache-misses
 1.578e+09          -100.0%       0.00        perf-stat.cache-references
    385311 ±  2%    -100.0%       0.00        perf-stat.context-switches
      1.04          -100.0%       0.00        perf-stat.cpi
 8.643e+13          -100.0%       0.00        perf-stat.cpu-cycles
     13787          -100.0%       0.00        perf-stat.cpu-migrations
      0.00 ±  4%      -0.0        0.00        perf-stat.dTLB-load-miss-rate%
  23324811 ±  5%    -100.0%       0.00        perf-stat.dTLB-load-misses
 1.811e+13          -100.0%       0.00        perf-stat.dTLB-loads
      0.00            -0.0        0.00        perf-stat.dTLB-store-miss-rate%
   2478029          -100.0%       0.00        perf-stat.dTLB-store-misses
 8.775e+12          -100.0%       0.00        perf-stat.dTLB-stores
     99.66           -99.7        0.00        perf-stat.iTLB-load-miss-rate%
 7.527e+09          -100.0%       0.00        perf-stat.iTLB-load-misses
  25540468 ± 39%    -100.0%       0.00        perf-stat.iTLB-loads
  8.33e+13          -100.0%       0.00        perf-stat.instructions
     11066          -100.0%       0.00        perf-stat.instructions-per-iTLB-miss
      0.96          -100.0%       0.00        perf-stat.ipc
    777357          -100.0%       0.00        perf-stat.minor-faults
     81.69           -81.7        0.00        perf-stat.node-load-miss-rate%
  20040093          -100.0%       0.00        perf-stat.node-load-misses
   4491667 ±  7%    -100.0%       0.00        perf-stat.node-loads
     75.23 ± 10%     -75.2        0.00        perf-stat.node-store-miss-rate%
   3418662 ± 30%    -100.0%       0.00        perf-stat.node-store-misses
   1027183 ± 11%    -100.0%       0.00        perf-stat.node-stores
    777373          -100.0%       0.00        perf-stat.page-faults
   3331644          -100.0%       0.00        perf-stat.path-length


                                                                                
                            will-it-scale.per_process_ops                       
                                                                                
  242000 +-+----------------------------------------------------------------+   
         |                      +.+..   .+..+.      .+.+..+.+.+.    .+.+..  |   
  240000 +-+                   +     +.+      +.+..+            +..+      +.|   
  238000 +-+..+.+.  .+.   .+..+                                             |   
         |        +.   +.+                                                  |   
  236000 +-+                                                                |   
         |                                                                  |   
  234000 +-+                                                                |   
         |                                  O O O  O                        |   
  232000 +-+             O O  O O                      O  O O O O  O O O  O |   
  230000 +-+           O          O  O O O           O                      |   
         |           O                                                      |   
  228000 O-+    O O                                                         |   
         | O  O                                                             |   
  226000 +-+----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                will-it-scale.workload                          
                                                                                
  2.52e+07 +-+--------------------------------------------------------------+   
           |                     +..+.   .+..+.      .+. .+.+..+.   .+..+.  |   
   2.5e+07 +-+                  +     +.+      +.+.+.   +        +.+      +.|   
  2.48e+07 +-+.+..+. .+.    .+.+                                            |   
           |        +   +..+                                                |   
  2.46e+07 +-+                                                              |   
  2.44e+07 +-+                                                              |   
           |                                                                |   
  2.42e+07 +-+               O   O           O O O O        O        O      |   
   2.4e+07 +-+          O  O   O                        O O    O O O    O O |   
           |          O             O O O O           O                     |   
  2.38e+07 O-+    O                                                         |   
  2.36e+07 +-O O    O                                                       |   
           |                                                                |   
  2.34e+07 +-+--------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen

View attachment "config-4.20.0-rc7-00001-g8234f6734" of type "text/plain" (168504 bytes)

View attachment "job-script" of type "text/plain" (7129 bytes)

View attachment "job.yaml" of type "text/plain" (4682 bytes)

View attachment "reproduce" of type "text/plain" (310 bytes)
