Message-ID: <20171204030240.GX21779@yexl-desktop>
Date:   Mon, 4 Dec 2017 11:02:40 +0800
From:   kernel test robot <xiaolong.ye@...el.com>
To:     Andy Lutomirski <luto@...nel.org>
Cc:     Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Borislav Petkov <bpetkov@...e.de>,
        Brian Gerst <brgerst@...il.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Denys Vlasenko <dvlasenk@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Rik van Riel <riel@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Stephen Rothwell <sfr@...b.auug.org.au>, lkp@...org
Subject: [lkp-robot] [x86/entry/64]  63e02a2a32:
 will-it-scale.per_process_ops -13.0% regression


Greetings,

FYI, we noticed a -13.0% regression in will-it-scale.per_process_ops caused by commit:


commit: 63e02a2a3292d8815eac7be438c8c73d72a7bb93 ("x86/entry/64: Create a per-CPU SYSCALL entry trampoline")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

in testcase: will-it-scale
on test machine: 32 threads Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz with 64G memory
with the following parameters:

	test: poll1
	cpufreq_governor: performance

test-description: Will It Scale runs a testcase as 1 through n parallel copies to see whether the testcase scales. It builds both a process-based and a thread-based version of each test to expose any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale

In addition, the commit also has a significant impact on the following tests:

+------------------+---------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops -7.0% regression       |
| test machine     | 32 threads Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz with 64G memory |
| test parameters  | cpufreq_governor=performance                                        |
|                  | test=writeseek1                                                     |
+------------------+---------------------------------------------------------------------+
| testcase: change | aim9: aim9.brk_test.ops_per_sec -9.9% regression                    |
| test machine     | 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 4G memory    |
| test parameters  | cpufreq_governor=performance                                        |
|                  | test=brk_test                                                       |
|                  | testtime=300s                                                       |
+------------------+---------------------------------------------------------------------+


Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached to this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-sb03/poll1/will-it-scale
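
The two lines above pair up position by position: the first lists the parameter names and the second lists their values, both separated by "/". A minimal Python sketch of that pairing follows. The function name parse_job_axes is hypothetical and not part of lkp-tests, and the sketch assumes no individual value contains a "/" of its own.

```python
def parse_job_axes(keys_line: str, values_line: str) -> dict:
    """Pair the '/'-separated parameter names with their '/'-separated values.

    Hypothetical helper, not an lkp-tests API. Assumes no value contains '/'.
    """
    keys = keys_line.strip().rstrip(":").split("/")
    values = values_line.strip().split("/")
    if len(keys) != len(values):
        raise ValueError(f"{len(keys)} names but {len(values)} values")
    return dict(zip(keys, values))


# Both strings are copied from the report header above.
axes = parse_job_axes(
    "compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:",
    "  gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz"
    "/lkp-sb03/poll1/will-it-scale",
)

for name, value in axes.items():
    print(f"{name:18s} {value}")
```

Read this way, the run used compiler gcc-7, the performance governor, test poll1 of testcase will-it-scale, on the lkp-sb03 machine group.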

commit: 
  955cef1517 ("x86/entry/64: Return to userspace from the trampoline stack")
  63e02a2a32 ("x86/entry/64: Create a per-CPU SYSCALL entry trampoline")

955cef1517a1be93 63e02a2a3292d8815eac7be438 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   7435674           -13.0%    6465918        will-it-scale.per_process_ops
   5868564           -10.4%    5256868        will-it-scale.per_thread_ops
      0.56            +8.0%       0.61 ±  2%  will-it-scale.scalability
      1947            -2.0%       1908        will-it-scale.time.system_time
    562.79            +6.9%     601.69        will-it-scale.time.user_time
      8.06            +0.8        8.86 ±  3%  mpstat.cpu.usr%
      4969 ± 83%     -84.5%     769.00 ±  6%  numa-meminfo.node1.Inactive(anon)
    116.75 ± 63%     +90.1%     222.00 ±  9%  numa-vmstat.node0.nr_mlock
    116.75 ± 63%     +90.1%     222.00 ±  9%  numa-vmstat.node0.nr_unevictable
    116.75 ± 63%     +90.1%     222.00 ±  9%  numa-vmstat.node0.nr_zone_unevictable
      1242 ± 83%     -84.6%     191.25 ±  6%  numa-vmstat.node1.nr_inactive_anon
      1242 ± 83%     -84.6%     191.25 ±  6%  numa-vmstat.node1.nr_zone_inactive_anon
   1414780            +7.7%    1524182 ±  3%  sched_debug.cfs_rq:/.min_vruntime.max
    144.71 ± 12%     +17.8%     170.42 ±  2%  sched_debug.cfs_rq:/.runnable_load_avg.max
   -568616           -29.5%    -400842        sched_debug.cfs_rq:/.spread0.min
    202980 ± 13%     +56.8%     318219 ±  6%  sched_debug.cpu.avg_idle.min
    173545 ±  3%     -13.9%     149414 ±  5%  sched_debug.cpu.avg_idle.stddev
 2.906e+12            -7.9%  2.676e+12        perf-stat.branch-instructions
      0.01 ±  2%      +2.0        2.00        perf-stat.branch-miss-rate%
 2.405e+08        +22170.9%  5.356e+10        perf-stat.branch-misses
      1.15           +11.6%       1.28        perf-stat.cpi
 3.659e+12            -9.3%  3.318e+12        perf-stat.dTLB-loads
      0.00 ±  6%      +0.0        0.00 ±  3%  perf-stat.dTLB-store-miss-rate%
 2.869e+12            -8.8%  2.616e+12        perf-stat.dTLB-stores
 1.406e+13            -9.7%   1.27e+13        perf-stat.instructions
      0.87           -10.4%       0.78        perf-stat.ipc
     13.72 ±  2%     -13.7        0.00        perf-profile.calltrace.cycles.entry_SYSCALL_64
     24.53 ±  2%      -0.2       24.30 ±  3%  perf-profile.calltrace.cycles.copy_user_generic_string._copy_from_user.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
     12.15 ±  3%      -0.2       11.98 ±  3%  perf-profile.calltrace.cycles.__fget_light.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
      9.57 ±  3%      -0.1        9.48 ±  4%  perf-profile.calltrace.cycles.__fget.__fget_light.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
      5.79 ±  6%      -0.0        5.75 ±  3%  perf-profile.calltrace.cycles.fput.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
     32.25 ±  2%      +1.5       33.78 ±  3%  perf-profile.calltrace.cycles._copy_from_user.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
      3.99 ±  5%      +1.6        5.56 ±  3%  perf-profile.calltrace.cycles.__might_fault._copy_from_user.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
     65.36 ±  2%      +2.0       67.34 ±  2%  perf-profile.calltrace.cycles.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
     68.87 ±  2%      +3.1       72.01 ±  2%  perf-profile.calltrace.cycles.sys_poll.entry_SYSCALL_64_fastpath
      7.33 ± 35%      +3.7       11.05 ± 23%  perf-profile.calltrace.cycles.poll_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
     71.48 ±  2%      +3.9       75.41 ±  2%  perf-profile.calltrace.cycles.entry_SYSCALL_64_fastpath
      9.50 ± 25%      +4.0       13.49 ± 19%  perf-profile.calltrace.cycles.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     10.06 ± 23%      +4.0       14.05 ± 18%  perf-profile.calltrace.cycles.secondary_startup_64
      9.66 ± 24%      +4.0       13.66 ± 19%  perf-profile.calltrace.cycles.cpu_startup_entry.start_secondary.secondary_startup_64
      9.66 ± 24%      +4.0       13.66 ± 19%  perf-profile.calltrace.cycles.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
      9.66 ± 24%      +4.0       13.66 ± 19%  perf-profile.calltrace.cycles.start_secondary.secondary_startup_64
      2.25 ±  3%      +5.4        7.67 ±  3%  perf-profile.calltrace.cycles.entry_SYSCALL_64_after_hwframe
     13.72 ±  2%     -13.7        0.00        perf-profile.children.cycles.entry_SYSCALL_64
     24.53 ±  2%      -0.2       24.31 ±  3%  perf-profile.children.cycles.copy_user_generic_string
     12.16 ±  3%      -0.2       11.99 ±  3%  perf-profile.children.cycles.__fget_light
      9.57 ±  3%      -0.1        9.48 ±  4%  perf-profile.children.cycles.__fget
      5.79 ±  6%      -0.0        5.75 ±  3%  perf-profile.children.cycles.fput
     32.25 ±  2%      +1.5       33.78 ±  3%  perf-profile.children.cycles._copy_from_user
      3.99 ±  5%      +1.6        5.56 ±  3%  perf-profile.children.cycles.__might_fault
     65.36 ±  2%      +2.0       67.34 ±  2%  perf-profile.children.cycles.do_sys_poll
     68.87 ±  2%      +3.1       72.01 ±  2%  perf-profile.children.cycles.sys_poll
      7.42 ± 34%      +3.7       11.14 ± 22%  perf-profile.children.cycles.poll_idle
     71.61 ±  2%      +3.9       75.50 ±  2%  perf-profile.children.cycles.entry_SYSCALL_64_fastpath
      9.88 ± 23%      +4.0       13.87 ± 19%  perf-profile.children.cycles.cpuidle_enter_state
     10.06 ± 23%      +4.0       14.05 ± 18%  perf-profile.children.cycles.secondary_startup_64
     10.06 ± 23%      +4.0       14.05 ± 18%  perf-profile.children.cycles.cpu_startup_entry
      9.66 ± 24%      +4.0       13.66 ± 19%  perf-profile.children.cycles.start_secondary
     10.06 ± 23%      +4.0       14.05 ± 18%  perf-profile.children.cycles.do_idle
      2.25 ±  3%      +5.4        7.67 ±  3%  perf-profile.children.cycles.entry_SYSCALL_64_after_hwframe
     13.72 ±  2%     -13.7        0.00        perf-profile.self.cycles.entry_SYSCALL_64
     24.21 ±  2%      -0.3       23.93 ±  2%  perf-profile.self.cycles.copy_user_generic_string
      9.47 ±  3%      -0.1        9.41 ±  4%  perf-profile.self.cycles.__fget
      5.69 ±  5%      +0.0        5.71 ±  3%  perf-profile.self.cycles.fput
     13.55 ±  4%      +0.7       14.24        perf-profile.self.cycles.do_sys_poll
      7.41 ± 34%      +3.7       11.07 ± 22%  perf-profile.self.cycles.poll_idle
      2.25 ±  3%      +5.4        7.67 ±  3%  perf-profile.self.cycles.entry_SYSCALL_64_after_hwframe
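
The headline figures in the table above can be re-derived from its own columns. The sketch below is a sanity check in Python, not the robot's actual code. It assumes that branch-miss-rate% is branch-misses divided by branch-instructions (which is consistent with the numbers shown) and uses the definition that CPI is the reciprocal of IPC. The inputs are the rounded values printed in the table, so the results are approximate.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# Throughput rows, copied from the poll1 table (parent commit vs. 63e02a2a32).
per_process = pct_change(7435674, 6465918)
per_thread = pct_change(5868564, 5256868)

# Assumed definition: miss rate = branch-misses / branch-instructions, in percent.
miss_rate_base = 2.405e08 / 2.906e12 * 100.0
miss_rate_new = 5.356e10 / 2.676e12 * 100.0

# CPI (cycles per instruction) is the reciprocal of IPC.
cpi_base = 1 / 0.87
cpi_new = 1 / 0.78

print(f"per_process_ops change: {per_process:+.1f}%")
print(f"per_thread_ops change:  {per_thread:+.1f}%")
print(f"branch-miss-rate:       {miss_rate_base:.2f}% -> {miss_rate_new:.2f}%")
print(f"cpi from ipc:           {cpi_base:.2f} -> {cpi_new:.2f}")
```

The results round to the reported -13.0% and -10.4%, a miss rate going from 0.01% to 2.00%, and a CPI going from 1.15 to 1.28. Note that for rate metrics such as branch-miss-rate%, the change column (+2.0) is an absolute difference in percentage points rather than a relative change.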


                                                                                
                             will-it-scale.per_process_ops                      
                                                                                
  7.8e+06 +-+---------------------------------------------------------------+   
          |. .+.++                              .++.                        |   
  7.6e+06 +-+     :                         .+.+    +.+.+.+        +.+      |   
          |       :                     .+.+               +      +   +     |   
  7.4e+06 +-+      +.+.+.+.++.+.+.+.+.++                    ++.+.+     ++.+.|   
          |                                                                 |   
  7.2e+06 +-+                                                               |   
          |                                                                 |   
    7e+06 +-+                                                               |   
          |                                                                 |   
  6.8e+06 +-+                                                               |   
          |                                                                 |   
  6.6e+06 O-+ O OO                    OO O O                                |   
          | O      O   O O OO O O O O            OO O O O O O               |   
  6.4e+06 +-+--------O-----------------------O-O-------------O--------------+   
                                                                                
                                                                                                                                                                
                                perf-stat.instructions                          
                                                                                
   1.5e+13 +-+--------------------------------------------------------------+   
           |                                                                |   
  1.45e+13 +-+  +.+                                               .+.       |   
           | +.+   +              +.+.+.+.    .+.+.+. +.   .+.++.+   +.     |   
           |        +.            :       +.++       +  +.+            ++.+.|   
   1.4e+13 +-+        +.++.+.+.+.+                                          |   
           |                                                                |   
  1.35e+13 +-+                                                              |   
           |                                                                |   
   1.3e+13 +-+                                                              |   
           O   OO O   O OO   O          O   O         O   O                 |   
           | O      O      O   O OO O O   O      O O O  O   O O             |   
  1.25e+13 +-+                               O O                            |   
           |                                                                |   
   1.2e+13 +-+--------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                             perf-stat.branch-instructions                      
                                                                                
  3.05e+12 +-+--------------------------------------------------------------+   
     3e+12 +-+                                                     +        |   
           |.+.++.+   +                     ++    .+.+ .+.        + +     + |   
  2.95e+12 +-+     + + + +.+. .+.   +   +. +  + .+    +   +   +  +   +   : +|   
   2.9e+12 +-+      +   +    +   + + + +  +    +           + + :+     +  :  |   
           |                      +   +                     +  +       ++   |   
  2.85e+12 +-+                                                              |   
   2.8e+12 +-+                                                              |   
  2.75e+12 +-+                                                              |   
           |    O                                                           |   
   2.7e+12 +-+    O   O                                 O   O O             |   
  2.65e+12 O-+ O    O   O    O   O  O   O   O  O O O O                      |   
           | O           O O   O  O   O   O           O   O                 |   
   2.6e+12 +-+                               O                              |   
  2.55e+12 +-+--------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                               perf-stat.branch-misses                          
                                                                                
  6e+10 +-+-----------------------------------------------------------------+   
        |     O O  O                            O      O   O O              |   
  5e+10 O-O O    O   O O O O O O OO O O O O O O   OO O   O                  |   
        |                                                                   |   
        |                                                                   |   
  4e+10 +-+                                                                 |   
        |                                                                   |   
  3e+10 +-+                                                                 |   
        |                                                                   |   
  2e+10 +-+                                                                 |   
        |                                                                   |   
        |                                                                   |   
  1e+10 +-+                                                                 |   
        |                                                                   |   
      0 +-+-----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                 perf-stat.dTLB-stores                          
                                                                                
  3.2e+12 +-+---------------------------------------------------------------+   
          |                         +  +                     +   +          |   
  3.1e+12 +-+                      + + :                     :+ +:          |   
          |                       +   + :                   +  +  :         |   
    3e+12 +-+                     :     :                   :     :         |   
          |.                     :      :                  :       :      + |   
  2.9e+12 +-+.+.++.              :       :     +.+   .+.   :       +. .+ : +|   
          |        +.+. .+.++.+.:        +.   +   :.+   +.:          +  ::  |   
  2.8e+12 +-+          +        +          +.+    +       +             +   |   
          |                                                                 |   
  2.7e+12 +-+                                                               |   
          O     OO   O      O                         O O                   |   
  2.6e+12 +-O O    O   O O O           O   O     OO O     O OO              |   
          |                   O O O O O  O   O O                            |   
  2.5e+12 +-+---------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                            perf-stat.branch-miss-rate%                         
                                                                                
  2.5 +-+-------------------------------------------------------------------+   
      |                                                                     |   
      |                                                                     |   
    2 O-O O O O O O O O OO O O O O O O O O O O O O O O O O OO               |   
      |                                                                     |   
      |                                                                     |   
  1.5 +-+                                                                   |   
      |                                                                     |   
    1 +-+                                                                   |   
      |                                                                     |   
      |                                                                     |   
  0.5 +-+                                                                   |   
      |                                                                     |   
      |                                                                     |   
    0 +-+-------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                   perf-stat.ipc                                
                                                                                
  0.92 +-+------------------------------------------------------------------+   
       |                                                                    |   
   0.9 +-+.+.                   +. .+.        .+.          +. .+.           |   
  0.88 +-+   +.                +  +   +.   +.+   +. .+.   +  +   + .+.      |   
       |       +.   +.       .+         +.+        +   +.+        +   +. .+.|   
  0.86 +-+       +.+  +.+.+.+                                           +   |   
       |                                                                    |   
  0.84 +-+                                                                  |   
       |                                                                    |   
  0.82 +-+                                                                  |   
   0.8 +-+                          O O O O                                 |   
       |           O  O                              O O                    |   
  0.78 +-O O O O O  O   O O   O   O            O O         O O              |   
       O                    O   O          O O     O     O                  |   
  0.76 +-+------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                   perf-stat.cpi                                
                                                                                
   1.3 +-+---------------------------------O-O------------------------------+   
  1.28 O-+          O       O O O O                O     O O O              |   
       | O O O O O O  O O O                    O O     O                    |   
  1.26 +-+                                           O                      |   
  1.24 +-+                          O O O O                                 |   
       |                                                                    |   
  1.22 +-+                                                                  |   
   1.2 +-+                                                                  |   
  1.18 +-+                                                                  |   
       |                                                                    |   
  1.16 +-+      .+.+ .+.+.+.+.           .+                            .+.  |   
  1.14 +-+    .+    +         +        .+  +.     .+. .+.+        +. .+   +.|   
       |.+. .+                 + .+. .+      +. .+   +    + .+. .+  +       |   
  1.12 +-+ +                    +   +          +           +   +            |   
   1.1 +-+------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                           will-it-scale.time.user_time                         
                                                                                
  620 +-+-------------------------------------------------------------------+   
  610 +-+       O O                                                         |   
      O O O O O     O O OO           O   O O O       O      O               |   
  600 +-+                  O O O O O   O       O O O   O O O                |   
  590 +-+                                                                   |   
      |                                                                     |   
  580 +-+                                                                   |   
  570 +-+                                                                   |   
  560 +-+                                                             +.+.+.|   
      |                                                               :     |   
  550 +-+.+.+.+.                        .+        .+.+.              :      |   
  540 +-+       +.+.                   +  +   .+.+     +.+        +. :      |   
      |             +.+.++.+.+.       +    +.+            +      +  +       |   
  530 +-+                      +.+.+.+                     ++.+.+           |   
  520 +-+-------------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample

***************************************************************************************************
lkp-sb03: 32 threads Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz with 64G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-sb03/writeseek1/will-it-scale

commit: 
  955cef1517 ("x86/entry/64: Return to userspace from the trampoline stack")
  63e02a2a32 ("x86/entry/64: Create a per-CPU SYSCALL entry trampoline")

955cef1517a1be93 63e02a2a3292d8815eac7be438 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   1902014            -7.0%    1768039        will-it-scale.per_process_ops
   1557647            -6.3%    1459046        will-it-scale.per_thread_ops
      0.52            +4.0%       0.54        will-it-scale.scalability
      2293            -1.8%       2251        will-it-scale.time.system_time
    216.11           +19.7%     258.70        will-it-scale.time.user_time
 1.453e+08 ±  6%     +21.7%  1.769e+08 ±  9%  cpuidle.POLL.time
      3.43            +0.8        4.26        mpstat.cpu.usr%
    284863 ±  6%     +12.9%     321561 ±  3%  softirqs.RCU
      7178 ±  6%     -11.3%       6368        slabinfo.kmalloc-96.active_objs
      7218 ±  5%     -10.6%       6450        slabinfo.kmalloc-96.num_objs
     72.27 ±  6%     +19.5%      86.39 ±  7%  sched_debug.cfs_rq:/.load_avg.avg
    107.67 ±  3%     +31.1%     141.11 ± 19%  sched_debug.cfs_rq:/.load_avg.stddev
     50035 ± 23%     +17.3%      58672 ± 24%  sched_debug.cpu.load.stddev
      7.58 ± 21%     +65.4%      12.54 ± 11%  sched_debug.cpu.nr_uninterruptible.max
 3.143e+12            -4.7%  2.995e+12        perf-stat.branch-instructions
      0.01 ±  2%      +1.0        0.97        perf-stat.branch-miss-rate%
 3.791e+08 ±  3%   +7525.5%  2.891e+10        perf-stat.branch-misses
  2.54e+08            +1.0%  2.566e+08        perf-stat.cache-misses
      1.03            +6.3%       1.10        perf-stat.cpi
 6.671e+12            -4.7%  6.361e+12        perf-stat.dTLB-loads
 4.722e+12            -5.0%  4.485e+12        perf-stat.dTLB-stores
     35.63 ± 12%     -29.7        5.89 ± 20%  perf-stat.iTLB-load-miss-rate%
 8.119e+08 ±  8%    +829.8%  7.549e+09 ±  2%  perf-stat.iTLB-loads
 1.563e+13            -5.3%   1.48e+13        perf-stat.instructions
      0.97            -5.9%       0.91        perf-stat.ipc
      5.97            -6.0        0.00        perf-profile.calltrace.cycles.entry_SYSCALL_64
      7.43 ±  2%      -0.1        7.29 ±  3%  perf-profile.calltrace.cycles.find_lock_entry.shmem_getpage_gfp.shmem_write_begin.generic_perform_write.__generic_file_write_iter
      9.10 ±  2%      -0.1        9.00 ±  3%  perf-profile.calltrace.cycles.shmem_getpage_gfp.shmem_write_begin.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
      9.43 ±  2%      -0.1        9.33 ±  3%  perf-profile.calltrace.cycles.shmem_write_begin.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.__vfs_write
     19.45            -0.1       19.39 ±  2%  perf-profile.calltrace.cycles.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
     19.14            -0.0       19.10        perf-profile.calltrace.cycles.copy_user_generic_string.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter
     21.14            +0.0       21.15 ±  2%  perf-profile.calltrace.cycles.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.__vfs_write
      9.16 ± 10%      +0.0        9.20 ± 41%  perf-profile.calltrace.cycles.poll_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
     41.59            +0.1       41.71 ±  2%  perf-profile.calltrace.cycles.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.__vfs_write.vfs_write
     11.09 ±  8%      +0.2       11.24 ± 31%  perf-profile.calltrace.cycles.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     11.21 ±  8%      +0.2       11.37 ± 31%  perf-profile.calltrace.cycles.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     11.21 ±  8%      +0.2       11.37 ± 31%  perf-profile.calltrace.cycles.cpu_startup_entry.start_secondary.secondary_startup_64
     11.21 ±  8%      +0.2       11.37 ± 31%  perf-profile.calltrace.cycles.start_secondary.secondary_startup_64
     11.68 ±  7%      +0.2       11.90 ± 27%  perf-profile.calltrace.cycles.secondary_startup_64
     45.10            +0.3       45.37 ±  2%  perf-profile.calltrace.cycles.__generic_file_write_iter.generic_file_write_iter.__vfs_write.vfs_write.sys_write
     51.69            +0.3       52.02 ±  2%  perf-profile.calltrace.cycles.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
     50.28            +0.4       50.63 ±  2%  perf-profile.calltrace.cycles.generic_file_write_iter.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
     61.80            +0.8       62.60 ±  3%  perf-profile.calltrace.cycles.vfs_write.sys_write.entry_SYSCALL_64_fastpath
      4.92            +0.9        5.80 ±  5%  perf-profile.calltrace.cycles.__fdget_pos.sys_lseek.entry_SYSCALL_64_fastpath
      4.96            +0.9        5.86 ±  3%  perf-profile.calltrace.cycles.__fdget_pos.sys_write.entry_SYSCALL_64_fastpath
      8.74            +1.0        9.75 ±  6%  perf-profile.calltrace.cycles.sys_lseek.entry_SYSCALL_64_fastpath
     69.88            +1.6       71.49 ±  3%  perf-profile.calltrace.cycles.sys_write.entry_SYSCALL_64_fastpath
     80.00            +2.9       82.90 ±  3%  perf-profile.calltrace.cycles.entry_SYSCALL_64_fastpath
      5.97            -6.0        0.00        perf-profile.children.cycles.entry_SYSCALL_64
      7.43 ±  2%      -0.1        7.29 ±  3%  perf-profile.children.cycles.find_lock_entry
      9.10 ±  2%      -0.1        9.00 ±  3%  perf-profile.children.cycles.shmem_getpage_gfp
      9.43 ±  2%      -0.1        9.33 ±  3%  perf-profile.children.cycles.shmem_write_begin
     19.45            -0.1       19.39 ±  2%  perf-profile.children.cycles.copyin
     19.14            -0.0       19.11        perf-profile.children.cycles.copy_user_generic_string
     21.14            +0.0       21.15 ±  2%  perf-profile.children.cycles.iov_iter_copy_from_user_atomic
      9.46 ±  9%      +0.1        9.56 ± 36%  perf-profile.children.cycles.poll_idle
     41.60            +0.1       41.72 ±  2%  perf-profile.children.cycles.generic_perform_write
     11.21 ±  8%      +0.2       11.37 ± 31%  perf-profile.children.cycles.start_secondary
     11.56 ±  7%      +0.2       11.76 ± 27%  perf-profile.children.cycles.cpuidle_enter_state
     11.69 ±  7%      +0.2       11.90 ± 27%  perf-profile.children.cycles.do_idle
     11.68 ±  7%      +0.2       11.90 ± 27%  perf-profile.children.cycles.secondary_startup_64
     11.68 ±  7%      +0.2       11.90 ± 27%  perf-profile.children.cycles.cpu_startup_entry
     45.10            +0.3       45.37 ±  2%  perf-profile.children.cycles.__generic_file_write_iter
     51.72            +0.3       52.03 ±  2%  perf-profile.children.cycles.__vfs_write
     50.28            +0.4       50.63 ±  2%  perf-profile.children.cycles.generic_file_write_iter
     61.84            +0.8       62.62 ±  3%  perf-profile.children.cycles.vfs_write
      8.74            +1.0        9.75 ±  6%  perf-profile.children.cycles.sys_lseek
      3.81            +1.6        5.38 ±  5%  perf-profile.children.cycles.__fget_light
     69.93            +1.6       71.50 ±  3%  perf-profile.children.cycles.sys_write
      9.88            +1.8       11.67 ±  3%  perf-profile.children.cycles.__fdget_pos
     80.23            +2.7       82.94 ±  3%  perf-profile.children.cycles.entry_SYSCALL_64_fastpath
      5.97            -6.0        0.00        perf-profile.self.cycles.entry_SYSCALL_64
     18.93            -0.1       18.84 ±  2%  perf-profile.self.cycles.copy_user_generic_string
      9.39 ±  8%      +0.0        9.42 ± 35%  perf-profile.self.cycles.poll_idle



***************************************************************************************************
lkp-ivb-d03: 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 4G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
  gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-ivb-d03/brk_test/aim9/300s

commit: 
  955cef1517 ("x86/entry/64: Return to userspace from the trampoline stack")
  63e02a2a32 ("x86/entry/64: Create a per-CPU SYSCALL entry trampoline")

955cef1517a1be93 63e02a2a3292d8815eac7be438 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   4124214            -9.9%    3717599        aim9.brk_test.ops_per_sec
    272.29            -4.9%     259.03        aim9.time.system_time
     27.71           +47.2%      40.78        aim9.time.user_time
     12605 ±  9%     -27.0%       9203 ± 10%  cpuidle.POLL.usage
      3.24 ±  2%      +1.4        4.62        mpstat.cpu.usr%
      4007 ±  3%      -9.2%       3639 ±  4%  slabinfo.anon_vma_chain.num_objs
      9.80            -1.9%       9.61        turbostat.CorWatt
     30309            -1.3%      29929        vmstat.system.cs
     18905            -1.1%      18689        vmstat.system.in
    716.67 ± 11%     -22.7%     554.33 ±  6%  sched_debug.cfs_rq:/.load_avg.avg
      1.00 ± 11%     -79.2%       0.21 ±173%  sched_debug.cfs_rq:/.nr_spread_over.min
      0.45 ± 55%     +70.3%       0.76 ± 19%  sched_debug.cfs_rq:/.nr_spread_over.stddev
    521.82 ±  3%     -10.2%     468.57 ±  2%  sched_debug.cfs_rq:/.util_avg.avg
      1.96 ±  7%     +34.0%       2.62 ±  9%  sched_debug.cpu.nr_running.max
      0.68 ± 15%     +42.9%       0.98 ± 15%  sched_debug.cpu.nr_running.stddev
      0.06 ± 19%      +0.9        0.92        perf-stat.branch-miss-rate%
 3.583e+08 ±  5%   +1125.0%  4.389e+09 ± 28%  perf-stat.branch-misses
   9163065            -1.8%    8997254        perf-stat.context-switches
      0.56 ±  2%     +12.8%       0.63 ±  4%  perf-stat.cpi
      0.06 ±132%      +0.2        0.23 ±  6%  perf-stat.dTLB-load-miss-rate%
 4.062e+08 ±142%    +234.1%  1.357e+09 ±  8%  perf-stat.dTLB-load-misses
   9061724 ± 12%     +22.0%   11056158 ±  6%  perf-stat.dTLB-store-misses
     11.72 ± 24%      -6.6        5.08 ± 33%  perf-stat.iTLB-load-miss-rate%
   4.4e+08 ± 29%    +135.5%  1.036e+09 ± 23%  perf-stat.iTLB-loads
      1.80 ±  2%     -11.2%       1.60 ±  3%  perf-stat.ipc
     14.11 ± 88%      -2.6       11.50 ± 86%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     14.22 ± 88%      -2.6       11.63 ± 85%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     14.22 ± 88%      -2.6       11.63 ± 85%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
     14.22 ± 88%      -2.6       11.63 ± 85%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
     12.86 ± 92%      -2.4       10.45 ± 97%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
     45.20 ±  3%      -1.4       43.82        perf-profile.calltrace.cycles-pp.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
     16.60 ±  3%      -0.9       15.74 ±  3%  perf-profile.calltrace.cycles-pp.vma_merge.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
     56.05 ±  2%      -0.8       55.25        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_fastpath
     14.60 ±  3%      -0.7       13.88 ±  2%  perf-profile.calltrace.cycles-pp.__vma_adjust.vma_merge.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
     54.84 ±  3%      -0.7       54.15        perf-profile.calltrace.cycles-pp.sys_brk.entry_SYSCALL_64_fastpath
     11.52 ±  9%      -0.1       11.46        perf-profile.calltrace.cycles-pp.perf_event_mmap.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
      6.30 ±  5%      +0.2        6.48 ±  3%  perf-profile.calltrace.cycles-pp.security_vm_enough_memory_mm.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
     27.40 ±  3%      +0.8       28.18 ±  4%  perf-profile.calltrace.cycles-pp.secondary_startup_64
     12.40 ± 94%      +3.3       15.73 ± 62%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_kernel
     13.18 ± 88%      +3.4       16.55 ± 57%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_kernel.secondary_startup_64
     13.18 ± 88%      +3.4       16.55 ± 57%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_kernel.secondary_startup_64
     13.18 ± 88%      +3.4       16.55 ± 57%  perf-profile.calltrace.cycles-pp.start_kernel.secondary_startup_64
     13.14 ± 88%      +3.4       16.53 ± 57%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_kernel.secondary_startup_64
     14.22 ± 88%      -2.6       11.63 ± 85%  perf-profile.children.cycles-pp.start_secondary
     45.83 ±  3%      -1.2       44.59        perf-profile.children.cycles-pp.do_brk_flags
     56.30 ±  2%      -0.9       55.36        perf-profile.children.cycles-pp.entry_SYSCALL_64_fastpath
     17.05 ±  3%      -0.8       16.24 ±  3%  perf-profile.children.cycles-pp.vma_merge
     15.45 ±  3%      -0.7       14.79 ±  2%  perf-profile.children.cycles-pp.__vma_adjust
     55.47 ±  3%      -0.6       54.88        perf-profile.children.cycles-pp.sys_brk
     12.21 ±  8%      -0.1       12.08        perf-profile.children.cycles-pp.perf_event_mmap
      6.40 ±  5%      +0.2        6.57 ±  3%  perf-profile.children.cycles-pp.security_vm_enough_memory_mm
     27.41 ±  3%      +0.8       28.19 ±  4%  perf-profile.children.cycles-pp.do_idle
     27.30 ±  3%      +0.8       28.07 ±  4%  perf-profile.children.cycles-pp.cpuidle_enter_state
     27.40 ±  3%      +0.8       28.18 ±  4%  perf-profile.children.cycles-pp.secondary_startup_64
     27.40 ±  3%      +0.8       28.18 ±  4%  perf-profile.children.cycles-pp.cpu_startup_entry
     25.27            +0.9       26.19        perf-profile.children.cycles-pp.intel_idle
     13.18 ± 88%      +3.4       16.55 ± 57%  perf-profile.children.cycles-pp.start_kernel
      4.82 ±  9%      +0.0        4.83 ±  5%  perf-profile.self.cycles-pp.__vma_adjust
      5.25 ±  9%      +0.0        5.29 ±  2%  perf-profile.self.cycles-pp.perf_event_mmap
      5.33 ±  3%      +0.4        5.75 ±  3%  perf-profile.self.cycles-pp.do_brk_flags
     25.26            +0.9       26.19        perf-profile.self.cycles-pp.intel_idle
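
For readers checking the tables by hand: the %change column seems to use two conventions. This is inferred from the printed numbers, not taken from the lkp sources. Plain counters and rates are shown as a relative change in percent against the parent commit. Metrics that are already percentages (mpstat.cpu.usr%, the perf-stat miss rates, the perf-profile rows) are shown as an absolute difference in percentage points. The sketch below reproduces a few rows of the aim9 table under that assumption. The helper names are hypothetical and are not part of any lkp tool.

```python
# Sketch only: recompute the "%change" column from the two printed means.
# Assumption: counters use relative change, percentage metrics use the
# difference in percentage points. Helper names are made up for illustration.

def relative_change(base, new):
    """Relative change in percent, parent commit (base) vs. tested commit (new)."""
    return (new - base) / base * 100.0


def point_change(base, new):
    """Absolute difference in percentage points for metrics that are already a percentage."""
    return new - base


# (metric, value on 955cef1517, value on 63e02a2a32), copied from the aim9 table
counter_rows = [
    ("aim9.brk_test.ops_per_sec", 4124214, 3717599),
    ("aim9.time.system_time", 272.29, 259.03),
    ("aim9.time.user_time", 27.71, 40.78),
]
percent_rows = [
    ("mpstat.cpu.usr%", 3.24, 4.62),
    ("perf-stat.branch-miss-rate%", 0.06, 0.92),
]

for name, base, new in counter_rows:
    print(f"{name:32s} {relative_change(base, new):+7.1f}%")
for name, base, new in percent_rows:
    print(f"{name:32s} {point_change(base, new):+7.1f}")
```

Rows such as perf-stat.cpi and perf-stat.ipc do not reproduce exactly from the rounded values printed here, presumably because the report computes the change from unrounded means.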



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Xiaolong

View attachment "config-4.14.0-01234-g63e02a2" of type "text/plain" (163502 bytes)

View attachment "job.yaml" of type "text/plain" (4755 bytes)

View attachment "reproduce" of type "text/plain" (327 bytes)
