Message-Id: <DCCDFE6A-5A11-45FF-83C0-C441AD645F48@amacapital.net>
Date:   Sun, 3 Dec 2017 19:59:56 -0800
From:   Andy Lutomirski <luto@...capital.net>
To:     kernel test robot <xiaolong.ye@...el.com>
Cc:     Andy Lutomirski <luto@...nel.org>, Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Borislav Petkov <bpetkov@...e.de>,
        Brian Gerst <brgerst@...il.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Denys Vlasenko <dvlasenk@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Rik van Riel <riel@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Stephen Rothwell <sfr@...b.auug.org.au>, lkp@...org
Subject: Re: [lkp-robot] [x86/entry/64]  63e02a2a32: will-it-scale.per_process_ops -13.0% regression

Thomas, has my fix for this landed?

--Andy

> On Dec 3, 2017, at 7:02 PM, kernel test robot <xiaolong.ye@...el.com> wrote:
> 
> 
> Greetings,
> 
> FYI, we noticed a -13.0% regression of will-it-scale.per_process_ops due to commit:
> 
> 
> commit: 63e02a2a3292d8815eac7be438c8c73d72a7bb93 ("x86/entry/64: Create a per-CPU SYSCALL entry trampoline")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> 
> in testcase: will-it-scale
> on test machine: 32 threads Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz with 64G memory
> with following parameters:
> 
>    test: poll1
>    cpufreq_governor: performance
> 
> test-description: Will It Scale takes a testcase and runs it from 1 through n parallel copies to see whether the testcase scales. It builds both a process-based and a thread-based version of each test so that differences between the two can be seen.
> test-url: https://github.com/antonblanchard/will-it-scale
> 
> In addition to that, the commit also has significant impact on the following tests:
> 
> +------------------+---------------------------------------------------------------------+
> | testcase: change | will-it-scale: will-it-scale.per_process_ops -7.0% regression       |
> | test machine     | 32 threads Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz with 64G memory |
> | test parameters  | cpufreq_governor=performance                                        |
> |                  | test=writeseek1                                                     |
> +------------------+---------------------------------------------------------------------+
> | testcase: change | aim9: aim9.brk_test.ops_per_sec -9.9% regression                    |
> | test machine     | 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 4G memory    |
> | test parameters  | cpufreq_governor=performance                                        |
> |                  | test=brk_test                                                       |
> |                  | testtime=300s                                                       |
> +------------------+---------------------------------------------------------------------+
> 
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> 
> To reproduce:
> 
>        git clone https://github.com/intel/lkp-tests.git
>        cd lkp-tests
>        bin/lkp install job.yaml  # job file is attached in this email
>        bin/lkp run     job.yaml
> 
> =========================================================================================
> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
>  gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-sb03/poll1/will-it-scale
> 
> commit: 
>  955cef1517 ("x86/entry/64: Return to userspace from the trampoline stack")
>  63e02a2a32 ("x86/entry/64: Create a per-CPU SYSCALL entry trampoline")
> 
> 955cef1517a1be93 63e02a2a3292d8815eac7be438 
> ---------------- -------------------------- 
>         %stddev     %change         %stddev
>             \          |                \  
>   7435674           -13.0%    6465918        will-it-scale.per_process_ops
>   5868564           -10.4%    5256868        will-it-scale.per_thread_ops
>      0.56            +8.0%       0.61 ±  2%  will-it-scale.scalability
>      1947            -2.0%       1908        will-it-scale.time.system_time
>    562.79            +6.9%     601.69        will-it-scale.time.user_time
>      8.06            +0.8        8.86 ±  3%  mpstat.cpu.usr%
>      4969 ± 83%     -84.5%     769.00 ±  6%  numa-meminfo.node1.Inactive(anon)
>    116.75 ± 63%     +90.1%     222.00 ±  9%  numa-vmstat.node0.nr_mlock
>    116.75 ± 63%     +90.1%     222.00 ±  9%  numa-vmstat.node0.nr_unevictable
>    116.75 ± 63%     +90.1%     222.00 ±  9%  numa-vmstat.node0.nr_zone_unevictable
>      1242 ± 83%     -84.6%     191.25 ±  6%  numa-vmstat.node1.nr_inactive_anon
>      1242 ± 83%     -84.6%     191.25 ±  6%  numa-vmstat.node1.nr_zone_inactive_anon
>   1414780            +7.7%    1524182 ±  3%  sched_debug.cfs_rq:/.min_vruntime.max
>    144.71 ± 12%     +17.8%     170.42 ±  2%  sched_debug.cfs_rq:/.runnable_load_avg.max
>   -568616           -29.5%    -400842        sched_debug.cfs_rq:/.spread0.min
>    202980 ± 13%     +56.8%     318219 ±  6%  sched_debug.cpu.avg_idle.min
>    173545 ±  3%     -13.9%     149414 ±  5%  sched_debug.cpu.avg_idle.stddev
> 2.906e+12            -7.9%  2.676e+12        perf-stat.branch-instructions
>      0.01 ±  2%      +2.0        2.00        perf-stat.branch-miss-rate%
> 2.405e+08        +22170.9%  5.356e+10        perf-stat.branch-misses
>      1.15           +11.6%       1.28        perf-stat.cpi
> 3.659e+12            -9.3%  3.318e+12        perf-stat.dTLB-loads
>      0.00 ±  6%      +0.0        0.00 ±  3%  perf-stat.dTLB-store-miss-rate%
> 2.869e+12            -8.8%  2.616e+12        perf-stat.dTLB-stores
> 1.406e+13            -9.7%   1.27e+13        perf-stat.instructions
>      0.87           -10.4%       0.78        perf-stat.ipc
>     13.72 ±  2%     -13.7        0.00        perf-profile.calltrace.cycles.entry_SYSCALL_64
>     24.53 ±  2%      -0.2       24.30 ±  3%  perf-profile.calltrace.cycles.copy_user_generic_string._copy_from_user.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
>     12.15 ±  3%      -0.2       11.98 ±  3%  perf-profile.calltrace.cycles.__fget_light.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
>      9.57 ±  3%      -0.1        9.48 ±  4%  perf-profile.calltrace.cycles.__fget.__fget_light.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
>      5.79 ±  6%      -0.0        5.75 ±  3%  perf-profile.calltrace.cycles.fput.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
>     32.25 ±  2%      +1.5       33.78 ±  3%  perf-profile.calltrace.cycles._copy_from_user.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
>      3.99 ±  5%      +1.6        5.56 ±  3%  perf-profile.calltrace.cycles.__might_fault._copy_from_user.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
>     65.36 ±  2%      +2.0       67.34 ±  2%  perf-profile.calltrace.cycles.do_sys_poll.sys_poll.entry_SYSCALL_64_fastpath
>     68.87 ±  2%      +3.1       72.01 ±  2%  perf-profile.calltrace.cycles.sys_poll.entry_SYSCALL_64_fastpath
>      7.33 ± 35%      +3.7       11.05 ± 23%  perf-profile.calltrace.cycles.poll_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
>     71.48 ±  2%      +3.9       75.41 ±  2%  perf-profile.calltrace.cycles.entry_SYSCALL_64_fastpath
>      9.50 ± 25%      +4.0       13.49 ± 19%  perf-profile.calltrace.cycles.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
>     10.06 ± 23%      +4.0       14.05 ± 18%  perf-profile.calltrace.cycles.secondary_startup_64
>      9.66 ± 24%      +4.0       13.66 ± 19%  perf-profile.calltrace.cycles.cpu_startup_entry.start_secondary.secondary_startup_64
>      9.66 ± 24%      +4.0       13.66 ± 19%  perf-profile.calltrace.cycles.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
>      9.66 ± 24%      +4.0       13.66 ± 19%  perf-profile.calltrace.cycles.start_secondary.secondary_startup_64
>      2.25 ±  3%      +5.4        7.67 ±  3%  perf-profile.calltrace.cycles.entry_SYSCALL_64_after_hwframe
>     13.72 ±  2%     -13.7        0.00        perf-profile.children.cycles.entry_SYSCALL_64
>     24.53 ±  2%      -0.2       24.31 ±  3%  perf-profile.children.cycles.copy_user_generic_string
>     12.16 ±  3%      -0.2       11.99 ±  3%  perf-profile.children.cycles.__fget_light
>      9.57 ±  3%      -0.1        9.48 ±  4%  perf-profile.children.cycles.__fget
>      5.79 ±  6%      -0.0        5.75 ±  3%  perf-profile.children.cycles.fput
>     32.25 ±  2%      +1.5       33.78 ±  3%  perf-profile.children.cycles._copy_from_user
>      3.99 ±  5%      +1.6        5.56 ±  3%  perf-profile.children.cycles.__might_fault
>     65.36 ±  2%      +2.0       67.34 ±  2%  perf-profile.children.cycles.do_sys_poll
>     68.87 ±  2%      +3.1       72.01 ±  2%  perf-profile.children.cycles.sys_poll
>      7.42 ± 34%      +3.7       11.14 ± 22%  perf-profile.children.cycles.poll_idle
>     71.61 ±  2%      +3.9       75.50 ±  2%  perf-profile.children.cycles.entry_SYSCALL_64_fastpath
>      9.88 ± 23%      +4.0       13.87 ± 19%  perf-profile.children.cycles.cpuidle_enter_state
>     10.06 ± 23%      +4.0       14.05 ± 18%  perf-profile.children.cycles.secondary_startup_64
>     10.06 ± 23%      +4.0       14.05 ± 18%  perf-profile.children.cycles.cpu_startup_entry
>      9.66 ± 24%      +4.0       13.66 ± 19%  perf-profile.children.cycles.start_secondary
>     10.06 ± 23%      +4.0       14.05 ± 18%  perf-profile.children.cycles.do_idle
>      2.25 ±  3%      +5.4        7.67 ±  3%  perf-profile.children.cycles.entry_SYSCALL_64_after_hwframe
>     13.72 ±  2%     -13.7        0.00        perf-profile.self.cycles.entry_SYSCALL_64
>     24.21 ±  2%      -0.3       23.93 ±  2%  perf-profile.self.cycles.copy_user_generic_string
>      9.47 ±  3%      -0.1        9.41 ±  4%  perf-profile.self.cycles.__fget
>      5.69 ±  5%      +0.0        5.71 ±  3%  perf-profile.self.cycles.fput
>     13.55 ±  4%      +0.7       14.24        perf-profile.self.cycles.do_sys_poll
>      7.41 ± 34%      +3.7       11.07 ± 22%  perf-profile.self.cycles.poll_idle
>      2.25 ±  3%      +5.4        7.67 ±  3%  perf-profile.self.cycles.entry_SYSCALL_64_after_hwframe
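
The headline figures in the poll1 table can be cross-checked from its own columns. What follows is a minimal sketch, not part of the robot's tooling: the helper names pct_change and miss_rate are made up for illustration, and the inputs are the rounded values printed above, so recomputed percentages for the perf-stat rows may differ slightly from the %change column.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new - base) / base * 100.0


def miss_rate(misses: float, branches: float) -> float:
    """Branch misses as a percentage of branch instructions."""
    return misses / branches * 100.0


# Values copied from the poll1 table (base = 955cef1517, patched = 63e02a2a32).
base = {"ops": 7435674, "branches": 2.906e12, "misses": 2.405e8, "cpi": 1.15}
patched = {"ops": 6465918, "branches": 2.676e12, "misses": 5.356e10, "cpi": 1.28}

# will-it-scale.per_process_ops: the reported -13.0% regression.
print(f"per_process_ops: {pct_change(base['ops'], patched['ops']):+.1f}%")  # -13.0%

# perf-stat.branch-miss-rate%: about 0.01 before, about 2.00 after.
print(f"miss rate base:    {miss_rate(base['misses'], base['branches']):.2f}%")
print(f"miss rate patched: {miss_rate(patched['misses'], patched['branches']):.2f}%")

# perf-stat.ipc is the reciprocal of perf-stat.cpi: 0.87 before, 0.78 after.
print(f"ipc base:    {1.0 / base['cpi']:.2f}")
print(f"ipc patched: {1.0 / patched['cpi']:.2f}")
```

Read this way, the instruction count drops by roughly the same fraction as throughput, while the branch miss rate rises from about 0.01% to about 2%, which is the change that stands out most in the perf-stat rows.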
> 
> 
> 
>                             will-it-scale.per_process_ops                      
> 
>  7.8e+06 +-+---------------------------------------------------------------+   
>          |. .+.++                              .++.                        |   
>  7.6e+06 +-+     :                         .+.+    +.+.+.+        +.+      |   
>          |       :                     .+.+               +      +   +     |   
>  7.4e+06 +-+      +.+.+.+.++.+.+.+.+.++                    ++.+.+     ++.+.|   
>          |                                                                 |   
>  7.2e+06 +-+                                                               |   
>          |                                                                 |   
>    7e+06 +-+                                                               |   
>          |                                                                 |   
>  6.8e+06 +-+                                                               |   
>          |                                                                 |   
>  6.6e+06 O-+ O OO                    OO O O                                |   
>          | O      O   O O OO O O O O            OO O O O O O               |   
>  6.4e+06 +-+--------O-----------------------O-O-------------O--------------+   
> 
> 
>                                perf-stat.instructions                          
> 
>   1.5e+13 +-+--------------------------------------------------------------+   
>           |                                                                |   
>  1.45e+13 +-+  +.+                                               .+.       |   
>           | +.+   +              +.+.+.+.    .+.+.+. +.   .+.++.+   +.     |   
>           |        +.            :       +.++       +  +.+            ++.+.|   
>   1.4e+13 +-+        +.++.+.+.+.+                                          |   
>           |                                                                |   
>  1.35e+13 +-+                                                              |   
>           |                                                                |   
>   1.3e+13 +-+                                                              |   
>           O   OO O   O OO   O          O   O         O   O                 |   
>           | O      O      O   O OO O O   O      O O O  O   O O             |   
>  1.25e+13 +-+                               O O                            |   
>           |                                                                |   
>   1.2e+13 +-+--------------------------------------------------------------+   
> 
> 
>                             perf-stat.branch-instructions                      
> 
>  3.05e+12 +-+--------------------------------------------------------------+   
>     3e+12 +-+                                                     +        |   
>           |.+.++.+   +                     ++    .+.+ .+.        + +     + |   
>  2.95e+12 +-+     + + + +.+. .+.   +   +. +  + .+    +   +   +  +   +   : +|   
>   2.9e+12 +-+      +   +    +   + + + +  +    +           + + :+     +  :  |   
>           |                      +   +                     +  +       ++   |   
>  2.85e+12 +-+                                                              |   
>   2.8e+12 +-+                                                              |   
>  2.75e+12 +-+                                                              |   
>           |    O                                                           |   
>   2.7e+12 +-+    O   O                                 O   O O             |   
>  2.65e+12 O-+ O    O   O    O   O  O   O   O  O O O O                      |   
>           | O           O O   O  O   O   O           O   O                 |   
>   2.6e+12 +-+                               O                              |   
>  2.55e+12 +-+--------------------------------------------------------------+   
> 
> 
>                               perf-stat.branch-misses                          
> 
>  6e+10 +-+-----------------------------------------------------------------+   
>        |     O O  O                            O      O   O O              |   
>  5e+10 O-O O    O   O O O O O O OO O O O O O O   OO O   O                  |   
>        |                                                                   |   
>        |                                                                   |   
>  4e+10 +-+                                                                 |   
>        |                                                                   |   
>  3e+10 +-+                                                                 |   
>        |                                                                   |   
>  2e+10 +-+                                                                 |   
>        |                                                                   |   
>        |                                                                   |   
>  1e+10 +-+                                                                 |   
>        |                                                                   |   
>      0 +-+-----------------------------------------------------------------+   
> 
> 
>                                 perf-stat.dTLB-stores                          
> 
>  3.2e+12 +-+---------------------------------------------------------------+   
>          |                         +  +                     +   +          |   
>  3.1e+12 +-+                      + + :                     :+ +:          |   
>          |                       +   + :                   +  +  :         |   
>    3e+12 +-+                     :     :                   :     :         |   
>          |.                     :      :                  :       :      + |   
>  2.9e+12 +-+.+.++.              :       :     +.+   .+.   :       +. .+ : +|   
>          |        +.+. .+.++.+.:        +.   +   :.+   +.:          +  ::  |   
>  2.8e+12 +-+          +        +          +.+    +       +             +   |   
>          |                                                                 |   
>  2.7e+12 +-+                                                               |   
>          O     OO   O      O                         O O                   |   
>  2.6e+12 +-O O    O   O O O           O   O     OO O     O OO              |   
>          |                   O O O O O  O   O O                            |   
>  2.5e+12 +-+---------------------------------------------------------------+   
> 
> 
>                            perf-stat.branch-miss-rate%                         
> 
>  2.5 +-+-------------------------------------------------------------------+   
>      |                                                                     |   
>      |                                                                     |   
>    2 O-O O O O O O O O OO O O O O O O O O O O O O O O O O OO               |   
>      |                                                                     |   
>      |                                                                     |   
>  1.5 +-+                                                                   |   
>      |                                                                     |   
>    1 +-+                                                                   |   
>      |                                                                     |   
>      |                                                                     |   
>  0.5 +-+                                                                   |   
>      |                                                                     |   
>      |                                                                     |   
>    0 +-+-------------------------------------------------------------------+   
> 
> 
>                                   perf-stat.ipc                                
> 
>  0.92 +-+------------------------------------------------------------------+   
>       |                                                                    |   
>   0.9 +-+.+.                   +. .+.        .+.          +. .+.           |   
>  0.88 +-+   +.                +  +   +.   +.+   +. .+.   +  +   + .+.      |   
>       |       +.   +.       .+         +.+        +   +.+        +   +. .+.|   
>  0.86 +-+       +.+  +.+.+.+                                           +   |   
>       |                                                                    |   
>  0.84 +-+                                                                  |   
>       |                                                                    |   
>  0.82 +-+                                                                  |   
>   0.8 +-+                          O O O O                                 |   
>       |           O  O                              O O                    |   
>  0.78 +-O O O O O  O   O O   O   O            O O         O O              |   
>       O                    O   O          O O     O     O                  |   
>  0.76 +-+------------------------------------------------------------------+   
> 
> 
>                                   perf-stat.cpi                                
> 
>   1.3 +-+---------------------------------O-O------------------------------+   
>  1.28 O-+          O       O O O O                O     O O O              |   
>       | O O O O O O  O O O                    O O     O                    |   
>  1.26 +-+                                           O                      |   
>  1.24 +-+                          O O O O                                 |   
>       |                                                                    |   
>  1.22 +-+                                                                  |   
>   1.2 +-+                                                                  |   
>  1.18 +-+                                                                  |   
>       |                                                                    |   
>  1.16 +-+      .+.+ .+.+.+.+.           .+                            .+.  |   
>  1.14 +-+    .+    +         +        .+  +.     .+. .+.+        +. .+   +.|   
>       |.+. .+                 + .+. .+      +. .+   +    + .+. .+  +       |   
>  1.12 +-+ +                    +   +          +           +   +            |   
>   1.1 +-+------------------------------------------------------------------+   
> 
> 
>                           will-it-scale.time.user_time                         
> 
>  620 +-+-------------------------------------------------------------------+   
>  610 +-+       O O                                                         |   
>      O O O O O     O O OO           O   O O O       O      O               |   
>  600 +-+                  O O O O O   O       O O O   O O O                |   
>  590 +-+                                                                   |   
>      |                                                                     |   
>  580 +-+                                                                   |   
>  570 +-+                                                                   |   
>  560 +-+                                                             +.+.+.|   
>      |                                                               :     |   
>  550 +-+.+.+.+.                        .+        .+.+.              :      |   
>  540 +-+       +.+.                   +  +   .+.+     +.+        +. :      |   
>      |             +.+.++.+.+.       +    +.+            +      +  +       |   
>  530 +-+                      +.+.+.+                     ++.+.+           |   
>  520 +-+-------------------------------------------------------------------+   
> 
> 
> [*] bisect-good sample
> [O] bisect-bad  sample
> 
> ***************************************************************************************************
> lkp-sb03: 32 threads Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz with 64G memory
> =========================================================================================
> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase:
>  gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-sb03/writeseek1/will-it-scale
> 
> commit: 
>  955cef1517 ("x86/entry/64: Return to userspace from the trampoline stack")
>  63e02a2a32 ("x86/entry/64: Create a per-CPU SYSCALL entry trampoline")
> 
> 955cef1517a1be93 63e02a2a3292d8815eac7be438 
> ---------------- -------------------------- 
>         %stddev     %change         %stddev
>             \          |                \  
>   1902014            -7.0%    1768039        will-it-scale.per_process_ops
>   1557647            -6.3%    1459046        will-it-scale.per_thread_ops
>      0.52            +4.0%       0.54        will-it-scale.scalability
>      2293            -1.8%       2251        will-it-scale.time.system_time
>    216.11           +19.7%     258.70        will-it-scale.time.user_time
> 1.453e+08 ±  6%     +21.7%  1.769e+08 ±  9%  cpuidle.POLL.time
>      3.43            +0.8        4.26        mpstat.cpu.usr%
>    284863 ±  6%     +12.9%     321561 ±  3%  softirqs.RCU
>      7178 ±  6%     -11.3%       6368        slabinfo.kmalloc-96.active_objs
>      7218 ±  5%     -10.6%       6450        slabinfo.kmalloc-96.num_objs
>     72.27 ±  6%     +19.5%      86.39 ±  7%  sched_debug.cfs_rq:/.load_avg.avg
>    107.67 ±  3%     +31.1%     141.11 ± 19%  sched_debug.cfs_rq:/.load_avg.stddev
>     50035 ± 23%     +17.3%      58672 ± 24%  sched_debug.cpu.load.stddev
>      7.58 ± 21%     +65.4%      12.54 ± 11%  sched_debug.cpu.nr_uninterruptible.max
> 3.143e+12            -4.7%  2.995e+12        perf-stat.branch-instructions
>      0.01 ±  2%      +1.0        0.97        perf-stat.branch-miss-rate%
> 3.791e+08 ±  3%   +7525.5%  2.891e+10        perf-stat.branch-misses
>  2.54e+08            +1.0%  2.566e+08        perf-stat.cache-misses
>      1.03            +6.3%       1.10        perf-stat.cpi
> 6.671e+12            -4.7%  6.361e+12        perf-stat.dTLB-loads
> 4.722e+12            -5.0%  4.485e+12        perf-stat.dTLB-stores
>     35.63 ± 12%     -29.7        5.89 ± 20%  perf-stat.iTLB-load-miss-rate%
> 8.119e+08 ±  8%    +829.8%  7.549e+09 ±  2%  perf-stat.iTLB-loads
> 1.563e+13            -5.3%   1.48e+13        perf-stat.instructions
>      0.97            -5.9%       0.91        perf-stat.ipc
>      5.97            -6.0        0.00        perf-profile.calltrace.cycles.entry_SYSCALL_64
>      7.43 ±  2%      -0.1        7.29 ±  3%  perf-profile.calltrace.cycles.find_lock_entry.shmem_getpage_gfp.shmem_write_begin.generic_perform_write.__generic_file_write_iter
>      9.10 ±  2%      -0.1        9.00 ±  3%  perf-profile.calltrace.cycles.shmem_getpage_gfp.shmem_write_begin.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
>      9.43 ±  2%      -0.1        9.33 ±  3%  perf-profile.calltrace.cycles.shmem_write_begin.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.__vfs_write
>     19.45            -0.1       19.39 ±  2%  perf-profile.calltrace.cycles.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
>     19.14            -0.0       19.10        perf-profile.calltrace.cycles.copy_user_generic_string.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter
>     21.14            +0.0       21.15 ±  2%  perf-profile.calltrace.cycles.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.__vfs_write
>      9.16 ± 10%      +0.0        9.20 ± 41%  perf-profile.calltrace.cycles.poll_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
>     41.59            +0.1       41.71 ±  2%  perf-profile.calltrace.cycles.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.__vfs_write.vfs_write
>     11.09 ±  8%      +0.2       11.24 ± 31%  perf-profile.calltrace.cycles.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
>     11.21 ±  8%      +0.2       11.37 ± 31%  perf-profile.calltrace.cycles.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
>     11.21 ±  8%      +0.2       11.37 ± 31%  perf-profile.calltrace.cycles.cpu_startup_entry.start_secondary.secondary_startup_64
>     11.21 ±  8%      +0.2       11.37 ± 31%  perf-profile.calltrace.cycles.start_secondary.secondary_startup_64
>     11.68 ±  7%      +0.2       11.90 ± 27%  perf-profile.calltrace.cycles.secondary_startup_64
>     45.10            +0.3       45.37 ±  2%  perf-profile.calltrace.cycles.__generic_file_write_iter.generic_file_write_iter.__vfs_write.vfs_write.sys_write
>     51.69            +0.3       52.02 ±  2%  perf-profile.calltrace.cycles.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
>     50.28            +0.4       50.63 ±  2%  perf-profile.calltrace.cycles.generic_file_write_iter.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
>     61.80            +0.8       62.60 ±  3%  perf-profile.calltrace.cycles.vfs_write.sys_write.entry_SYSCALL_64_fastpath
>      4.92            +0.9        5.80 ±  5%  perf-profile.calltrace.cycles.__fdget_pos.sys_lseek.entry_SYSCALL_64_fastpath
>      4.96            +0.9        5.86 ±  3%  perf-profile.calltrace.cycles.__fdget_pos.sys_write.entry_SYSCALL_64_fastpath
>      8.74            +1.0        9.75 ±  6%  perf-profile.calltrace.cycles.sys_lseek.entry_SYSCALL_64_fastpath
>     69.88            +1.6       71.49 ±  3%  perf-profile.calltrace.cycles.sys_write.entry_SYSCALL_64_fastpath
>     80.00            +2.9       82.90 ±  3%  perf-profile.calltrace.cycles.entry_SYSCALL_64_fastpath
>      5.97            -6.0        0.00        perf-profile.children.cycles.entry_SYSCALL_64
>      7.43 ±  2%      -0.1        7.29 ±  3%  perf-profile.children.cycles.find_lock_entry
>      9.10 ±  2%      -0.1        9.00 ±  3%  perf-profile.children.cycles.shmem_getpage_gfp
>      9.43 ±  2%      -0.1        9.33 ±  3%  perf-profile.children.cycles.shmem_write_begin
>     19.45            -0.1       19.39 ±  2%  perf-profile.children.cycles.copyin
>     19.14            -0.0       19.11        perf-profile.children.cycles.copy_user_generic_string
>     21.14            +0.0       21.15 ±  2%  perf-profile.children.cycles.iov_iter_copy_from_user_atomic
>      9.46 ±  9%      +0.1        9.56 ± 36%  perf-profile.children.cycles.poll_idle
>     41.60            +0.1       41.72 ±  2%  perf-profile.children.cycles.generic_perform_write
>     11.21 ±  8%      +0.2       11.37 ± 31%  perf-profile.children.cycles.start_secondary
>     11.56 ±  7%      +0.2       11.76 ± 27%  perf-profile.children.cycles.cpuidle_enter_state
>     11.69 ±  7%      +0.2       11.90 ± 27%  perf-profile.children.cycles.do_idle
>     11.68 ±  7%      +0.2       11.90 ± 27%  perf-profile.children.cycles.secondary_startup_64
>     11.68 ±  7%      +0.2       11.90 ± 27%  perf-profile.children.cycles.cpu_startup_entry
>     45.10            +0.3       45.37 ±  2%  perf-profile.children.cycles.__generic_file_write_iter
>     51.72            +0.3       52.03 ±  2%  perf-profile.children.cycles.__vfs_write
>     50.28            +0.4       50.63 ±  2%  perf-profile.children.cycles.generic_file_write_iter
>     61.84            +0.8       62.62 ±  3%  perf-profile.children.cycles.vfs_write
>      8.74            +1.0        9.75 ±  6%  perf-profile.children.cycles.sys_lseek
>      3.81            +1.6        5.38 ±  5%  perf-profile.children.cycles.__fget_light
>     69.93            +1.6       71.50 ±  3%  perf-profile.children.cycles.sys_write
>      9.88            +1.8       11.67 ±  3%  perf-profile.children.cycles.__fdget_pos
>     80.23            +2.7       82.94 ±  3%  perf-profile.children.cycles.entry_SYSCALL_64_fastpath
>      5.97            -6.0        0.00        perf-profile.self.cycles.entry_SYSCALL_64
>     18.93            -0.1       18.84 ±  2%  perf-profile.self.cycles.copy_user_generic_string
>      9.39 ±  8%      +0.0        9.42 ± 35%  perf-profile.self.cycles.poll_idle
> 
> 
> 
> ***************************************************************************************************
> lkp-ivb-d03: 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 4G memory
> =========================================================================================
> compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
>  gcc-7/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-ivb-d03/brk_test/aim9/300s
> 
> commit: 
>  955cef1517 ("x86/entry/64: Return to userspace from the trampoline stack")
>  63e02a2a32 ("x86/entry/64: Create a per-CPU SYSCALL entry trampoline")
> 
> 955cef1517a1be93 63e02a2a3292d8815eac7be438 
> ---------------- -------------------------- 
>         %stddev     %change         %stddev
>             \          |                \  
>   4124214            -9.9%    3717599        aim9.brk_test.ops_per_sec
>    272.29            -4.9%     259.03        aim9.time.system_time
>     27.71           +47.2%      40.78        aim9.time.user_time
>     12605 ±  9%     -27.0%       9203 ± 10%  cpuidle.POLL.usage
>      3.24 ±  2%      +1.4        4.62        mpstat.cpu.usr%
>      4007 ±  3%      -9.2%       3639 ±  4%  slabinfo.anon_vma_chain.num_objs
>      9.80            -1.9%       9.61        turbostat.CorWatt
>     30309            -1.3%      29929        vmstat.system.cs
>     18905            -1.1%      18689        vmstat.system.in
>    716.67 ± 11%     -22.7%     554.33 ±  6%  sched_debug.cfs_rq:/.load_avg.avg
>      1.00 ± 11%     -79.2%       0.21 ±173%  sched_debug.cfs_rq:/.nr_spread_over.min
>      0.45 ± 55%     +70.3%       0.76 ± 19%  sched_debug.cfs_rq:/.nr_spread_over.stddev
>    521.82 ±  3%     -10.2%     468.57 ±  2%  sched_debug.cfs_rq:/.util_avg.avg
>      1.96 ±  7%     +34.0%       2.62 ±  9%  sched_debug.cpu.nr_running.max
>      0.68 ± 15%     +42.9%       0.98 ± 15%  sched_debug.cpu.nr_running.stddev
>      0.06 ± 19%      +0.9        0.92        perf-stat.branch-miss-rate%
> 3.583e+08 ±  5%   +1125.0%  4.389e+09 ± 28%  perf-stat.branch-misses
>   9163065            -1.8%    8997254        perf-stat.context-switches
>      0.56 ±  2%     +12.8%       0.63 ±  4%  perf-stat.cpi
>      0.06 ±132%      +0.2        0.23 ±  6%  perf-stat.dTLB-load-miss-rate%
> 4.062e+08 ±142%    +234.1%  1.357e+09 ±  8%  perf-stat.dTLB-load-misses
>   9061724 ± 12%     +22.0%   11056158 ±  6%  perf-stat.dTLB-store-misses
>     11.72 ± 24%      -6.6        5.08 ± 33%  perf-stat.iTLB-load-miss-rate%
>   4.4e+08 ± 29%    +135.5%  1.036e+09 ± 23%  perf-stat.iTLB-loads
>      1.80 ±  2%     -11.2%       1.60 ±  3%  perf-stat.ipc
>     14.11 ± 88%      -2.6       11.50 ± 86%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
>     14.22 ± 88%      -2.6       11.63 ± 85%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
>     14.22 ± 88%      -2.6       11.63 ± 85%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
>     14.22 ± 88%      -2.6       11.63 ± 85%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
>     12.86 ± 92%      -2.4       10.45 ± 97%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
>     45.20 ±  3%      -1.4       43.82        perf-profile.calltrace.cycles-pp.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
>     16.60 ±  3%      -0.9       15.74 ±  3%  perf-profile.calltrace.cycles-pp.vma_merge.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
>     56.05 ±  2%      -0.8       55.25        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_fastpath
>     14.60 ±  3%      -0.7       13.88 ±  2%  perf-profile.calltrace.cycles-pp.__vma_adjust.vma_merge.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
>     54.84 ±  3%      -0.7       54.15        perf-profile.calltrace.cycles-pp.sys_brk.entry_SYSCALL_64_fastpath
>     11.52 ±  9%      -0.1       11.46        perf-profile.calltrace.cycles-pp.perf_event_mmap.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
>      6.30 ±  5%      +0.2        6.48 ±  3%  perf-profile.calltrace.cycles-pp.security_vm_enough_memory_mm.do_brk_flags.sys_brk.entry_SYSCALL_64_fastpath
>     27.40 ±  3%      +0.8       28.18 ±  4%  perf-profile.calltrace.cycles-pp.secondary_startup_64
>     12.40 ± 94%      +3.3       15.73 ± 62%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_kernel
>     13.18 ± 88%      +3.4       16.55 ± 57%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_kernel.secondary_startup_64
>     13.18 ± 88%      +3.4       16.55 ± 57%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_kernel.secondary_startup_64
>     13.18 ± 88%      +3.4       16.55 ± 57%  perf-profile.calltrace.cycles-pp.start_kernel.secondary_startup_64
>     13.14 ± 88%      +3.4       16.53 ± 57%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_kernel.secondary_startup_64
>     14.22 ± 88%      -2.6       11.63 ± 85%  perf-profile.children.cycles-pp.start_secondary
>     45.83 ±  3%      -1.2       44.59        perf-profile.children.cycles-pp.do_brk_flags
>     56.30 ±  2%      -0.9       55.36        perf-profile.children.cycles-pp.entry_SYSCALL_64_fastpath
>     17.05 ±  3%      -0.8       16.24 ±  3%  perf-profile.children.cycles-pp.vma_merge
>     15.45 ±  3%      -0.7       14.79 ±  2%  perf-profile.children.cycles-pp.__vma_adjust
>     55.47 ±  3%      -0.6       54.88        perf-profile.children.cycles-pp.sys_brk
>     12.21 ±  8%      -0.1       12.08        perf-profile.children.cycles-pp.perf_event_mmap
>      6.40 ±  5%      +0.2        6.57 ±  3%  perf-profile.children.cycles-pp.security_vm_enough_memory_mm
>     27.41 ±  3%      +0.8       28.19 ±  4%  perf-profile.children.cycles-pp.do_idle
>     27.30 ±  3%      +0.8       28.07 ±  4%  perf-profile.children.cycles-pp.cpuidle_enter_state
>     27.40 ±  3%      +0.8       28.18 ±  4%  perf-profile.children.cycles-pp.secondary_startup_64
>     27.40 ±  3%      +0.8       28.18 ±  4%  perf-profile.children.cycles-pp.cpu_startup_entry
>     25.27            +0.9       26.19        perf-profile.children.cycles-pp.intel_idle
>     13.18 ± 88%      +3.4       16.55 ± 57%  perf-profile.children.cycles-pp.start_kernel
>      4.82 ±  9%      +0.0        4.83 ±  5%  perf-profile.self.cycles-pp.__vma_adjust
>      5.25 ±  9%      +0.0        5.29 ±  2%  perf-profile.self.cycles-pp.perf_event_mmap
>      5.33 ±  3%      +0.4        5.75 ±  3%  perf-profile.self.cycles-pp.do_brk_flags
>     25.26            +0.9       26.19        perf-profile.self.cycles-pp.intel_idle
> 
> 
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> 
> Thanks,
> Xiaolong
> <config-4.14.0-01234-g63e02a2>
> <job.yaml>
> <reproduce>
