Message-ID: <CAKfTPtBz+Li1i5fdhrnju-oP4BkFg_FNFjYD-YoAA26E6JBRdw@mail.gmail.com>
Date:   Tue, 9 Jan 2018 08:58:11 +0100
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     kernel test robot <xiaolong.ye@...el.com>
Cc:     Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ben Segall <bsegall@...gle.com>, Chris Mason <clm@...com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Josef Bacik <josef@...icpanda.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Mike Galbraith <efault@....de>,
        Morten Rasmussen <morten.rasmussen@....com>,
        Paul Turner <pjt@...gle.com>, Tejun Heo <tj@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Yuyang Du <yuyang.du@...el.com>,
        LKML <linux-kernel@...r.kernel.org>, LKP <lkp@...org>
Subject: Re: [lkp-robot] [sched/fair] a4c3c04974: unixbench.score -4.3% regression

Hi,

On 8 January 2018 at 10:34, Vincent Guittot <vincent.guittot@...aro.org> wrote:
> Hi Xiaolong,
>
> On 25 December 2017 at 07:07, kernel test robot <xiaolong.ye@...el.com> wrote:
>>
>> Greeting,
>>
>> FYI, we noticed a -4.3% regression of unixbench.score due to commit:
>>
>>
>> commit: a4c3c04974d648ee6e1a09ef4131eb32a02ab494 ("sched/fair: Update and fix the runnable propagation rule")
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>
>> in testcase: unixbench
>> on test machine: 8 threads Ivy Bridge with 16G memory
>> with following parameters:
>>
>>         runtime: 300s
>>         nr_task: 100%
>>         test: shell1
>>         cpufreq_governor: performance
>>
>> test-description: UnixBench is the original BYTE UNIX benchmark suite, which aims to test the performance of Unix-like systems.
>> test-url: https://github.com/kdlucas/byte-unixbench
>>
>
> I don't have the machine described above, so I have tried to reproduce
> the problem on my 8-core Cortex-A53 platform, but I don't see the
> performance regression there.
> I have also tried with a VM on an Intel(R) Core(TM) i7-4810MQ and
> haven't seen the regression there either.
>
> Have you seen the regression on any other platform?

I have been able to run the test on a 12-core Intel(R) Xeon(R) CPU
E5-2630 and haven't seen any regression there either.
I changed the command to ./Run Shell1 -c 12 -i 30 instead of
./Run Shell1 -c 8 -i 30, as that machine has more cores.
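
For anyone reproducing this outside the LKP harness, a rough sketch of
the direct invocation with the stock byte-unixbench Run script (here -c
sets the number of parallel test copies and -i the number of
iterations; matching -c to the core count keeps the machine fully
loaded):

        git clone https://github.com/kdlucas/byte-unixbench.git
        cd byte-unixbench/UnixBench
        ./Run Shell1 -c 12 -i 30    # 12 parallel copies, 30 iterations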

Regards,
Vincent

>
> Regards,
> Vincent
>
>>
>> Details are as below:
>> -------------------------------------------------------------------------------------------------->
>>
>>
>> To reproduce:
>>
>>         git clone https://github.com/intel/lkp-tests.git
>>         cd lkp-tests
>>         bin/lkp install job.yaml  # job file is attached in this email
>>         bin/lkp run     job.yaml
>>
>> =========================================================================================
>> compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase:
>>   gcc-7/performance/x86_64-rhel-7.2/100%/debian-x86_64-2016-08-31.cgz/300s/lkp-ivb-d01/shell1/unixbench
>>
>> commit:
>>   c6b9d9a330 ("sched/wait: Fix add_wait_queue() behavioral change")
>>   a4c3c04974 ("sched/fair: Update and fix the runnable propagation rule")
>>
>> c6b9d9a330290144 a4c3c04974d648ee6e1a09ef41
>> ---------------- --------------------------
>>          %stddev     %change         %stddev
>>              \          |                \
>>      13264            -4.3%      12694        unixbench.score
>>   10619292           -11.7%    9374917        unixbench.time.involuntary_context_switches
>>  4.829e+08            -4.3%   4.62e+08        unixbench.time.minor_page_faults
>>       1126            -3.6%       1086        unixbench.time.system_time
>>       2645            -3.0%       2566        unixbench.time.user_time
>>   15855720            -6.2%   14878247        unixbench.time.voluntary_context_switches
>>       0.00 ± 56%      -0.0        0.00 ± 57%  mpstat.cpu.iowait%
>>      79517            -5.7%      74990        vmstat.system.cs
>>      16361            -3.3%      15822        vmstat.system.in
>>  1.814e+08           -24.0%  1.379e+08        cpuidle.C1.time
>>    3436399           -20.6%    2728227        cpuidle.C1.usage
>>    7772815            -9.9%    7001076        cpuidle.C1E.usage
>>  1.479e+08           +66.1%  2.456e+08        cpuidle.C3.time
>>    1437889           +38.7%    1994073        cpuidle.C3.usage
>>      18147           +13.9%      20676        cpuidle.POLL.usage
>>    3436173           -20.6%    2727580        turbostat.C1
>>       3.54            -0.8        2.73        turbostat.C1%
>>    7772758            -9.9%    7001012        turbostat.C1E
>>    1437858           +38.7%    1994034        turbostat.C3
>>       2.88            +2.0        4.86        turbostat.C3%
>>      18.50           +10.8%      20.50        turbostat.CPU%c1
>>       0.54 ±  2%    +179.6%       1.51        turbostat.CPU%c3
>>   2.32e+12            -4.3%   2.22e+12        perf-stat.branch-instructions
>>  6.126e+10            -4.9%  5.823e+10        perf-stat.branch-misses
>>       8.64 ±  4%      +0.6        9.25        perf-stat.cache-miss-rate%
>>  1.662e+11            -4.3%   1.59e+11        perf-stat.cache-references
>>   51040611            -7.0%   47473754        perf-stat.context-switches
>>  1.416e+13            -3.6%  1.365e+13        perf-stat.cpu-cycles
>>    8396968            -3.9%    8065835        perf-stat.cpu-migrations
>>  2.919e+12            -4.3%  2.793e+12        perf-stat.dTLB-loads
>>   1.89e+12            -4.3%  1.809e+12        perf-stat.dTLB-stores
>>      67.97            +1.1       69.03        perf-stat.iTLB-load-miss-rate%
>>  4.767e+09            -1.3%  4.704e+09        perf-stat.iTLB-load-misses
>>  2.247e+09            -6.0%  2.111e+09        perf-stat.iTLB-loads
>>   1.14e+13            -4.3%  1.091e+13        perf-stat.instructions
>>       2391            -3.0%       2319        perf-stat.instructions-per-iTLB-miss
>>  4.726e+08            -4.3%  4.523e+08        perf-stat.minor-faults
>>  4.726e+08            -4.3%  4.523e+08        perf-stat.page-faults
>>     585.14 ±  4%     -55.0%     263.59 ± 12%  sched_debug.cfs_rq:/.load_avg.avg
>>       1470 ±  4%     -42.2%     850.09 ± 24%  sched_debug.cfs_rq:/.load_avg.max
>>     154.17 ± 22%     -49.2%      78.39 ±  7%  sched_debug.cfs_rq:/.load_avg.min
>>     438.33 ±  6%     -41.9%     254.49 ± 27%  sched_debug.cfs_rq:/.load_avg.stddev
>>       2540 ± 15%     +23.5%       3137 ± 11%  sched_debug.cfs_rq:/.removed.runnable_sum.avg
>>     181.83 ± 11%     -56.3%      79.50 ± 34%  sched_debug.cfs_rq:/.runnable_load_avg.avg
>>      16.46 ± 37%     -72.9%       4.45 ±110%  sched_debug.cfs_rq:/.runnable_load_avg.min
>>     294.77 ±  5%     +11.2%     327.87 ±  6%  sched_debug.cfs_rq:/.util_avg.stddev
>>     220260 ±  8%     +20.3%     264870 ±  4%  sched_debug.cpu.avg_idle.avg
>>     502903 ±  4%     +21.0%     608663        sched_debug.cpu.avg_idle.max
>>     148667 ±  6%     +29.5%     192468 ±  2%  sched_debug.cpu.avg_idle.stddev
>>     180.64 ± 10%     -53.4%      84.23 ± 34%  sched_debug.cpu.cpu_load[0].avg
>>      25.73 ± 15%     -85.6%       3.70 ±113%  sched_debug.cpu.cpu_load[0].min
>>     176.98 ±  6%     -52.5%      84.06 ± 35%  sched_debug.cpu.cpu_load[1].avg
>>      53.93 ± 13%     -72.6%      14.75 ± 15%  sched_debug.cpu.cpu_load[1].min
>>     176.61 ±  4%     -55.3%      78.92 ± 31%  sched_debug.cpu.cpu_load[2].avg
>>      73.78 ± 11%     -73.4%      19.61 ±  7%  sched_debug.cpu.cpu_load[2].min
>>     177.42 ±  3%     -58.8%      73.09 ± 21%  sched_debug.cpu.cpu_load[3].avg
>>      93.01 ±  8%     -73.9%      24.25 ±  6%  sched_debug.cpu.cpu_load[3].min
>>     173.36 ±  3%     -60.6%      68.26 ± 13%  sched_debug.cpu.cpu_load[4].avg
>>     274.36 ±  5%     -48.6%     141.16 ± 44%  sched_debug.cpu.cpu_load[4].max
>>     107.87 ±  6%     -73.0%      29.11 ±  9%  sched_debug.cpu.cpu_load[4].min
>>      11203 ±  9%      +9.9%      12314 ±  6%  sched_debug.cpu.curr->pid.avg
>>    1042556 ±  3%      -6.9%     970165 ±  2%  sched_debug.cpu.sched_goidle.max
>>     748905 ±  5%     -13.4%     648459        sched_debug.cpu.sched_goidle.min
>>      90872 ± 11%     +17.4%     106717 ±  5%  sched_debug.cpu.sched_goidle.stddev
>>     457847 ±  4%     -15.0%     389113        sched_debug.cpu.ttwu_local.min
>>      18.60            -1.1       17.45        perf-profile.calltrace.cycles-pp.secondary_startup_64
>>      16.33 ±  2%      -1.0       15.29        perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
>>      16.33 ±  2%      -1.0       15.29        perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
>>      16.32 ±  2%      -1.0       15.29        perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
>>      15.44 ±  2%      -1.0       14.43        perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
>>      15.69 ±  2%      -1.0       14.71        perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
>>       5.54            -0.1        5.45        perf-profile.calltrace.cycles-pp.__libc_fork
>>      10.28            +0.0       10.32        perf-profile.calltrace.cycles-pp.page_fault
>>      10.16            +0.0       10.21        perf-profile.calltrace.cycles-pp.do_page_fault.page_fault
>>      10.15            +0.1       10.20        perf-profile.calltrace.cycles-pp.__do_page_fault.do_page_fault.page_fault
>>       9.47            +0.1        9.56        perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
>>      11.49            +0.1       11.59        perf-profile.calltrace.cycles-pp.sys_execve.do_syscall_64.return_from_SYSCALL_64.execve
>>       8.28            +0.1        8.38        perf-profile.calltrace.cycles-pp.load_elf_binary.search_binary_handler.do_execveat_common.sys_execve.do_syscall_64
>>      11.49            +0.1       11.59        perf-profile.calltrace.cycles-pp.return_from_SYSCALL_64.execve
>>      11.49            +0.1       11.59        perf-profile.calltrace.cycles-pp.do_syscall_64.return_from_SYSCALL_64.execve
>>       8.30            +0.1        8.41        perf-profile.calltrace.cycles-pp.search_binary_handler.do_execveat_common.sys_execve.do_syscall_64.return_from_SYSCALL_64
>>      11.46            +0.1       11.58        perf-profile.calltrace.cycles-pp.do_execveat_common.sys_execve.do_syscall_64.return_from_SYSCALL_64.execve
>>       8.46            +0.1        8.57        perf-profile.calltrace.cycles-pp.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
>>       5.21            +0.1        5.34 ±  2%  perf-profile.calltrace.cycles-pp.exit_mmap.mmput.do_exit.do_group_exit.__wake_up_parent
>>       5.24            +0.1        5.38 ±  2%  perf-profile.calltrace.cycles-pp.mmput.do_exit.do_group_exit.__wake_up_parent.entry_SYSCALL_64_fastpath
>>      13.20            +0.1       13.34        perf-profile.calltrace.cycles-pp.execve
>>       6.79            +0.2        6.94 ±  2%  perf-profile.calltrace.cycles-pp.__wake_up_parent.entry_SYSCALL_64_fastpath
>>       6.79            +0.2        6.95 ±  2%  perf-profile.calltrace.cycles-pp.do_group_exit.__wake_up_parent.entry_SYSCALL_64_fastpath
>>       6.78            +0.2        6.94        perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.__wake_up_parent.entry_SYSCALL_64_fastpath
>>       5.98            +0.2        6.18        perf-profile.calltrace.cycles-pp.vfprintf.__vsnprintf_chk
>>       8.38            +0.2        8.61        perf-profile.calltrace.cycles-pp.__vsnprintf_chk
>>      14.17            +0.3       14.49        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_fastpath
>>      18.60            -1.1       17.45        perf-profile.children.cycles-pp.do_idle
>>      18.60            -1.1       17.45        perf-profile.children.cycles-pp.cpu_startup_entry
>>      18.60            -1.1       17.45        perf-profile.children.cycles-pp.secondary_startup_64
>>      17.60            -1.1       16.46        perf-profile.children.cycles-pp.intel_idle
>>      17.89            -1.1       16.80        perf-profile.children.cycles-pp.cpuidle_enter_state
>>      16.33 ±  2%      -1.0       15.29        perf-profile.children.cycles-pp.start_secondary
>>       5.54            -0.1        5.45        perf-profile.children.cycles-pp.__libc_fork
>>      16.15            +0.0       16.18        perf-profile.children.cycles-pp.do_page_fault
>>      16.19            +0.0       16.22        perf-profile.children.cycles-pp.page_fault
>>       6.24            +0.1        6.29 ±  2%  perf-profile.children.cycles-pp.filemap_map_pages
>>      16.07            +0.1       16.13        perf-profile.children.cycles-pp.__do_page_fault
>>      16.85            +0.1       16.92        perf-profile.children.cycles-pp.do_syscall_64
>>      16.85            +0.1       16.92        perf-profile.children.cycles-pp.return_from_SYSCALL_64
>>       9.22            +0.1        9.33        perf-profile.children.cycles-pp.search_binary_handler
>>      13.49            +0.1       13.61        perf-profile.children.cycles-pp.__handle_mm_fault
>>       4.89            +0.1        5.02 ±  2%  perf-profile.children.cycles-pp.unmap_page_range
>>       9.11            +0.1        9.24        perf-profile.children.cycles-pp.load_elf_binary
>>      13.20            +0.1       13.34        perf-profile.children.cycles-pp.execve
>>      12.82            +0.1       12.96        perf-profile.children.cycles-pp.sys_execve
>>       4.95            +0.2        5.10 ±  2%  perf-profile.children.cycles-pp.unmap_vmas
>>      12.79            +0.2       12.95        perf-profile.children.cycles-pp.do_execveat_common
>>      13.90            +0.2       14.07        perf-profile.children.cycles-pp.handle_mm_fault
>>       6.95            +0.2        7.13 ±  2%  perf-profile.children.cycles-pp.do_exit
>>       6.95            +0.2        7.13 ±  2%  perf-profile.children.cycles-pp.do_group_exit
>>       6.95            +0.2        7.13 ±  2%  perf-profile.children.cycles-pp.__wake_up_parent
>>       6.40 ±  2%      +0.2        6.62        perf-profile.children.cycles-pp.vfprintf
>>       8.38            +0.2        8.61        perf-profile.children.cycles-pp.__vsnprintf_chk
>>       9.21            +0.2        9.46        perf-profile.children.cycles-pp.mmput
>>       9.16            +0.2        9.41        perf-profile.children.cycles-pp.exit_mmap
>>      19.85            +0.3       20.13        perf-profile.children.cycles-pp.entry_SYSCALL_64_fastpath
>>      17.60            -1.1       16.46        perf-profile.self.cycles-pp.intel_idle
>>       6.03 ±  2%      +0.2        6.26        perf-profile.self.cycles-pp.vfprintf
>>
>>
>>
>>                                    unixbench.score
>>
>>   14000 +-+-----------------------------------------------------------------+
>>         O.O..O.O.O.O..O.O.O.O..O.O.O.O..O.O.O..O.O.O.+    +.+.+..+.+.+.+..+.|
>>   12000 +-+                                          :    :                 |
>>         |                                            :    :                 |
>>   10000 +-+                                           :   :                 |
>>         |                                             :  :                  |
>>    8000 +-+                                           :  :                  |
>>         |                                             :  :                  |
>>    6000 +-+                                            : :                  |
>>         |                                              : :                  |
>>    4000 +-+                                            : :                  |
>>         |                                              ::                   |
>>    2000 +-+                                             :                   |
>>         |                                               :                   |
>>       0 +-+-----------------------------------------------------------------+
>>
>>
>>
>> Disclaimer:
>> Results have been estimated based on internal Intel analysis and are provided
>> for informational purposes only. Any difference in system hardware or software
>> design or configuration may affect actual performance.
>>
>>
>> Thanks,
>> Xiaolong
