Date:   Mon, 2 Jan 2017 15:56:37 +0100
From:   Vincent Guittot <vincent.guittot@...aro.org>
To:     kernel test robot <xiaolong.ye@...el.com>
Cc:     LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: Re: [lkp-developer] [sched/core]  6b94780e45:  unixbench.score -4.5%
 regression

Hi Xiaolong,

On Monday 19 Dec 2016 at 08:14:53 (+0800), kernel test robot wrote:
> 
> Greeting,
> 
> FYI, we noticed a -4.5% regression of unixbench.score due to commit:

I have been able to restore performance on my platform with the patch below.
Could you test it?

---
 kernel/sched/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 393759b..6e7d45c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2578,6 +2578,7 @@ void wake_up_new_task(struct task_struct *p)
 	__set_task_cpu(p, select_task_rq(p, task_cpu(p), SD_BALANCE_FORK, 0));
 #endif
 	rq = __task_rq_lock(p, &rf);
+	update_rq_clock(rq);
 	post_init_entity_util_avg(&p->se);
 
 	activate_task(rq, p, 0);
-- 
2.7.4

Vincent

> 
> 
> commit: 6b94780e45c17b83e3e75f8aaca5a328db583c74 ("sched/core: Use load_avg for selecting idlest group")
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
> 
> in testcase: unixbench
> on test machine: 24 threads Nehalem-EP with 24G memory
> with following parameters:
> 
> 	runtime: 300s
> 	nr_task: 100%
> 	test: shell1
> 	cpufreq_governor: performance
> 
> test-description: UnixBench is the original BYTE UNIX benchmark suite, which aims to test the performance of Unix-like systems.
> test-url: https://github.com/kdlucas/byte-unixbench
> 
> In addition to that, the commit also has significant impact on the following tests:
> 
> +------------------+-----------------------------------------------------------------------+
> | testcase: change | unixbench: unixbench.score -2.9% regression                           |
> | test machine     | 8 threads Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz with 6G memory       |
> | test parameters  | nr_task=1                                                             |
> |                  | runtime=300s                                                          |
> |                  | test=shell8                                                           |
> +------------------+-----------------------------------------------------------------------+
> 
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> 
> To reproduce:
> 
>         git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
>         cd lkp-tests
>         bin/lkp install job.yaml  # job file is attached in this email
>         bin/lkp run     job.yaml
> 
> testcase/path_params/tbox_group/run: unixbench/300s-100%-shell1-performance/lkp-wsm-ep1
> 
> f519a3f1c6b7a990  6b94780e45c17b83e3e75f8aac  
> ----------------  --------------------------  
>      25565              -5%      24414        unixbench.score
>   29557557                    28781098        unixbench.time.voluntary_context_switches
>       5743              -4%       5514        unixbench.time.user_time
>  9.232e+08              -4%  8.831e+08        unixbench.time.minor_page_faults
>       1807              -5%       1709        unixbench.time.percent_of_cpu_this_job_got
>       5656              -7%       5271        unixbench.time.system_time
>   13223805             -20%   10628072        unixbench.time.involuntary_context_switches
>     741766             -62%     279054        interrupts.CAL:Function_call_interrupts
>      31060              -9%      28214        vmstat.system.in
>     126250             -12%     110890        vmstat.system.cs
>      78.58              -6%      74.20        turbostat.%Busy
>       2507              -6%       2366        turbostat.Avg_MHz
>       9134 ± 47%     -6e+03       2973 ± 36%  latency_stats.max.pipe_read.__vfs_read.vfs_read.SyS_read.entry_SYSCALL_64_fastpath
>     380879 ± 10%      5e+05     887692 ± 49%  latency_stats.sum.wait_on_page_bit_killable.__lock_page_or_retry.filemap_fault.__do_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
>      31710 ± 15%     -2e+04      10583 ± 14%  latency_stats.sum.call_rwsem_down_write_failed.__vma_adjust.__split_vma.do_munmap.vm_munmap.elf_map.load_elf_binary.search_binary_handler.do_execveat_common.SyS_execve.do_syscall_64.return_from_SYSCALL_64
>      51796 ±  4%     -4e+04      15457 ± 10%  latency_stats.sum.call_rwsem_down_write_failed.unlink_file_vma.free_pgtables.unmap_region.do_munmap.vm_munmap.elf_map.load_elf_binary.search_binary_handler.do_execveat_common.SyS_execve.do_syscall_64
>     111998 ± 18%     -7e+04      37074 ± 14%  latency_stats.sum.call_rwsem_down_write_failed.__vma_adjust.__split_vma.do_munmap.mmap_region.do_mmap.vm_mmap_pgoff.SyS_mmap_pgoff.SyS_mmap.entry_SYSCALL_64_fastpath
>     275087 ± 15%     -2e+05      81973 ±  3%  latency_stats.sum.call_rwsem_down_write_failed.unlink_file_vma.free_pgtables.unmap_region.do_munmap.mmap_region.do_mmap.vm_mmap_pgoff.SyS_mmap_pgoff.SyS_mmap.entry_SYSCALL_64_fastpath
>     930993 ± 12%     -6e+05     320520 ±  4%  latency_stats.sum.call_rwsem_down_write_failed.vma_link.mmap_region.do_mmap.vm_mmap_pgoff.vm_mmap.elf_map.load_elf_binary.search_binary_handler.do_execveat_common.SyS_execve.do_syscall_64
>    4755783 ±  9%     -3e+06    1619348 ±  4%  latency_stats.sum.call_rwsem_down_write_failed.__vma_adjust.__split_vma.split_vma.mprotect_fixup.do_mprotect_pkey.SyS_mprotect.entry_SYSCALL_64_fastpath
>    5536067 ± 10%     -4e+06    1929338 ±  3%  latency_stats.sum.call_rwsem_down_write_failed.copy_process._do_fork.SyS_clone.do_syscall_64.return_from_SYSCALL_64
>  9.032e+08              -4%   8.64e+08        perf-stat.page-faults
>  9.032e+08              -4%   8.64e+08        perf-stat.minor-faults
>  2.329e+09                   2.269e+09        perf-stat.node-load-misses
>    2.2e+09              -9%  2.011e+09 ±  5%  perf-stat.dTLB-store-misses
>  3.278e+10              -9%  2.987e+10 ±  6%  perf-stat.dTLB-load-misses
>   19484819              13%   21974129        perf-stat.cpu-migrations
>  3.755e+13              -6%   3.54e+13        perf-stat.cpu-cycles
>       3244               4%       3379        perf-stat.instructions-per-iTLB-miss
>  4.536e+12              -4%  4.332e+12        perf-stat.branch-instructions
>  2.303e+13              -4%  2.208e+13        perf-stat.instructions
>  5.768e+12              -4%  5.517e+12        perf-stat.dTLB-loads
>  3.567e+11              -4%  3.414e+11        perf-stat.cache-references
>       2.97                        2.93        perf-stat.branch-miss-rate%
>  2.768e+10                   2.699e+10        perf-stat.node-stores
>  5.446e+10              -3%  5.275e+10        perf-stat.cache-misses
>       0.03              -4%       0.03        perf-stat.iTLB-load-miss-rate%
>  9.673e+09              -4%  9.294e+09        perf-stat.node-loads
>  3.596e+12              -4%  3.442e+12        perf-stat.dTLB-stores
>       0.61                        0.62        perf-stat.ipc
>  1.347e+11              -6%   1.27e+11        perf-stat.branch-misses
>  7.098e+09              -8%  6.533e+09        perf-stat.iTLB-load-misses
>  2.309e+13              -4%  2.206e+13        perf-stat.iTLB-loads
>   79911173             -12%   70187035        perf-stat.context-switches
> 
> 
> 
>                                  turbostat._Busy
> 
>   90 ++-------------------------------------*---*---------------------------+
>      |                                    ..       *...*..                  |
>   80 *+..*..*...*..*...*..*...*..*...O...*  O   O  O   O  O...O..O...O  O   O
>   70 O+  O  O   O  O   O  O   O  O                                          |
>      |                                                                      |
>   60 ++                                                                     |
>   50 ++                                                                     |
>      |                                                                      |
>   40 ++                                                                     |
>   30 ++                                                                     |
>      |                                                                      |
>   20 ++                                                                     |
>   10 ++                                                                     |
>      |                                                                      |
>    0 ++----------------------------------O----------------------------------+
> 
> 
> 
> 
> 
>                     unixbench.time.percent_of_cpu_this_job_got
> 
>   2500 ++-------------------------------------------------------------------+
>        |                                                                    |
>        |                                       .*...                        |
>   2000 ++                                   .*.     *..*...                 |
>        *..*...*..*...*..*...*..*...*..O...*. O  O   O  O   O..O...O..O   O  O
>        O  O   O  O   O  O   O  O   O                                        |
>   1500 ++                                                                   |
>        |                                                                    |
>   1000 ++                                                                   |
>        |                                                                    |
>        |                                                                    |
>    500 ++                                                                   |
>        |                                                                    |
>        |                                                                    |
>      0 ++---------------------------------O---------------------------------+
> 
> 
>                                   vmstat.system.in
> 
>   40000 ++------------------------------------------------------------------+
>         |                                          .*...*..                 |
>   35000 ++                                  .*...*.                         |
>   30000 *+.*...*..*...*..*..*...*..*...*..*.               *..*...*..*      |
>         O  O   O  O   O  O  O   O  O   O     O   O  O   O  O  O   O  O   O  O
>   25000 ++                                                                  |
>         |                                                                   |
>   20000 ++                                                                  |
>         |                                                                   |
>   15000 ++                                                                  |
>   10000 ++                                                                  |
>         |                                                                   |
>    5000 ++                                                                  |
>         |                                                                   |
>       0 ++--------------------------------O---------------------------------+
> 
> 	[*] bisect-good sample
> 	[O] bisect-bad  sample
> 
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> 
> Thanks,
> Xiaolong
