Message-ID: <de61a3b5b6b1467ea7d377377c443572@hisilicon.com>
Date:   Tue, 27 Apr 2021 09:20:31 +0000
From:   "Song Bao Hua (Barry Song)" <song.bao.hua@...ilicon.com>
To:     kernel test robot <oliver.sang@...el.com>
CC:     Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>,
        "lkp@...ts.01.org" <lkp@...ts.01.org>,
        "lkp@...el.com" <lkp@...el.com>,
        "ying.huang@...el.com" <ying.huang@...el.com>,
        "feng.tang@...el.com" <feng.tang@...el.com>,
        "zhengjun.xing@...el.com" <zhengjun.xing@...el.com>
Subject: RE: [genirq]  cbe16f35be:  will-it-scale.per_thread_ops -5.2%
 regression



> -----Original Message-----
> From: kernel test robot [mailto:oliver.sang@...el.com]
> Sent: Tuesday, April 27, 2021 9:00 PM
> To: Song Bao Hua (Barry Song) <song.bao.hua@...ilicon.com>
> Cc: Ingo Molnar <mingo@...nel.org>; Thomas Gleixner <tglx@...utronix.de>; LKML
> <linux-kernel@...r.kernel.org>; lkp@...ts.01.org; lkp@...el.com;
> ying.huang@...el.com; feng.tang@...el.com; zhengjun.xing@...el.com
> Subject: [genirq] cbe16f35be: will-it-scale.per_thread_ops -5.2% regression
> 
> 
> Greetings,
> 
> FYI, we noticed a -5.2% regression of will-it-scale.per_thread_ops due to
> commit:
> 
> 
> commit: cbe16f35bee6880becca6f20d2ebf6b457148552 ("genirq: Add IRQF_NO_AUTOEN for request_irq/nmi()")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 

This might be relevant, but I can't figure out the relation between
getppid() and request_irq().

Thanks
Barry

> 
> in testcase: will-it-scale
> on test machine: 88 threads Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory
> with the following parameters:
> 
> 	nr_task: 100%
> 	mode: thread
> 	test: getppid1
> 	cpufreq_governor: performance
> 	ucode: 0x5003006
> 
> test-description: Will It Scale takes a testcase and runs it from 1 through
> to n parallel copies to see if the testcase will scale. It builds both a process
> and threads based test in order to see any differences between the two.
> test-url: https://github.com/antonblanchard/will-it-scale
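
For readers unfamiliar with the benchmark, the idea in the test description can be sketched roughly as follows. This is only an illustration in Python, not will-it-scale's own code: the function name run_getppid and the duration parameter are made up here, and because of Python's global interpreter lock the numbers will not scale the way the real C test does.

```python
import os
import threading
import time


def run_getppid(nr_threads, duration=0.05):
    """Call os.getppid() in nr_threads threads; return per-thread iteration counts.

    Hypothetical helper for illustration only, not part of will-it-scale.
    """
    counts = [0] * nr_threads
    stop = threading.Event()

    def worker(idx):
        n = 0
        # Each thread issues the same cheap syscall in a tight loop.
        while not stop.is_set():
            os.getppid()
            n += 1
        counts[idx] = n

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(nr_threads)]
    for t in threads:
        t.start()
    time.sleep(duration)
    stop.set()
    for t in threads:
        t.join()
    return counts


if __name__ == "__main__":
    # Run with an increasing number of parallel copies and report
    # the average number of operations per thread for each run.
    for n in (1, 2, 4):
        counts = run_getppid(n)
        print(n, sum(counts) // n)
```

The per-thread average printed here plays the role of per_thread_ops in the report: total operations divided by the number of threads.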
> 
> 
> 
> If you fix the issue, kindly add the following tag
> Reported-by: kernel test robot <oliver.sang@...el.com>
> 
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> 
> To reproduce:
> 
>         git clone https://github.com/intel/lkp-tests.git
>         cd lkp-tests
>         bin/lkp install                job.yaml  # job file is attached in this email
>         bin/lkp split-job --compatible job.yaml
>         bin/lkp run                    compatible-job.yaml
> 
> =========================================================================================
> compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
> 
> gcc-9/performance/x86_64-rhel-8.3/thread/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp9/getppid1/will-it-scale/0x5003006
> 
> commit:
>   v5.12-rc2
>   cbe16f35be ("genirq: Add IRQF_NO_AUTOEN for request_irq/nmi()")
> 
>        v5.12-rc2 cbe16f35bee6880becca6f20d2e
> ---------------- ---------------------------
>          %stddev     %change         %stddev
>              \          |                \
>  7.408e+08            -5.2%  7.021e+08        will-it-scale.88.threads
>    8417726            -5.2%    7978644        will-it-scale.per_thread_ops
>  7.408e+08            -5.2%  7.021e+08        will-it-scale.workload
>  3.851e+10            -5.2%   3.65e+10        perf-stat.i.branch-instructions
>  1.839e+08            -4.2%  1.763e+08        perf-stat.i.branch-misses
>       1.39            +5.3%       1.46        perf-stat.i.cpi
>  5.988e+10            -5.2%  5.674e+10        perf-stat.i.dTLB-loads
>  4.139e+10            -5.2%  3.922e+10        perf-stat.i.dTLB-stores
>  2.239e+08 ±  3%     -14.6%  1.913e+08 ±  4%  perf-stat.i.iTLB-load-misses
>  1.741e+11            -5.2%   1.65e+11        perf-stat.i.instructions
>     794.20 ±  2%     +10.7%     879.14 ±  4%  perf-stat.i.instructions-per-iTLB-miss
>       0.72            -5.0%       0.68        perf-stat.i.ipc
>       1588            -5.2%       1505        perf-stat.i.metric.M/sec
>       1.39            +5.4%       1.47        perf-stat.overall.cpi
>       0.00            +0.0        0.00        perf-stat.overall.dTLB-store-miss-rate%
>     778.31 ±  3%     +11.1%     864.35 ±  4%  perf-stat.overall.instructions-per-iTLB-miss
>       0.72            -5.1%       0.68        perf-stat.overall.ipc
>  3.838e+10            -5.2%  3.638e+10        perf-stat.ps.branch-instructions
>  1.833e+08            -4.2%  1.757e+08        perf-stat.ps.branch-misses
>  5.968e+10            -5.2%  5.655e+10        perf-stat.ps.dTLB-loads
>  4.125e+10            -5.2%  3.909e+10        perf-stat.ps.dTLB-stores
>  2.231e+08 ±  3%     -14.6%  1.907e+08 ±  4%  perf-stat.ps.iTLB-load-misses
>  1.735e+11            -5.2%  1.645e+11        perf-stat.ps.instructions
>  5.243e+13            -5.2%  4.971e+13        perf-stat.total.instructions
>      43.20            -2.3       40.87        perf-profile.calltrace.cycles-pp.__entry_text_start.getppid
>       9.09            -0.4        8.65 ±  2%  perf-profile.calltrace.cycles-pp.testcase
>       8.01            -0.3        7.66        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.getppid
>       2.93            -0.2        2.73        perf-profile.calltrace.cycles-pp.syscall_enter_from_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
>       3.02            -0.2        2.83        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.getppid
>       2.52 ±  3%      +0.7        3.23 ±  9%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.getppid
>      18.24            +1.1       19.29        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
>      13.33            +1.1       14.46 ±  2%  perf-profile.calltrace.cycles-pp.__x64_sys_getppid.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
>       1.94 ±  2%      +1.7        3.62 ±  3%  perf-profile.calltrace.cycles-pp.rcu_nocb_flush_deferred_wakeup.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.getppid
>       7.47 ±  2%      +1.8        9.25        perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.getppid
>      11.51 ±  2%      +2.3       13.81 ±  2%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.getppid
>      39.05            +3.1       42.17        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.getppid
>      27.84            -1.5       26.36        perf-profile.children.cycles-pp.__entry_text_start
>      23.40            -1.2       22.22        perf-profile.children.cycles-pp.syscall_return_via_sysret
>       2.97            -0.2        2.76        perf-profile.children.cycles-pp.syscall_enter_from_user_mode
>       3.23            -0.2        3.03        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
>       0.41 ±  2%      -0.0        0.37 ±  2%  perf-profile.children.cycles-pp.__x86_indirect_thunk_rax
>       2.56 ±  3%      +0.7        3.25 ±  7%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
>      13.88            +1.2       15.05 ±  2%  perf-profile.children.cycles-pp.__x64_sys_getppid
>       2.02 ±  2%      +1.6        3.67 ±  3%  perf-profile.children.cycles-pp.rcu_nocb_flush_deferred_wakeup
>       7.86            +1.7        9.60        perf-profile.children.cycles-pp.exit_to_user_mode_prepare
>      12.71            +2.3       15.04 ±  2%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
>      39.65            +3.1       42.77        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
>      23.19            -1.2       22.01        perf-profile.self.cycles-pp.syscall_return_via_sysret
>      19.57            -1.1       18.50        perf-profile.self.cycles-pp.getppid
>      12.47            -0.6       11.83        perf-profile.self.cycles-pp.__entry_text_start
>       8.60            -0.2        8.36        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
>       3.22            -0.2        3.02        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
>       2.52            -0.2        2.35        perf-profile.self.cycles-pp.syscall_enter_from_user_mode
>       3.58            +0.4        3.95        perf-profile.self.cycles-pp.__x64_sys_getppid
>       2.12 ±  3%      +0.6        2.74 ±  8%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
>       1.72 ±  2%      +1.7        3.41 ±  3%  perf-profile.self.cycles-pp.rcu_nocb_flush_deferred_wakeup
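
As a sanity check on the table above, the headline numbers can be recomputed by hand. The short Python sketch below uses only values copied from the report; the helper name pct_change is made up for illustration. Note that the IPC change computed from the rounded 0.72 and 0.68 will differ slightly from the -5.1% the robot derived from unrounded data.

```python
# Values copied from the comparison table (v5.12-rc2 vs. cbe16f35be).
base_ops, patched_ops = 8417726, 7978644      # will-it-scale.per_thread_ops
base_ipc, patched_ipc = 0.72, 0.68            # perf-stat.overall.ipc


def pct_change(old, new):
    """Relative change from old to new, in percent (illustrative helper)."""
    return (new - old) / old * 100.0


ops_change = pct_change(base_ops, patched_ops)
ipc_change = pct_change(base_ipc, patched_ipc)

# CPI is the reciprocal of IPC, so a lower IPC shows up as a higher CPI.
base_cpi, patched_cpi = 1.0 / base_ipc, 1.0 / patched_ipc

print(f"per_thread_ops change: {ops_change:+.1f}%")
print(f"ipc change (from rounded values): {ipc_change:+.1f}%")
print(f"cpi: {base_cpi:.2f} -> {patched_cpi:.2f}")
```

The instruction count drops by the same 5.2% as throughput, which would be consistent with each getppid call executing about the same number of instructions and simply taking more cycles to do so.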
> 
> 
> 
>                              will-it-scale.per_thread_ops
> 
>   8.8e+06 +-----------------------------------------------------------------+
>   8.7e+06 |-+        .+.                                                    |
>           |        .+   +                                                   |
>   8.6e+06 |.+.+.+.+      +  .+. .+.+. .+.+. .+.+.+.+.                       |
>   8.5e+06 |-+             ++   +     +     +         +.                     |
>           |                                            +.+.++.+.+.+.+.+.+.+.|
>   8.4e+06 |-+                                                               |
>   8.3e+06 |-O O O O O O   OO                                                |
>   8.2e+06 |-+           O    O                                              |
>           |                                                                 |
>   8.1e+06 |-+                                                               |
>     8e+06 |-+                  O O O O O O O O O O   O      O       O O     |
>           |                                        O          O O O         |
>   7.9e+06 |-+                                          O O O                |
>   7.8e+06 +-----------------------------------------------------------------+
> 
> 
> [*] bisect-good sample
> [O] bisect-bad  sample
> 
> 
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> 
> ---
> 0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
> https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation
> 
> Thanks,
> Oliver Sang
