Message-ID: <20200205123110.GN12867@shao2-debian>
Date:   Wed, 5 Feb 2020 20:31:10 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Roman Sudarikov <roman.sudarikov@...ux.intel.com>
Cc:     0day robot <lkp@...el.com>, Kan Liang <kan.liang@...ux.intel.com>,
        Alexander Antonov <alexander.antonov@...el.com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: [perf x86] b77491648e: will-it-scale.per_process_ops -2.1% regression

Greetings,

FYI, we noticed a -2.1% regression of will-it-scale.per_process_ops due to commit:


commit: b77491648e6eb2f26b6edf5eaea859adc17f4dcc ("perf x86: Infrastructure for exposing an Uncore unit to PMON mapping")
https://github.com/0day-ci/linux/commits/roman-sudarikov-linux-intel-com/perf-x86-Exposing-IO-stack-to-IO-PMON-mapping-through-sysfs/20200118-075508

in testcase: will-it-scale
on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 128G memory
with the following parameters:

	nr_task: 100%
	mode: process
	test: signal1
	cpufreq_governor: performance
	ucode: 0xb000038

test-description: Will It Scale takes a testcase and runs it from 1 through n parallel copies to see whether the testcase scales. It builds both a process-based and a thread-based variant of each test so that differences between the two can be observed.
test-url: https://github.com/antonblanchard/will-it-scale
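
For orientation, the call stacks in the perf-profile data below (tgkill -> __send_signal, then get_signal/dequeue_signal and rt_sigreturn on the return to userspace) indicate that the signal1 testcase repeatedly delivers a signal to the running process and counts completed deliveries. The C sketch below is an assumption made for illustration only; the actual testcase lives in the will-it-scale repository linked above:

        /* Hypothetical signal1-style loop (illustrative sketch, not the real testcase). */
        #include <signal.h>
        #include <stdio.h>

        static volatile unsigned long handled;

        static void handler(int sig)
        {
                (void)sig;
                handled++;              /* one completed signal delivery */
        }

        int main(void)
        {
                struct sigaction sa = { .sa_handler = handler };
                unsigned long ops;

                sigemptyset(&sa.sa_mask);
                sigaction(SIGUSR1, &sa, NULL);

                /*
                 * Each iteration sends a signal to the calling process and
                 * returns from the handler, exercising the send/dequeue path
                 * seen in the profile: tgkill -> __send_signal, then
                 * get_signal -> dequeue_signal -> rt_sigreturn.
                 */
                for (ops = 0; ops < 1000000; ops++)
                        raise(SIGUSR1);

                printf("iterations=%lu handled=%lu\n", ops, handled);
                return 0;
        }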



If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>


Details are as follows:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-7/performance/x86_64-rhel-7.6/process/100%/debian-x86_64-20191114.cgz/lkp-bdw-ep6/signal1/will-it-scale/0xb000038

commit: 
  v5.4
  b77491648e ("perf x86: Infrastructure for exposing an Uncore unit to PMON mapping")

            v5.4 b77491648e6eb2f26b6edf5eaea 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     47986            -2.1%      46989        will-it-scale.per_process_ops
   4222852            -2.1%    4135110        will-it-scale.workload
    427194 ±  9%     +13.8%     486344 ±  4%  numa-vmstat.node1.numa_local
     12.88 ±  2%      -8.5%      11.79 ±  4%  turbostat.RAMWatt
      8846 ± 10%     +23.9%      10964 ±  9%  softirqs.CPU0.SCHED
     14442 ±  4%      -5.2%      13697 ±  5%  softirqs.CPU71.RCU
     78696 ±  9%     +14.4%      89993 ±  8%  sched_debug.cfs_rq:/.min_vruntime.stddev
     78411 ±  9%     +14.5%      89817 ±  8%  sched_debug.cfs_rq:/.spread0.stddev
      9.77 ±  4%     +15.0%      11.23 ±  3%  sched_debug.cpu.clock.stddev
      9.77 ±  4%     +15.0%      11.23 ±  3%  sched_debug.cpu.clock_task.stddev
 4.072e+09            -1.9%  3.996e+09        perf-stat.i.branch-instructions
  44948352            -1.8%   44159252        perf-stat.i.branch-misses
     35.25            +4.3       39.56        perf-stat.i.cache-miss-rate%
  12569960            +5.2%   13223444        perf-stat.i.cache-misses
  35888855 ±  2%      -6.2%   33680305 ±  2%  perf-stat.i.cache-references
     11.75            +1.8%      11.96        perf-stat.i.cpi
     19377            -5.0%      18403        perf-stat.i.cycles-between-cache-misses
  27157347            -2.1%   26595986        perf-stat.i.dTLB-load-misses
 6.739e+09            -2.0%  6.602e+09        perf-stat.i.dTLB-loads
  27809165            -1.9%   27268405        perf-stat.i.dTLB-store-misses
 5.461e+09            -1.9%  5.356e+09        perf-stat.i.dTLB-stores
 2.072e+10            -1.9%  2.034e+10        perf-stat.i.instructions
      0.09            -1.7%       0.08        perf-stat.i.ipc
    917994            +2.6%     941599        perf-stat.i.node-load-misses
     96.93            -1.1       95.81        perf-stat.i.node-store-miss-rate%
   5499191            +5.0%    5774707        perf-stat.i.node-store-misses
    169716 ±  8%     +45.2%     246479 ±  6%  perf-stat.i.node-stores
      1.73 ±  2%      -4.4%       1.66 ±  2%  perf-stat.overall.MPKI
     35.03            +4.2       39.27        perf-stat.overall.cache-miss-rate%
     11.77            +1.8%      11.98        perf-stat.overall.cpi
     19401            -5.0%      18428        perf-stat.overall.cycles-between-cache-misses
      0.08            -1.8%       0.08        perf-stat.overall.ipc
     97.01            -1.1       95.91        perf-stat.overall.node-store-miss-rate%
 4.058e+09            -1.8%  3.983e+09        perf-stat.ps.branch-instructions
  44798305            -1.7%   44014351        perf-stat.ps.branch-misses
  12526500            +5.2%   13178368        perf-stat.ps.cache-misses
  35771706 ±  2%      -6.2%   33569906 ±  2%  perf-stat.ps.cache-references
  27063288            -2.1%   26505363        perf-stat.ps.dTLB-load-misses
 6.716e+09            -2.0%   6.58e+09        perf-stat.ps.dTLB-loads
  27712662            -1.9%   27175399        perf-stat.ps.dTLB-store-misses
 5.442e+09            -1.9%  5.338e+09        perf-stat.ps.dTLB-stores
 2.065e+10            -1.9%  2.027e+10        perf-stat.ps.instructions
    914841            +2.6%     938399        perf-stat.ps.node-load-misses
   5480102            +5.0%    5754996        perf-stat.ps.node-store-misses
    169148 ±  8%     +45.2%     245649 ±  6%  perf-stat.ps.node-stores
 6.242e+12            -1.6%  6.142e+12        perf-stat.total.instructions
    481.50 ± 26%     -41.7%     280.75 ± 28%  interrupts.37:IR-PCI-MSI.1572868-edge.eth0-TxRx-3
    772.75 ± 63%     -70.0%     231.75 ± 28%  interrupts.CPU1.RES:Rescheduling_interrupts
    481.50 ± 26%     -41.7%     280.75 ± 28%  interrupts.CPU16.37:IR-PCI-MSI.1572868-edge.eth0-TxRx-3
    954.25 ± 10%     -71.8%     269.50 ± 76%  interrupts.CPU19.RES:Rescheduling_interrupts
    932.50 ± 48%     -68.4%     294.75 ± 72%  interrupts.CPU20.RES:Rescheduling_interrupts
    583.75 ± 59%     -79.5%     119.75 ± 54%  interrupts.CPU21.RES:Rescheduling_interrupts
    513.00 ± 42%    +145.8%       1261 ± 17%  interrupts.CPU22.RES:Rescheduling_interrupts
    256.25 ± 40%    +253.9%     906.75 ± 39%  interrupts.CPU24.RES:Rescheduling_interrupts
    475.25 ± 19%    +133.5%       1109 ± 41%  interrupts.CPU26.RES:Rescheduling_interrupts
    734.50 ± 36%     +99.1%       1462 ± 26%  interrupts.CPU27.RES:Rescheduling_interrupts
    905.75 ± 48%     -64.9%     318.00 ± 85%  interrupts.CPU3.RES:Rescheduling_interrupts
    363.00 ± 35%    +114.3%     777.75 ± 26%  interrupts.CPU30.RES:Rescheduling_interrupts
      6915 ± 24%     -29.1%       4904 ± 34%  interrupts.CPU37.NMI:Non-maskable_interrupts
      6915 ± 24%     -29.1%       4904 ± 34%  interrupts.CPU37.PMI:Performance_monitoring_interrupts
    436.50 ± 48%    +166.7%       1164 ± 41%  interrupts.CPU38.RES:Rescheduling_interrupts
      6950 ± 24%     -29.1%       4926 ± 34%  interrupts.CPU39.NMI:Non-maskable_interrupts
      6950 ± 24%     -29.1%       4926 ± 34%  interrupts.CPU39.PMI:Performance_monitoring_interrupts
      6906 ± 24%     -28.9%       4910 ± 35%  interrupts.CPU41.NMI:Non-maskable_interrupts
      6906 ± 24%     -28.9%       4910 ± 35%  interrupts.CPU41.PMI:Performance_monitoring_interrupts
    216.00 ± 70%     -76.6%      50.50 ± 22%  interrupts.CPU46.RES:Rescheduling_interrupts
      2607 ± 47%     +51.4%       3948 ±  8%  interrupts.CPU50.CAL:Function_call_interrupts
      3220 ± 10%     +22.4%       3940 ±  8%  interrupts.CPU51.CAL:Function_call_interrupts
      4914 ± 34%     +59.9%       7855        interrupts.CPU56.NMI:Non-maskable_interrupts
      4914 ± 34%     +59.9%       7855        interrupts.CPU56.PMI:Performance_monitoring_interrupts
      4937 ± 34%     +59.7%       7885        interrupts.CPU58.NMI:Non-maskable_interrupts
      4937 ± 34%     +59.7%       7885        interrupts.CPU58.PMI:Performance_monitoring_interrupts
      4919 ± 34%     +59.6%       7849        interrupts.CPU59.NMI:Non-maskable_interrupts
      4919 ± 34%     +59.6%       7849        interrupts.CPU59.PMI:Performance_monitoring_interrupts
      4925 ± 34%     +59.9%       7878        interrupts.CPU61.NMI:Non-maskable_interrupts
      4925 ± 34%     +59.9%       7878        interrupts.CPU61.PMI:Performance_monitoring_interrupts
      4906 ± 33%     +60.3%       7867        interrupts.CPU63.NMI:Non-maskable_interrupts
      4906 ± 33%     +60.3%       7867        interrupts.CPU63.PMI:Performance_monitoring_interrupts
    890.00 ± 75%     -82.0%     160.00 ± 46%  interrupts.CPU63.RES:Rescheduling_interrupts
    135.00 ± 52%    +911.7%       1365 ± 76%  interrupts.CPU70.RES:Rescheduling_interrupts
    110.25 ± 14%    +388.7%     538.75 ± 30%  interrupts.CPU71.RES:Rescheduling_interrupts
      3285 ±  3%     +15.4%       3791 ±  3%  interrupts.CPU73.CAL:Function_call_interrupts
    186.50 ± 60%    +274.4%     698.25 ± 77%  interrupts.CPU81.RES:Rescheduling_interrupts
      1.22 ±  2%      -0.2        1.02        perf-profile.calltrace.cycles-pp.recalc_sigpending.dequeue_signal.get_signal.do_signal.exit_to_usermode_loop
      3.95            -0.2        3.79        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64
      4.07            -0.2        3.92        perf-profile.calltrace.cycles-pp.syscall_return_via_sysret
      1.93            -0.1        1.79        perf-profile.calltrace.cycles-pp.fpu__clear.do_signal.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.66            -0.1        0.59 ±  3%  perf-profile.calltrace.cycles-pp.__set_task_blocked.__set_current_blocked.__x64_sys_rt_sigreturn.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.65 ±  2%      -0.1        0.57 ±  2%  perf-profile.calltrace.cycles-pp.recalc_sigpending.__set_task_blocked.__set_current_blocked.__x64_sys_rt_sigreturn.do_syscall_64
      0.85            -0.1        0.79 ±  2%  perf-profile.calltrace.cycles-pp.__set_current_blocked.__x64_sys_rt_sigreturn.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.03            -0.0        0.98        perf-profile.calltrace.cycles-pp.signal_setup_done.do_signal.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.81            -0.0        0.76 ±  2%  perf-profile.calltrace.cycles-pp.fpregs_mark_activate.fpu__clear.do_signal.exit_to_usermode_loop.do_syscall_64
      0.98            -0.0        0.94        perf-profile.calltrace.cycles-pp.__set_current_blocked.signal_setup_done.do_signal.exit_to_usermode_loop.do_syscall_64
      1.10            -0.0        1.07        perf-profile.calltrace.cycles-pp.copy_fpstate_to_sigframe.do_signal.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.52            +0.0        0.55 ±  3%  perf-profile.calltrace.cycles-pp.fpregs_mark_activate.__fpu__restore_sig.__x64_sys_rt_sigreturn.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.87            +0.1        1.95        perf-profile.calltrace.cycles-pp.__fpu__restore_sig.__x64_sys_rt_sigreturn.do_syscall_64.entry_SYSCALL_64_after_hwframe
     23.79            +0.3       24.06        perf-profile.calltrace.cycles-pp.__sigqueue_alloc.__send_signal.do_send_sig_info.do_send_specific.do_tkill
     24.02            +0.3       24.29        perf-profile.calltrace.cycles-pp.__send_signal.do_send_sig_info.do_send_specific.do_tkill.__x64_sys_tgkill
     89.84            +0.3       90.14        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
     89.46            +0.3       89.78        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
     36.84            +0.4       37.20        perf-profile.calltrace.cycles-pp.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
     36.43            +0.4       36.80        perf-profile.calltrace.cycles-pp.do_signal.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
     25.64            +0.4       26.09        perf-profile.calltrace.cycles-pp.__x64_sys_tgkill.do_syscall_64.entry_SYSCALL_64_after_hwframe
     25.58            +0.5       26.04        perf-profile.calltrace.cycles-pp.do_tkill.__x64_sys_tgkill.do_syscall_64.entry_SYSCALL_64_after_hwframe
     25.35            +0.5       25.82        perf-profile.calltrace.cycles-pp.do_send_specific.do_tkill.__x64_sys_tgkill.do_syscall_64.entry_SYSCALL_64_after_hwframe
     24.66            +0.5       25.18        perf-profile.calltrace.cycles-pp.do_send_sig_info.do_send_specific.do_tkill.__x64_sys_tgkill.do_syscall_64
     30.40            +0.6       30.97        perf-profile.calltrace.cycles-pp.dequeue_signal.get_signal.do_signal.exit_to_usermode_loop.do_syscall_64
     31.58            +0.6       32.18        perf-profile.calltrace.cycles-pp.get_signal.do_signal.exit_to_usermode_loop.do_syscall_64.entry_SYSCALL_64_after_hwframe
     29.13            +0.8       29.91        perf-profile.calltrace.cycles-pp.__dequeue_signal.dequeue_signal.get_signal.do_signal.exit_to_usermode_loop
     28.90            +0.8       29.68        perf-profile.calltrace.cycles-pp.__sigqueue_free.__dequeue_signal.dequeue_signal.get_signal.do_signal
      3.46            -0.2        3.21 ±  2%  perf-profile.children.cycles-pp.recalc_sigpending
      3.95            -0.2        3.79        perf-profile.children.cycles-pp.entry_SYSCALL_64
      4.42            -0.2        4.26        perf-profile.children.cycles-pp.syscall_return_via_sysret
      1.93            -0.1        1.80 ±  2%  perf-profile.children.cycles-pp.fpu__clear
      3.62            -0.1        3.54        perf-profile.children.cycles-pp.__set_current_blocked
      0.27            -0.1        0.21 ±  3%  perf-profile.children.cycles-pp.fpregs_assert_state_consistent
      0.84            -0.0        0.79 ±  2%  perf-profile.children.cycles-pp._copy_from_user
      1.03            -0.0        0.99        perf-profile.children.cycles-pp.signal_setup_done
      0.34            -0.0        0.30 ±  5%  perf-profile.children.cycles-pp.restore_altstack
      0.73            -0.0        0.70        perf-profile.children.cycles-pp.__might_fault
      1.11            -0.0        1.08        perf-profile.children.cycles-pp.copy_fpstate_to_sigframe
      0.37 ±  2%      -0.0        0.35        perf-profile.children.cycles-pp.___might_sleep
      0.27            -0.0        0.26        perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
      1.89            +0.1        1.96        perf-profile.children.cycles-pp.__fpu__restore_sig
      0.29 ±  7%      +0.2        0.53 ±  6%  perf-profile.children.cycles-pp.__lock_task_sighand
      0.29 ±  7%      +0.2        0.53 ±  5%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
     24.03            +0.3       24.29        perf-profile.children.cycles-pp.__send_signal
     23.80            +0.3       24.06        perf-profile.children.cycles-pp.__sigqueue_alloc
     90.00            +0.3       90.30        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     89.60            +0.3       89.92        perf-profile.children.cycles-pp.do_syscall_64
     36.86            +0.4       37.22        perf-profile.children.cycles-pp.exit_to_usermode_loop
     36.45            +0.4       36.81        perf-profile.children.cycles-pp.do_signal
     25.65            +0.4       26.10        perf-profile.children.cycles-pp.__x64_sys_tgkill
     25.59            +0.5       26.04        perf-profile.children.cycles-pp.do_tkill
     25.36            +0.5       25.82        perf-profile.children.cycles-pp.do_send_specific
     24.67            +0.5       25.19        perf-profile.children.cycles-pp.do_send_sig_info
     30.41            +0.6       30.98        perf-profile.children.cycles-pp.dequeue_signal
     31.60            +0.6       32.20        perf-profile.children.cycles-pp.get_signal
     29.14            +0.8       29.92        perf-profile.children.cycles-pp.__dequeue_signal
     28.90            +0.8       29.69        perf-profile.children.cycles-pp.__sigqueue_free
     19.11            -0.4       18.75        perf-profile.self.cycles-pp.do_syscall_64
      2.58            -0.2        2.34        perf-profile.self.cycles-pp.recalc_sigpending
      3.95            -0.2        3.79        perf-profile.self.cycles-pp.entry_SYSCALL_64
      4.41            -0.2        4.25        perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.95            -0.1        0.86 ±  2%  perf-profile.self.cycles-pp.fpu__clear
      0.25            -0.1        0.19 ±  2%  perf-profile.self.cycles-pp.fpregs_assert_state_consistent
      0.15 ±  2%      -0.0        0.12 ±  6%  perf-profile.self.cycles-pp._copy_from_user
      0.74            -0.0        0.71        perf-profile.self.cycles-pp.copy_fpstate_to_sigframe
      0.34            -0.0        0.31        perf-profile.self.cycles-pp.__x64_sys_rt_sigprocmask
      0.46 ±  2%      -0.0        0.44 ±  2%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.36 ±  3%      -0.0        0.34        perf-profile.self.cycles-pp.___might_sleep
      0.26            -0.0        0.24 ±  2%  perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
      1.10            +0.0        1.15        perf-profile.self.cycles-pp.__fpu__restore_sig
      0.28 ±  6%      +0.2        0.53 ±  5%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
     23.71            +0.3       23.96        perf-profile.self.cycles-pp.__sigqueue_alloc
     12.65            +0.6       13.24        perf-profile.self.cycles-pp.__sigqueue_free


                                                                                
                            will-it-scale.per_process_ops                       
                                                                                
  52000 +-+-----------------------------------------------------------------+   
        |..                                                                 |   
  51000 +-++.+..+.+                                                         |   
  50000 +-+        :                                                        |   
        |           :                                                       |   
  49000 +-+         :                                                       |   
        |            +..+.     .+.+..+.+..+..+.+..+.+..+..                  |   
  48000 +-+               +..+.                           +.+..+..+.+..+.+..|   
        |                            O    O  O O    O  O                    |   
  47000 +-+                            O          O       O O  O  O         |   
  46000 +-+                                                                 |   
        |                                                                   |   
  45000 +-+          O  O                                                   |   
        O  O O  O O       O  O  O O                                         |   
  44000 +-+-----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.4.0-00001-gb77491648e6eb" of type "text/plain" (200664 bytes)

View attachment "job-script" of type "text/plain" (7883 bytes)

View attachment "job.yaml" of type "text/plain" (5468 bytes)

View attachment "reproduce" of type "text/plain" (311 bytes)
