Message-ID: <20200630092313.GD5535@shao2-debian>
Date:   Tue, 30 Jun 2020 17:23:13 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Stephane Eranian <eranian@...gle.com>
Cc:     Ingo Molnar <mingo@...nel.org>,
        Kim Phillips <kim.phillips@....com>,
        Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: [perf/x86/rapl] 16accae3d9: unixbench.score -4.1% regression

Greetings,

FYI, we noticed a -4.1% regression of unixbench.score due to commit:


commit: 16accae3d97f97d7f61c4ee5d0002bccdef59088 ("perf/x86/rapl: Fix RAPL config variable bug")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: unixbench
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with the following parameters:

	runtime: 300s
	nr_task: 30%
	test: context1
	cpufreq_governor: performance
	ucode: 0x5002f01

test-description: UnixBench is the original BYTE UNIX benchmark suite; it aims to test the performance of Unix-like systems.
test-url: https://github.com/kdlucas/byte-unixbench



If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>


Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-7.6/30%/debian-x86_64-20191114.cgz/300s/lkp-csl-2ap2/context1/unixbench/0x5002f01

commit: 
  4e909124f8 (" Clean up various aspects of the vDSO code, no change in")
  16accae3d9 ("perf/x86/rapl: Fix RAPL config variable bug")

4e909124f8ed54b1 16accae3d97f97d7f61c4ee5d00 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      2667            -4.1%       2558        unixbench.score
    102.78            -3.6%      99.07        unixbench.time.user_time
   3.1e+08            -4.3%  2.967e+08        unixbench.time.voluntary_context_switches
 4.177e+08            -4.1%  4.004e+08        unixbench.workload
      0.00 ± 44%      -0.0        0.00 ± 89%  mpstat.cpu.all.soft%
     13706 ±153%     -97.3%     374.50 ± 32%  softirqs.CPU13.NET_RX
   3146596            -4.3%    3012265        vmstat.system.cs
  4.33e+08           -12.8%  3.777e+08        cpuidle.C1.usage
 2.488e+08 ±  4%     +17.9%  2.933e+08 ±  9%  cpuidle.C1E.usage
     25592            +1.4%      25962        proc-vmstat.nr_slab_reclaimable
     73992            +1.7%      75243        proc-vmstat.nr_slab_unreclaimable
      1502 ±  8%      -8.7%       1371 ±  5%  sched_debug.cfs_rq:/.runnable_avg.max
   1197650 ± 18%     +64.0%    1964026 ± 17%  sched_debug.cfs_rq:/.spread0.avg
   3114959 ± 13%     +36.4%    4247863 ±  7%  sched_debug.cfs_rq:/.spread0.max
      2131 ±  4%     +18.6%       2529 ±  2%  slabinfo.UNIX.active_objs
      2131 ±  4%     +18.6%       2529 ±  2%  slabinfo.UNIX.num_objs
      3881 ±  4%     +13.2%       4394 ±  3%  slabinfo.sock_inode_cache.active_objs
      3881 ±  4%     +13.2%       4394 ±  3%  slabinfo.sock_inode_cache.num_objs
      1256 ±  7%     +14.7%       1441 ±  2%  slabinfo.task_group.active_objs
      1256 ±  7%     +14.7%       1441 ±  2%  slabinfo.task_group.num_objs
     82214 ± 10%     -12.3%      72134 ± 11%  numa-vmstat.node0.nr_unevictable
     82214 ± 10%     -12.3%      72134 ± 11%  numa-vmstat.node0.nr_zone_unevictable
    841.25 ±100%    +295.6%       3327 ± 35%  numa-vmstat.node2.nr_inactive_anon
      1064 ± 80%    +243.0%       3652 ± 30%  numa-vmstat.node2.nr_shmem
      4706 ± 33%     +56.4%       7358 ± 21%  numa-vmstat.node2.nr_slab_reclaimable
     16156 ± 10%     +27.0%      20519 ±  8%  numa-vmstat.node2.nr_slab_unreclaimable
    841.25 ±100%    +295.6%       3327 ± 35%  numa-vmstat.node2.nr_zone_inactive_anon
    328859 ± 10%     -12.3%     288537 ± 11%  numa-meminfo.node0.Unevictable
      3432 ±101%    +287.9%      13313 ± 35%  numa-meminfo.node2.Inactive
      3366 ±100%    +295.4%      13312 ± 35%  numa-meminfo.node2.Inactive(anon)
     18824 ± 33%     +56.4%      29438 ± 21%  numa-meminfo.node2.KReclaimable
     18824 ± 33%     +56.4%      29438 ± 21%  numa-meminfo.node2.SReclaimable
     64624 ± 10%     +27.0%      82084 ±  8%  numa-meminfo.node2.SUnreclaim
      4259 ± 80%    +243.0%      14610 ± 30%  numa-meminfo.node2.Shmem
     83450 ± 15%     +33.6%     111523 ±  7%  numa-meminfo.node2.Slab
     24989 ±153%     -97.6%     596.50 ± 36%  interrupts.34:PCI-MSI.524292-edge.eth0-TxRx-3
     85471 ±  2%     -15.6%      72153 ±  9%  interrupts.CPU0.RES:Rescheduling_interrupts
    110378 ±  4%      -6.9%     102726 ±  6%  interrupts.CPU103.RES:Rescheduling_interrupts
     53.75 ±117%    +514.4%     330.25 ± 59%  interrupts.CPU121.TLB:TLB_shootdowns
     24989 ±153%     -97.6%     596.50 ± 36%  interrupts.CPU13.34:PCI-MSI.524292-edge.eth0-TxRx-3
     16.25 ± 74%   +1889.2%     323.25 ± 88%  interrupts.CPU133.TLB:TLB_shootdowns
     48.00 ±105%    +597.4%     334.75 ±105%  interrupts.CPU136.TLB:TLB_shootdowns
    104.00 ±110%    +127.2%     236.25 ± 38%  interrupts.CPU139.TLB:TLB_shootdowns
     17.00 ± 34%    +525.0%     106.25 ±123%  interrupts.CPU143.TLB:TLB_shootdowns
     98102 ±  4%      +6.1%     104055 ±  3%  interrupts.CPU150.RES:Rescheduling_interrupts
     90645 ±  4%      +9.8%      99492 ±  4%  interrupts.CPU158.RES:Rescheduling_interrupts
     88524 ±  3%      +8.5%      96054 ±  4%  interrupts.CPU162.RES:Rescheduling_interrupts
     80176 ±  4%     +12.5%      90225 ±  6%  interrupts.CPU167.RES:Rescheduling_interrupts
    125.00 ± 60%    +255.2%     444.00 ± 31%  interrupts.CPU171.TLB:TLB_shootdowns
      2638 ± 21%     +39.9%       3692 ± 15%  interrupts.CPU172.NMI:Non-maskable_interrupts
      2638 ± 21%     +39.9%       3692 ± 15%  interrupts.CPU172.PMI:Performance_monitoring_interrupts
      2689 ± 29%     +40.6%       3782 ±  3%  interrupts.CPU179.NMI:Non-maskable_interrupts
      2689 ± 29%     +40.6%       3782 ±  3%  interrupts.CPU179.PMI:Performance_monitoring_interrupts
     21.75 ± 33%    +452.9%     120.25 ± 98%  interrupts.CPU179.TLB:TLB_shootdowns
      2663 ± 17%     +36.9%       3644 ±  4%  interrupts.CPU180.NMI:Non-maskable_interrupts
      2663 ± 17%     +36.9%       3644 ±  4%  interrupts.CPU180.PMI:Performance_monitoring_interrupts
      3154 ±  6%     -32.8%       2120 ± 22%  interrupts.CPU27.NMI:Non-maskable_interrupts
      3154 ±  6%     -32.8%       2120 ± 22%  interrupts.CPU27.PMI:Performance_monitoring_interrupts
      3229 ±  6%     -31.3%       2217 ± 30%  interrupts.CPU28.NMI:Non-maskable_interrupts
      3229 ±  6%     -31.3%       2217 ± 30%  interrupts.CPU28.PMI:Performance_monitoring_interrupts
      3393           -33.2%       2268 ± 29%  interrupts.CPU29.NMI:Non-maskable_interrupts
      3393           -33.2%       2268 ± 29%  interrupts.CPU29.PMI:Performance_monitoring_interrupts
    120.00 ±110%    +202.1%     362.50 ± 79%  interrupts.CPU30.TLB:TLB_shootdowns
      3446 ±  7%     -31.4%       2364 ± 32%  interrupts.CPU32.NMI:Non-maskable_interrupts
      3446 ±  7%     -31.4%       2364 ± 32%  interrupts.CPU32.PMI:Performance_monitoring_interrupts
     36.00 ± 75%    +440.3%     194.50 ± 93%  interrupts.CPU33.TLB:TLB_shootdowns
    219.00 ± 61%    +182.4%     618.50 ± 36%  interrupts.CPU35.TLB:TLB_shootdowns
      3418 ±  6%     -39.0%       2084 ± 25%  interrupts.CPU39.NMI:Non-maskable_interrupts
      3418 ±  6%     -39.0%       2084 ± 25%  interrupts.CPU39.PMI:Performance_monitoring_interrupts
     32.75 ± 83%    +399.2%     163.50 ± 75%  interrupts.CPU40.TLB:TLB_shootdowns
    112.50 ±136%    +279.3%     426.75 ± 52%  interrupts.CPU41.TLB:TLB_shootdowns
    683.75 ± 32%    +128.4%       1561 ± 38%  interrupts.CPU50.TLB:TLB_shootdowns
    534.50 ± 43%     +96.5%       1050 ± 30%  interrupts.CPU51.TLB:TLB_shootdowns
     67.50 ± 82%    +265.9%     247.00 ± 46%  interrupts.CPU70.TLB:TLB_shootdowns
    110.00 ± 74%    +326.8%     469.50 ± 31%  interrupts.CPU78.TLB:TLB_shootdowns
     98633 ±  4%     -13.4%      85388 ±  3%  interrupts.CPU79.RES:Rescheduling_interrupts
     40.87            -0.7       40.20        perf-profile.calltrace.cycles-pp.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     40.77            -0.7       40.10        perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry.start_secondary
     43.79            -0.7       43.12        perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
     43.79            -0.7       43.13        perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
     43.77            -0.7       43.11        perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
     44.05            -0.6       43.42        perf-profile.calltrace.cycles-pp.secondary_startup_64
      1.45 ±  7%      -0.2        1.30 ±  4%  perf-profile.calltrace.cycles-pp.__irqentry_text_start.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
      1.09 ±  2%      -0.1        1.04 ±  2%  perf-profile.calltrace.cycles-pp.stack_trace_save_tsk.__account_scheduler_latency.update_stats_enqueue_sleeper.enqueue_entity.enqueue_task_fair
      0.60 ±  2%      -0.0        0.56 ±  2%  perf-profile.calltrace.cycles-pp.unwind_next_frame.arch_stack_walk.stack_trace_save_tsk.__account_scheduler_latency.update_stats_enqueue_sleeper
      0.50            +0.0        0.54 ±  4%  perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__sched_text_start.schedule.pipe_read
     47.21            +0.6       47.84        perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__account_scheduler_latency.update_stats_enqueue_sleeper.enqueue_entity.enqueue_task_fair
     46.71            +0.6       47.35        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.__account_scheduler_latency.update_stats_enqueue_sleeper.enqueue_entity
     43.79            -0.7       43.13        perf-profile.children.cycles-pp.start_secondary
     41.12            -0.6       40.48        perf-profile.children.cycles-pp.cpuidle_enter
     41.11            -0.6       40.47        perf-profile.children.cycles-pp.cpuidle_enter_state
     44.05            -0.6       43.42        perf-profile.children.cycles-pp.do_idle
     44.05            -0.6       43.42        perf-profile.children.cycles-pp.secondary_startup_64
     44.05            -0.6       43.42        perf-profile.children.cycles-pp.cpu_startup_entry
      0.10 ± 16%      -0.1        0.03 ±100%  perf-profile.children.cycles-pp.tick_irq_enter
      1.36            -0.1        1.30        perf-profile.children.cycles-pp.stack_trace_save_tsk
      1.17            -0.1        1.12        perf-profile.children.cycles-pp.arch_stack_walk
      0.77            -0.1        0.72        perf-profile.children.cycles-pp.unwind_next_frame
      0.12 ± 17%      -0.0        0.07 ± 21%  perf-profile.children.cycles-pp.irq_enter
      0.28            -0.0        0.25        perf-profile.children.cycles-pp.select_task_rq_fair
      0.15 ±  4%      -0.0        0.13 ±  5%  perf-profile.children.cycles-pp.sched_clock_cpu
      0.12            -0.0        0.11 ±  4%  perf-profile.children.cycles-pp.update_ts_time_stats
      0.10 ±  5%      +0.0        0.11        perf-profile.children.cycles-pp.stack_trace_consume_entry_nosched
      0.21 ±  3%      +0.0        0.23 ±  2%  perf-profile.children.cycles-pp.__switch_to
      0.22            +0.0        0.24 ±  2%  perf-profile.children.cycles-pp.reweight_entity
      0.03 ±100%      +0.0        0.06        perf-profile.children.cycles-pp.kill_fasync
      0.07 ±  6%      +0.0        0.11 ±  4%  perf-profile.children.cycles-pp._raw_spin_trylock
      0.10 ±  5%      +0.0        0.14        perf-profile.children.cycles-pp.rebalance_domains
      0.38 ±  3%      +0.1        0.43 ±  3%  perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
     47.61            +0.6       48.23        perf-profile.children.cycles-pp._raw_spin_lock_irqsave
     47.40            +0.6       48.05        perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      0.49            -0.0        0.45 ±  3%  perf-profile.self.cycles-pp.unwind_next_frame
      0.22 ±  4%      -0.0        0.19 ± 12%  perf-profile.self.cycles-pp.ktime_get
      0.35            -0.0        0.34 ±  2%  perf-profile.self.cycles-pp.set_next_entity
      0.22            +0.0        0.24        perf-profile.self.cycles-pp.reweight_entity
      0.04 ± 57%      +0.0        0.07 ±  6%  perf-profile.self.cycles-pp.stack_trace_consume_entry_nosched
      0.07 ±  6%      +0.0        0.11 ±  4%  perf-profile.self.cycles-pp._raw_spin_trylock
      0.01 ±173%      +0.0        0.06        perf-profile.self.cycles-pp.kill_fasync
     47.40            +0.6       48.05        perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
 7.523e+09            -2.8%  7.309e+09        perf-stat.i.branch-instructions
      6.21 ± 28%      +2.1        8.35 ± 20%  perf-stat.i.cache-miss-rate%
 5.942e+08 ±  2%      -6.0%  5.587e+08 ±  2%  perf-stat.i.cache-references
   3162168            -4.3%    3025917        perf-stat.i.context-switches
 1.546e+11            -1.0%  1.531e+11        perf-stat.i.cpu-cycles
    984728 ± 44%     -54.8%     444929 ± 90%  perf-stat.i.dTLB-load-misses
 8.643e+09            -3.0%  8.384e+09        perf-stat.i.dTLB-loads
    142857 ± 45%     -55.8%      63208 ± 98%  perf-stat.i.dTLB-store-misses
 3.804e+09            -4.1%  3.647e+09        perf-stat.i.dTLB-stores
  44849258            +2.8%   46101148        perf-stat.i.iTLB-load-misses
  23588895            -4.2%   22608828        perf-stat.i.iTLB-loads
 3.325e+10            -3.0%  3.226e+10        perf-stat.i.instructions
    785.28 ±  2%      -9.5%     710.79        perf-stat.i.instructions-per-iTLB-miss
      0.81            -1.0%       0.80        perf-stat.i.metric.GHz
      1.09 ±  3%     -35.7%       0.70 ±  4%  perf-stat.i.metric.K/sec
    107.30            -3.2%     103.83        perf-stat.i.metric.M/sec
     96.69            +1.4       98.10        perf-stat.i.node-load-miss-rate%
   5946250           +10.6%    6577000        perf-stat.i.node-load-misses
    114205 ±  2%     -54.1%      52372 ±  4%  perf-stat.i.node-loads
   5431765            -2.9%    5274750        perf-stat.i.node-store-misses
     36683 ±  7%     -22.3%      28520 ± 13%  perf-stat.i.node-stores
     17.87 ±  2%      -3.1%      17.31 ±  2%  perf-stat.overall.MPKI
      5.52 ±  2%      +0.3        5.84 ±  2%  perf-stat.overall.cache-miss-rate%
      4.65            +2.0%       4.74        perf-stat.overall.cpi
      0.01 ± 44%      -0.0        0.01 ± 90%  perf-stat.overall.dTLB-load-miss-rate%
     65.53            +1.6       67.09        perf-stat.overall.iTLB-load-miss-rate%
    741.35            -5.6%     699.89        perf-stat.overall.instructions-per-iTLB-miss
      0.22            -2.0%       0.21        perf-stat.overall.ipc
     98.11            +1.1       99.21        perf-stat.overall.node-load-miss-rate%
     31153            +1.3%      31555        perf-stat.overall.path-length
 7.501e+09            -2.8%   7.29e+09        perf-stat.ps.branch-instructions
 5.925e+08 ±  2%      -6.0%  5.572e+08 ±  2%  perf-stat.ps.cache-references
   3152559            -4.3%    3018237        perf-stat.ps.context-switches
 1.541e+11            -0.9%  1.527e+11        perf-stat.ps.cpu-cycles
    984014 ± 44%     -54.8%     444656 ± 90%  perf-stat.ps.dTLB-load-misses
 8.617e+09            -3.0%  8.362e+09        perf-stat.ps.dTLB-loads
    142770 ± 45%     -55.8%      63169 ± 98%  perf-stat.ps.dTLB-store-misses
 3.792e+09            -4.1%  3.638e+09        perf-stat.ps.dTLB-stores
  44712199            +2.8%   45980702        perf-stat.ps.iTLB-load-misses
  23519911            -4.1%   22554062        perf-stat.ps.iTLB-loads
 3.315e+10            -2.9%  3.218e+10        perf-stat.ps.instructions
   5927959           +10.7%    6559697        perf-stat.ps.node-load-misses
    113911 ±  2%     -54.1%      52287 ±  4%  perf-stat.ps.node-loads
   5415049            -2.8%    5260809        perf-stat.ps.node-store-misses
     36616 ±  7%     -22.3%      28468 ± 13%  perf-stat.ps.node-stores
 1.301e+13            -2.9%  1.264e+13        perf-stat.total.instructions
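
For readers checking the %change column by hand, here is a minimal sketch, assuming the column is the relative difference between the parent commit (left) and the tested commit (right). The helper names pct_change and format_changes are hypothetical and are not part of lkp-tests; the values are copied from the table above.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# (parent 4e909124f8, commit 16accae3d9) pairs copied from the table above.
ROWS = {
    "unixbench.score": (2667, 2558),
    "unixbench.time.user_time": (102.78, 99.07),
    "unixbench.workload": (4.177e8, 4.004e8),
}


def format_changes(rows):
    """Format each row's change the way the table prints it, e.g. '-4.1%'."""
    return {name: f"{pct_change(a, b):+.1f}%" for name, (a, b) in rows.items()}


if __name__ == "__main__":
    for name, change in format_changes(ROWS).items():
        print(f"{change:>8}  {name}")
```

Rows whose values are already percentages (the perf-profile.* entries and the *-rate% entries) appear to report an absolute difference in percentage points instead: for example 40.87 vs 40.20 is shown as -0.7, not as a relative change.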


                                                                                
                                  unixbench.score                               
                                                                                
  2700 +--------------------------------------------------------------------+   
       |.+..                                                      .+.+.+..  |   
  2650 |-+  +.+.+..+.+.+.+.. .+.    .+.+..+.+.+.+..+.           .+        +.|   
       |                    +   +..+                 +.  .+.+.+.            |   
       |                                               +.                   |   
  2600 |-+                                                                  |   
       |                        O  O      O O                               |   
  2550 |-+                  O O      O O                                    |   
       |                                                                    |   
  2500 |-+                                                                  |   
       |                 O                                                  |   
       | O  O   O  O O O                                                    |   
  2450 |-+    O                                                             |   
       |                                                                    |   
  2400 +--------------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.7.0-00916-g16accae3d97f9" of type "text/plain" (202910 bytes)

View attachment "job-script" of type "text/plain" (7468 bytes)

View attachment "job.yaml" of type "text/plain" (5014 bytes)

View attachment "reproduce" of type "text/plain" (294 bytes)
