Date:   Mon, 2 Mar 2020 09:14:27 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Alex Kogan <alex.kogan@...cle.com>
Cc:     linux@...linux.org.uk, peterz@...radead.org, mingo@...hat.com,
        will.deacon@....com, arnd@...db.de, longman@...hat.com,
        linux-arch@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        linux-kernel@...r.kernel.org, tglx@...utronix.de, bp@...en8.de,
        hpa@...or.com, x86@...nel.org, guohanjun@...wei.com,
        jglauber@...vell.com, steven.sistare@...cle.com,
        daniel.m.jordan@...cle.com, alex.kogan@...cle.com,
        dave.dice@...cle.com, lkp@...ts.01.org
Subject: [locking/qspinlock] 7b6da71157: unixbench.score 8.4% improvement

Greetings,

FYI, we noticed an 8.4% improvement of unixbench.score due to commit:


commit: 7b6da7115786ee28ad82638a5dcb2ec1ffda0e96 ("[PATCH v9 5/5] locking/qspinlock: Introduce the shuffle reduction optimization into CNA")
url: https://github.com/0day-ci/linux/commits/Alex-Kogan/Add-NUMA-awareness-to-qspinlock/20200116-161727


in testcase: unixbench
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with following parameters:

	runtime: 300s
	nr_task: 30%
	test: context1
	cpufreq_governor: performance
	ucode: 0x500002c

test-description: UnixBench is the original BYTE UNIX benchmark suite, which aims to test the performance of Unix-like systems.
test-url: https://github.com/kdlucas/byte-unixbench

In addition to that, the commit also has a significant impact on the following tests:



Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase/ucode:
  gcc-7/performance/x86_64-rhel-7.6/30%/debian-x86_64-20191114.cgz/300s/lkp-csl-2ap3/context1/unixbench/0x500002c
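
The two lines above pair up position by position: the first names the test axes and the second gives their values, both separated by "/". A minimal sketch of that pairing follows, assuming no value contains a "/" itself (true for this report, not guaranteed in general). The helper name parse_axes is hypothetical and not part of lkp-tests.

```python
# Key and value lines copied verbatim from the report above.
KEYS = ("compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/"
        "tbox_group/test/testcase/ucode:")
VALUES = ("  gcc-7/performance/x86_64-rhel-7.6/30%/"
          "debian-x86_64-20191114.cgz/300s/lkp-csl-2ap3/context1/"
          "unixbench/0x500002c")


def parse_axes(key_line, value_line):
    """Pair each axis name with its value (hypothetical helper).

    Assumes "/" never appears inside a single value.
    """
    keys = key_line.strip().rstrip(":").split("/")
    values = value_line.strip().split("/")
    if len(keys) != len(values):
        raise ValueError(
            f"{len(keys)} axis names but {len(values)} values")
    return dict(zip(keys, values))


axes = parse_axes(KEYS, VALUES)
for name, value in axes.items():
    print(f"{name:18} {value}")
```

Read this way, the run used gcc-7, the performance governor, nr_task at 30%, and the context1 test on tbox_group lkp-csl-2ap3.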

commit: 
  dfce1eb694 ("locking/qspinlock: Introduce starvation avoidance into CNA")
  7b6da71157 ("locking/qspinlock: Introduce the shuffle reduction optimization into CNA")

dfce1eb694321530 7b6da7115786ee28ad82638a5dc 
---------------- --------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
          1:4          -25%            :4     kmsg.ipmi_si_dmi-ipmi-si.#:IRQ_index#not_found
           :4           25%           1:4     dmesg.WARNING:at#for_ip_swapgs_restore_regs_and_return_to_usermode/0x
           :4           50%           2:4     dmesg.WARNING:stack_recursion
         %stddev     %change         %stddev
             \          |                \  
      2659            +8.4%       2883        unixbench.score
      4016            +1.9%       4092        unixbench.time.percent_of_cpu_this_job_got
     15666            +1.6%      15923        unixbench.time.system_time
    109.72 ±  2%     +10.1%     120.86 ±  2%  unixbench.time.user_time
 3.053e+08            +9.3%  3.336e+08        unixbench.time.voluntary_context_switches
 4.175e+08            +8.2%  4.515e+08        unixbench.workload
    111361 ±  3%      -7.8%     102620        softirqs.CPU89.SCHED
   3085234            +9.7%    3384659        vmstat.system.cs
     35.88 ±  8%     -12.8%      31.28 ±  4%  boot-time.boot
     28.29 ±  8%     -15.4%      23.92        boot-time.dhcp
      5943 ±  9%     -14.6%       5073 ±  4%  boot-time.idle
      5259 ± 78%     -80.3%       1035 ± 36%  numa-meminfo.node2.Inactive
      5132 ± 80%     -79.8%       1035 ± 36%  numa-meminfo.node2.Inactive(anon)
      6664 ± 66%     -71.7%       1883 ± 54%  numa-meminfo.node2.Shmem
      1282 ± 80%     -79.8%     258.50 ± 36%  numa-vmstat.node2.nr_inactive_anon
      1665 ± 66%     -71.8%     470.50 ± 54%  numa-vmstat.node2.nr_shmem
      1282 ± 80%     -79.8%     258.50 ± 36%  numa-vmstat.node2.nr_zone_inactive_anon
  2.05e+08 ±  2%     +28.2%  2.628e+08 ±  2%  turbostat.C1
      5.52 ±  5%      +2.5        7.99        turbostat.C1%
     16.59 ± 41%     +12.8       29.43        turbostat.C1E%
  17973386 ± 22%     -98.8%     211437 ± 84%  turbostat.C6
     15.45 ± 46%     -15.2        0.26 ± 96%  turbostat.C6%
  8.15e+09 ±  5%     +44.4%  1.177e+10        cpuidle.C1.time
 4.037e+08 ±  4%     +27.2%  5.134e+08        cpuidle.C1.usage
 2.454e+10 ± 40%     +77.0%  4.343e+10        cpuidle.C1E.time
 2.308e+10 ± 46%     -97.7%  5.325e+08 ± 66%  cpuidle.C6.time
  35252918 ± 22%     -98.8%     436570 ± 71%  cpuidle.C6.usage
 1.829e+08 ± 10%     -46.2%   98462878        cpuidle.POLL.time
  14828999 ± 20%     -42.2%    8563858 ±  4%  cpuidle.POLL.usage
    309808 ± 14%     -22.3%     240665 ± 13%  sched_debug.cfs_rq:/.load.max
     33894 ± 12%     -28.1%      24377 ± 10%  sched_debug.cfs_rq:/.load.stddev
      0.45 ±  7%     +13.5%       0.51 ±  2%  sched_debug.cfs_rq:/.nr_running.avg
      0.33 ± 69%     -55.3%       0.15 ± 14%  sched_debug.cfs_rq:/.nr_spread_over.avg
    309558 ± 14%     -22.7%     239243 ± 12%  sched_debug.cfs_rq:/.runnable_weight.max
     33905 ± 12%     -28.5%      24244 ± 10%  sched_debug.cfs_rq:/.runnable_weight.stddev
    419.84 ±  8%     +14.8%     482.01        sched_debug.cfs_rq:/.util_est_enqueued.avg
    521161 ±  3%     -14.9%     443373 ±  5%  sched_debug.cpu.avg_idle.avg
   1410729 ± 15%     -27.3%    1025358        sched_debug.cpu.avg_idle.max
    740422 ± 17%     -30.7%     513046        sched_debug.cpu.max_idle_balance_cost.max
     35777 ± 50%     -91.3%       3128 ± 62%  sched_debug.cpu.max_idle_balance_cost.stddev
   3760337 ±  3%     +11.4%    4188170 ±  2%  sched_debug.cpu.nr_switches.max
   3756863 ±  3%     +11.4%    4184620 ±  2%  sched_debug.cpu.sched_count.max
   1877767 ±  3%     +11.4%    2091800 ±  2%  sched_debug.cpu.sched_goidle.max
   1879095 ±  3%     +11.4%    2093403 ±  2%  sched_debug.cpu.ttwu_count.max
      0.79 ±  8%      -0.2        0.55        perf-profile.calltrace.cycles-pp.__cna_queued_spin_lock_slowpath._raw_spin_lock.scheduler_tick.update_process_times.tick_sched_handle
      0.80 ±  8%      -0.2        0.57        perf-profile.calltrace.cycles-pp._raw_spin_lock.scheduler_tick.update_process_times.tick_sched_handle.tick_sched_timer
      0.84 ±  8%      -0.2        0.62        perf-profile.calltrace.cycles-pp.scheduler_tick.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues
      1.09 ±  4%      -0.2        0.90 ±  4%  perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state
      0.92 ±  6%      -0.2        0.72 ±  4%  perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt
      0.92 ±  5%      -0.2        0.73 ±  4%  perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt
      0.96 ±  7%      -0.2        0.79 ±  5%  perf-profile.calltrace.cycles-pp.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
      0.43 ±  4%      -0.4        0.05        perf-profile.children.cycles-pp.cna_scan_main_queue
      1.02 ±  8%      -0.2        0.81        perf-profile.children.cycles-pp._raw_spin_lock
      0.91 ±  7%      -0.2        0.70        perf-profile.children.cycles-pp.scheduler_tick
      1.22 ±  3%      -0.2        1.03 ±  3%  perf-profile.children.cycles-pp.__hrtimer_run_queues
      1.01 ±  5%      -0.2        0.82 ±  3%  perf-profile.children.cycles-pp.tick_sched_handle
      1.00 ±  4%      -0.2        0.82 ±  4%  perf-profile.children.cycles-pp.update_process_times
      0.47 ± 19%      -0.2        0.30 ± 25%  perf-profile.children.cycles-pp.poll_idle
      1.06 ±  6%      -0.2        0.90 ±  4%  perf-profile.children.cycles-pp.tick_sched_timer
      0.08            +0.0        0.09        perf-profile.children.cycles-pp.cpus_share_cache
      0.14 ±  5%      +0.0        0.15 ±  3%  perf-profile.children.cycles-pp.sched_clock
      0.06 ± 11%      +0.0        0.08 ±  6%  perf-profile.children.cycles-pp.fsnotify
      0.14 ±  3%      +0.0        0.15 ±  2%  perf-profile.children.cycles-pp.update_cfs_group
      0.14 ±  6%      +0.0        0.15 ±  3%  perf-profile.children.cycles-pp.switch_mm_irqs_off
      0.11 ±  6%      +0.0        0.13 ±  6%  perf-profile.children.cycles-pp.nr_iowait_cpu
      0.14 ±  3%      +0.0        0.16 ±  5%  perf-profile.children.cycles-pp.select_idle_sibling
      0.11 ±  7%      +0.0        0.13 ±  3%  perf-profile.children.cycles-pp.update_ts_time_stats
      0.18 ±  7%      +0.0        0.21 ±  4%  perf-profile.children.cycles-pp.tick_nohz_idle_exit
      0.21 ±  7%      +0.0        0.24        perf-profile.children.cycles-pp.mutex_unlock
      0.24 ±  5%      +0.0        0.27        perf-profile.children.cycles-pp.update_rq_clock
      0.26 ±  6%      +0.0        0.29 ±  3%  perf-profile.children.cycles-pp.select_task_rq_fair
      0.35 ±  5%      +0.0        0.39 ±  3%  perf-profile.children.cycles-pp._raw_spin_unlock_irqrestore
      0.07 ± 17%      +0.0        0.11 ± 14%  perf-profile.children.cycles-pp._raw_spin_trylock
      0.38 ±  2%      +0.0        0.41 ±  3%  perf-profile.children.cycles-pp.mutex_lock
      0.16 ±  6%      +0.1        0.22 ±  3%  perf-profile.children.cycles-pp.reweight_entity
      0.19 ± 30%      +0.1        0.28 ± 10%  perf-profile.children.cycles-pp.clockevents_program_event
      0.42 ± 21%      +0.1        0.55 ±  6%  perf-profile.children.cycles-pp.ktime_get
      0.42 ±  5%      -0.4        0.05        perf-profile.self.cycles-pp.cna_scan_main_queue
      0.41 ± 21%      -0.2        0.23 ± 31%  perf-profile.self.cycles-pp.poll_idle
      0.12 ±  3%      +0.0        0.14 ±  3%  perf-profile.self.cycles-pp.__wake_up_common
      0.06 ± 11%      +0.0        0.08 ±  6%  perf-profile.self.cycles-pp.__unwind_start
      0.06 ± 14%      +0.0        0.08 ±  6%  perf-profile.self.cycles-pp.fsnotify
      0.09 ±  5%      +0.0        0.11 ±  4%  perf-profile.self.cycles-pp.__account_scheduler_latency
      0.21 ±  8%      +0.0        0.23 ±  3%  perf-profile.self.cycles-pp.mutex_unlock
      0.25 ±  5%      +0.0        0.28 ±  2%  perf-profile.self.cycles-pp.stack_trace_save_tsk
      0.07 ± 17%      +0.0        0.11 ± 14%  perf-profile.self.cycles-pp._raw_spin_trylock
      0.23 ±  4%      +0.0        0.27 ±  3%  perf-profile.self.cycles-pp.enqueue_entity
      0.01 ±173%      +0.0        0.06 ±  9%  perf-profile.self.cycles-pp.tick_nohz_next_event
      0.16 ±  7%      +0.1        0.21 ±  4%  perf-profile.self.cycles-pp.reweight_entity
      0.32 ± 25%      +0.1        0.44 ±  6%  perf-profile.self.cycles-pp.ktime_get
      0.61 ±  4%      +0.2        0.78        perf-profile.self.cycles-pp._raw_spin_lock_irqsave
 7.467e+09            +4.5%    7.8e+09        perf-stat.i.branch-instructions
      1.52 ± 52%      -0.5        1.04        perf-stat.i.branch-miss-rate%
  18876088 ±  4%     +37.6%   25967363        perf-stat.i.cache-misses
 5.782e+08            +9.7%   6.34e+08        perf-stat.i.cache-references
   3117860            +9.3%    3407692        perf-stat.i.context-switches
      4.95 ±  5%      -9.5%       4.48        perf-stat.i.cpi
    950.94 ± 11%     -14.8%     809.79 ±  3%  perf-stat.i.cpu-migrations
      8478 ±  5%     -34.8%       5528        perf-stat.i.cycles-between-cache-misses
      0.10 ± 66%      -0.1        0.00 ±  9%  perf-stat.i.dTLB-load-miss-rate%
    797793 ± 30%     -68.3%     252660 ±  4%  perf-stat.i.dTLB-load-misses
 8.708e+09            +4.7%  9.116e+09        perf-stat.i.dTLB-loads
      0.03 ± 64%      -0.0        0.00 ±  2%  perf-stat.i.dTLB-store-miss-rate%
    124658 ± 30%     -74.0%      32385 ±  4%  perf-stat.i.dTLB-store-misses
 3.774e+09            +9.1%  4.118e+09        perf-stat.i.dTLB-stores
     64.91 ±  4%      -2.3       62.61        perf-stat.i.iTLB-load-miss-rate%
  43785961 ±  2%     +11.7%   48914532        perf-stat.i.iTLB-load-misses
  23149859 ±  4%      +8.6%   25133067        perf-stat.i.iTLB-loads
 3.291e+10            +4.8%  3.448e+10        perf-stat.i.instructions
    738.35 ±  2%      -5.5%     697.64        perf-stat.i.instructions-per-iTLB-miss
      0.21 ±  2%      +8.0%       0.22        perf-stat.i.ipc
     84.65           +10.4       95.02        perf-stat.i.node-load-miss-rate%
   4397398            +8.0%    4748512        perf-stat.i.node-load-misses
    847789           -73.3%     226176        perf-stat.i.node-loads
     92.33            +5.7       98.01        perf-stat.i.node-store-miss-rate%
   2046078           +98.1%    4054203        perf-stat.i.node-store-misses
     99069           -63.5%      36118        perf-stat.i.node-stores
     17.58 ±  2%      +4.6%      18.39        perf-stat.overall.MPKI
      0.97 ±  2%      -0.1        0.92        perf-stat.overall.branch-miss-rate%
      3.26 ±  5%      +0.8        4.10        perf-stat.overall.cache-miss-rate%
      4.75            -5.2%       4.50        perf-stat.overall.cpi
      8290 ±  3%     -27.9%       5974        perf-stat.overall.cycles-between-cache-misses
      0.01 ± 32%      -0.0        0.00 ±  4%  perf-stat.overall.dTLB-load-miss-rate%
      0.00 ± 33%      -0.0        0.00 ±  5%  perf-stat.overall.dTLB-store-miss-rate%
    751.69            -6.2%     704.91        perf-stat.overall.instructions-per-iTLB-miss
      0.21            +5.5%       0.22        perf-stat.overall.ipc
     83.84           +11.6       95.45        perf-stat.overall.node-load-miss-rate%
     95.38            +3.7       99.12        perf-stat.overall.node-store-miss-rate%
     30797            -3.4%      29764        perf-stat.overall.path-length
  7.45e+09            +4.4%  7.782e+09        perf-stat.ps.branch-instructions
  18821819 ±  4%     +37.6%   25906176        perf-stat.ps.cache-misses
 5.769e+08            +9.6%  6.325e+08        perf-stat.ps.cache-references
   3111288            +9.3%    3399507        perf-stat.ps.context-switches
    948.06 ± 11%     -14.8%     807.87 ±  3%  perf-stat.ps.cpu-migrations
    796174 ± 31%     -68.3%     252476 ±  4%  perf-stat.ps.dTLB-load-misses
 8.688e+09            +4.7%  9.095e+09        perf-stat.ps.dTLB-loads
    124340 ± 31%     -74.0%      32349 ±  4%  perf-stat.ps.dTLB-store-misses
 3.766e+09            +9.1%  4.108e+09        perf-stat.ps.dTLB-stores
  43689908 ±  2%     +11.7%   48794646        perf-stat.ps.iTLB-load-misses
  23103357 ±  4%      +8.5%   25074278        perf-stat.ps.iTLB-loads
 3.283e+10            +4.8%   3.44e+10        perf-stat.ps.instructions
   4385696            +8.0%    4738522        perf-stat.ps.node-load-misses
    845448           -73.3%     225752        perf-stat.ps.node-loads
   2040515           +98.3%    4045447        perf-stat.ps.node-store-misses
     98835           -63.5%      36074        perf-stat.ps.node-stores
 1.286e+13            +4.5%  1.344e+13        perf-stat.total.instructions
      2564 ± 21%     +36.9%       3510 ± 12%  interrupts.CPU0.TLB:TLB_shootdowns
     20.25 ± 44%   +1430.9%     310.00 ± 79%  interrupts.CPU10.TLB:TLB_shootdowns
    104.50 ±124%    +201.0%     314.50 ± 53%  interrupts.CPU105.TLB:TLB_shootdowns
     18.75 ± 35%    +418.7%      97.25 ±110%  interrupts.CPU108.TLB:TLB_shootdowns
     18.75 ± 29%    +909.3%     189.25 ± 65%  interrupts.CPU110.TLB:TLB_shootdowns
     18.75 ± 29%    +698.7%     149.75 ± 81%  interrupts.CPU113.TLB:TLB_shootdowns
     89357 ±  6%     +15.9%     103535 ±  5%  interrupts.CPU115.RES:Rescheduling_interrupts
     88209 ±  9%     +20.6%     106340 ±  5%  interrupts.CPU116.RES:Rescheduling_interrupts
     83630 ± 11%     +20.8%     101003 ±  2%  interrupts.CPU117.RES:Rescheduling_interrupts
     21.00 ± 36%    +702.4%     168.50 ± 69%  interrupts.CPU117.TLB:TLB_shootdowns
    110630 ±  8%     +10.8%     122586 ±  3%  interrupts.CPU120.RES:Rescheduling_interrupts
    106258 ±  9%     +14.3%     121460 ±  3%  interrupts.CPU122.RES:Rescheduling_interrupts
     41.75 ± 55%    +476.6%     240.75 ± 96%  interrupts.CPU123.TLB:TLB_shootdowns
     56.75 ±132%    +519.8%     351.75 ± 29%  interrupts.CPU124.TLB:TLB_shootdowns
     88.75 ±147%    +116.3%     192.00 ± 77%  interrupts.CPU128.TLB:TLB_shootdowns
     19.00 ± 27%   +1509.2%     305.75 ± 98%  interrupts.CPU129.TLB:TLB_shootdowns
    112.25 ±106%    +278.0%     424.25 ± 51%  interrupts.CPU13.TLB:TLB_shootdowns
     18.50 ± 18%    +839.2%     173.75 ± 73%  interrupts.CPU132.TLB:TLB_shootdowns
     92.25 ±142%    +499.7%     553.25 ± 98%  interrupts.CPU134.TLB:TLB_shootdowns
     88357 ±  7%      +9.9%      97128 ±  5%  interrupts.CPU138.RES:Rescheduling_interrupts
     89772 ±  9%     +18.5%     106341 ±  4%  interrupts.CPU139.RES:Rescheduling_interrupts
      2451 ± 32%     +66.6%       4084 ±  8%  interrupts.CPU140.NMI:Non-maskable_interrupts
      2451 ± 32%     +66.6%       4084 ±  8%  interrupts.CPU140.PMI:Performance_monitoring_interrupts
     88.25 ±143%    +210.8%     274.25 ± 88%  interrupts.CPU140.TLB:TLB_shootdowns
    107210 ±  8%     +12.3%     120362 ±  4%  interrupts.CPU144.RES:Rescheduling_interrupts
     96772 ±  7%     +11.1%     107553 ±  5%  interrupts.CPU145.RES:Rescheduling_interrupts
    102245 ±  2%     +10.5%     112975 ±  2%  interrupts.CPU146.RES:Rescheduling_interrupts
     21.00 ± 28%    +298.8%      83.75 ± 29%  interrupts.CPU146.TLB:TLB_shootdowns
      2359 ± 16%     +39.6%       3294 ± 18%  interrupts.CPU152.NMI:Non-maskable_interrupts
      2359 ± 16%     +39.6%       3294 ± 18%  interrupts.CPU152.PMI:Performance_monitoring_interrupts
     23.00 ± 39%   +1562.0%     382.25 ± 86%  interrupts.CPU159.TLB:TLB_shootdowns
     19.00 ± 34%    +635.5%     139.75 ± 80%  interrupts.CPU160.TLB:TLB_shootdowns
     20.50 ± 32%    +506.1%     124.25 ± 92%  interrupts.CPU161.TLB:TLB_shootdowns
     85527 ±  7%     +16.3%      99481        interrupts.CPU162.RES:Rescheduling_interrupts
     24.00 ± 39%    +415.6%     123.75 ±107%  interrupts.CPU165.TLB:TLB_shootdowns
     83285 ± 17%     +22.7%     102187 ±  9%  interrupts.CPU166.RES:Rescheduling_interrupts
    124.75 ± 99%    +173.5%     341.25 ± 66%  interrupts.CPU166.TLB:TLB_shootdowns
     54.75 ± 80%    +543.8%     352.50 ± 52%  interrupts.CPU168.TLB:TLB_shootdowns
     36.25 ± 64%    +510.3%     221.25 ± 66%  interrupts.CPU17.TLB:TLB_shootdowns
     66.75 ±112%    +205.6%     204.00 ± 71%  interrupts.CPU170.TLB:TLB_shootdowns
     91.00 ±127%    +446.4%     497.25 ± 29%  interrupts.CPU171.TLB:TLB_shootdowns
    108108 ±  7%     +10.1%     119063 ±  5%  interrupts.CPU173.RES:Rescheduling_interrupts
    103866 ±  5%     +19.7%     124283 ±  7%  interrupts.CPU175.RES:Rescheduling_interrupts
     86.50 ±148%    +430.9%     459.25 ± 26%  interrupts.CPU176.TLB:TLB_shootdowns
    100777 ± 10%     +13.6%     114467 ±  5%  interrupts.CPU180.RES:Rescheduling_interrupts
     87385 ± 11%     +24.5%     108773 ±  8%  interrupts.CPU182.RES:Rescheduling_interrupts
     89523 ± 10%     +22.9%     110062 ±  7%  interrupts.CPU183.RES:Rescheduling_interrupts
     18.75 ± 36%    +774.7%     164.00 ± 85%  interrupts.CPU183.TLB:TLB_shootdowns
     89589 ± 12%     +25.1%     112078 ± 11%  interrupts.CPU184.RES:Rescheduling_interrupts
     89109 ± 11%     +29.7%     115571 ±  6%  interrupts.CPU185.RES:Rescheduling_interrupts
     84744 ± 11%     +19.9%     101596 ±  9%  interrupts.CPU186.RES:Rescheduling_interrupts
     85476 ± 15%     +19.3%     101991 ±  5%  interrupts.CPU189.RES:Rescheduling_interrupts
    100.75 ±133%    +306.5%     409.50 ± 53%  interrupts.CPU19.TLB:TLB_shootdowns
     40.00 ± 29%    +303.1%     161.25 ± 51%  interrupts.CPU190.TLB:TLB_shootdowns
     91.75 ±137%    +186.6%     263.00 ± 45%  interrupts.CPU20.TLB:TLB_shootdowns
      3580 ± 16%     -18.4%       2920 ±  5%  interrupts.CPU22.NMI:Non-maskable_interrupts
      3580 ± 16%     -18.4%       2920 ±  5%  interrupts.CPU22.PMI:Performance_monitoring_interrupts
    132.75 ± 95%    +236.5%     446.75 ± 24%  interrupts.CPU23.TLB:TLB_shootdowns
    120.75 ±102%    +207.2%     371.00 ± 42%  interrupts.CPU29.TLB:TLB_shootdowns
     28.25 ± 27%   +1029.2%     319.00 ±116%  interrupts.CPU30.TLB:TLB_shootdowns
      3011 ± 18%     -27.0%       2199 ± 31%  interrupts.CPU31.NMI:Non-maskable_interrupts
      3011 ± 18%     -27.0%       2199 ± 31%  interrupts.CPU31.PMI:Performance_monitoring_interrupts
     49.50 ± 74%    +511.1%     302.50 ± 82%  interrupts.CPU32.TLB:TLB_shootdowns
     23.00 ± 33%    +825.0%     212.75 ±100%  interrupts.CPU33.TLB:TLB_shootdowns
      3077 ± 14%     +24.8%       3840 ±  5%  interrupts.CPU35.NMI:Non-maskable_interrupts
      3077 ± 14%     +24.8%       3840 ±  5%  interrupts.CPU35.PMI:Performance_monitoring_interrupts
     54.25 ± 88%    +285.7%     209.25 ± 68%  interrupts.CPU36.TLB:TLB_shootdowns
     45.75 ± 92%    +648.1%     342.25 ± 28%  interrupts.CPU38.TLB:TLB_shootdowns
     40.50 ± 69%    +529.6%     255.00 ± 44%  interrupts.CPU41.TLB:TLB_shootdowns
      2264 ± 15%     +58.7%       3593 ± 10%  interrupts.CPU46.NMI:Non-maskable_interrupts
      2264 ± 15%     +58.7%       3593 ± 10%  interrupts.CPU46.PMI:Performance_monitoring_interrupts
    274.50 ± 71%    +108.7%     572.75 ± 19%  interrupts.CPU51.TLB:TLB_shootdowns
     86524 ±  4%      +7.8%      93252 ±  4%  interrupts.CPU57.RES:Rescheduling_interrupts
     37.75 ± 30%   +1033.8%     428.00 ± 53%  interrupts.CPU60.TLB:TLB_shootdowns
     80.50 ±117%    +602.2%     565.25 ± 77%  interrupts.CPU66.TLB:TLB_shootdowns
     98.00 ±127%    +221.4%     315.00 ± 76%  interrupts.CPU69.TLB:TLB_shootdowns
    281.75 ± 29%     -69.7%      85.25 ±  5%  interrupts.CPU7.TLB:TLB_shootdowns
      3275 ±  6%     -22.6%       2535 ± 26%  interrupts.CPU72.NMI:Non-maskable_interrupts
      3275 ±  6%     -22.6%       2535 ± 26%  interrupts.CPU72.PMI:Performance_monitoring_interrupts
     51.00 ±115%    +383.3%     246.50 ± 44%  interrupts.CPU88.TLB:TLB_shootdowns
     23.50 ± 71%    +368.1%     110.00 ± 32%  interrupts.CPU90.TLB:TLB_shootdowns
     68.25 ± 38%    +231.5%     226.25 ± 58%  interrupts.CPU91.TLB:TLB_shootdowns
     59.25 ±102%    +230.4%     195.75 ± 75%  interrupts.CPU92.TLB:TLB_shootdowns
     48.00 ± 79%    +530.2%     302.50 ± 44%  interrupts.CPU95.TLB:TLB_shootdowns
    108205 ±  9%     +13.5%     122833 ±  4%  interrupts.CPU97.RES:Rescheduling_interrupts
     55463 ± 22%     +42.5%      79045 ± 14%  interrupts.TLB:TLB_shootdowns
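
The middle column of the table above mixes two kinds of delta: rows shown with a "%" sign are relative changes between the two means, while rows whose metric is itself a percentage (names ending in "%") appear to be plain differences in percentage points. The sketch below recomputes a few rows under that reading. The function names are hypothetical, and the miss-rate formula misses / (misses + hits) is an assumption that happens to reproduce the perf-stat.overall node-load-miss-rate% values from the perf-stat.ps counters; it is not taken from the lkp-tests source.

```python
def percent_change(base, new):
    """Relative change of the new mean against the base mean, in percent."""
    return (new - base) / base * 100.0


def point_difference(base, new):
    """Plain difference, used here for metrics already expressed in percent."""
    return new - base


def miss_rate(misses, hits):
    """Share of misses among all accesses, in percent (assumed formula)."""
    return misses / (misses + hits) * 100.0


# unixbench.score: reported as +8.4%
print(f"score change:      {percent_change(2659, 2883):+.1f}%")

# unixbench.time.voluntary_context_switches: reported as +9.3%
print(f"ctx switch change: {percent_change(3.053e8, 3.336e8):+.1f}%")

# turbostat.C1%: reported as +2.5 (percentage points, not percent)
print(f"C1% difference:    {point_difference(5.52, 7.99):+.1f}")

# perf-stat.overall.node-load-miss-rate%: reported as 83.84 and 95.45,
# recomputed from perf-stat.ps.node-load-misses and perf-stat.ps.node-loads
print(f"node-load miss rate, base: {miss_rate(4385696, 845448):.2f}")
print(f"node-load miss rate, new:  {miss_rate(4738522, 225752):.2f}")
```

The same recomputation is a quick sanity check when reading any row: if a delta has no "%" suffix, compare it against the difference of the two columns rather than their ratio.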


                                                                                
                       unixbench.time.voluntary_context_switches                
                                                                                
  3.5e+08 +-----------------------------------------------------------------+   
          | O O OO                                                          |   
  3.4e+08 |-+            O OO O O O O                                       |   
          |        O O O              OO   O O O  O                         |   
  3.3e+08 |-+                            O       O                          |   
          |                                                                 |   
  3.2e+08 |-+                                                               |   
          |                              +.+.+.+                            |   
  3.1e+08 |.+.+.++.                     +      :                      .+    |   
          |        +.+.+.+.++.+.+.+.+.++        :                   .+  +.  |   
    3e+08 |-+                                   :           ++.+.+.+      +.|   
          |                                      :         +                |   
  2.9e+08 |-+                                    ++.     .+                 |   
          |                                         +.+.+                   |   
  2.8e+08 +-----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                  unixbench.score                               
                                                                                
  3100 +--------------------------------------------------------------------+   
       |                                                                    |   
  3000 |-O O O O                                                            |   
       |                O         O                                         |   
  2900 |-+       O OO O   O O O O   O O   OO O   O                          |   
       |                                O      O                            |   
  2800 |-+                                                                  |   
       |                                                              +.    |   
  2700 |-+                             .+.++.+                       +  +   |   
       |.+.+. .+.+. +. .+.+.   .+.+. .+       :                   +.+    + .|   
  2600 |-+   +     +  +     +.+     +         :               .+. :       + |   
       |                                       :          .+.+   +          |   
  2500 |-+                                     +.+.     .+                  |   
       |                                           +.+.+                    |   
  2400 +--------------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample

***************************************************************************************************
lkp-skl-fpga01: 104 threads Skylake with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-7/performance/x86_64-rhel-7.6/process/100%/debian-x86_64-20191114.cgz/lkp-skl-fpga01/mmap1/will-it-scale/0x2000065

commit: 
  dfce1eb694 ("locking/qspinlock: Introduce starvation avoidance into CNA")
  7b6da71157 ("locking/qspinlock: Introduce the shuffle reduction optimization into CNA")

dfce1eb694321530 7b6da7115786ee28ad82638a5dc 
---------------- --------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
          1:2          -50%            :2     dmesg.WARNING:at#for_ip_interrupt_entry/0x



***************************************************************************************************
lkp-skl-fpga01: 104 threads Skylake with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-7/performance/x86_64-rhel-7.6/process/100%/debian-x86_64-20191114.cgz/lkp-skl-fpga01/mmap2/will-it-scale/0x2000065

commit: 
  dfce1eb694 ("locking/qspinlock: Introduce starvation avoidance into CNA")
  7b6da71157 ("locking/qspinlock: Introduce the shuffle reduction optimization into CNA")

dfce1eb694321530 7b6da7115786ee28ad82638a5dc 
---------------- --------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
          1:2          -50%            :2     dmesg.WARNING:at#for_ip_interrupt_entry/0x





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.4.0-04240-g7b6da7115786e" of type "text/plain" (201322 bytes)

View attachment "job-script" of type "text/plain" (7630 bytes)

View attachment "job.yaml" of type "text/plain" (5210 bytes)

View attachment "reproduce" of type "text/plain" (294 bytes)
