Date:   Mon, 18 Feb 2019 09:36:02 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Bob Moore <robert.moore@...el.com>
Cc:     "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        Erik Schmauss <erik.schmauss@...el.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>, lkp@...org
Subject: [LKP] [ACPICA]  4c1379d7bb:  reaim.jobs_per_min -2.8% regression

Greetings,

FYI, we noticed a -2.8% regression of reaim.jobs_per_min due to commit:


commit: 4c1379d7bb42fa664e0784539208ed74108c53f1 ("ACPICA: Debug output: Add option to display method/object evaluation")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: ftq
on test machine: 256 threads Phi with 96G memory
with following parameters:

	nr_task: 100%
	samples: 6000ss
	test: cache
	freq: 20
	cpufreq_governor: performance

test-description: The FTQ benchmarks measure hardware and software interference or 'noise' on a node from the application's perspective.
test-url: https://github.com/rminnich/ftq
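For context, the core idea behind a fixed-time-quantum (FTQ) measurement can be sketched as follows. This is a minimal illustrative sketch, not the actual FTQ benchmark from the URL above; the function name and parameters are hypothetical. The benchmark counts how much work completes inside each fixed-length time quantum; variation in those counts across quanta reflects interference ("noise") from the OS and other software.

```python
import time
import statistics

def ftq_sample(quantum_ns=100_000, n_samples=50):
    """Count units of dummy work completed per fixed time quantum.
    High relative stddev across samples indicates OS/software noise."""
    counts = []
    for _ in range(n_samples):
        end = time.perf_counter_ns() + quantum_ns
        work = 0
        while time.perf_counter_ns() < end:
            work += 1  # one unit of dummy work
        counts.append(work)
    return counts

counts = ftq_sample()
mean = statistics.mean(counts)
stddev = statistics.pstdev(counts)
print(f"mean work/quantum: {mean:.0f}, stddev: {100 * stddev / mean:.2f}%")
```

In a report like this one, a commit that adds per-event work on a hot path (here, extra debug-output checks in ACPICA method evaluation) shows up as both lower throughput (reaim.jobs_per_min) and higher run-to-run variability (reaim.std_dev_percent).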

In addition, the commit also has a significant impact on the following test:

+------------------+---------------------------------------------------------------------------+
| testcase: change | perf_test: perf_test.msr_test.round1.cpu_util_zero.pass -100.0% undefined |
| test machine     | 72 threads Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz with 128G memory     |
| test parameters  | type=lkp                                                                  |
|                  | ucode=0x3d                                                                |
+------------------+---------------------------------------------------------------------------+


Details are as follows:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_job/nr_task/rootfs/runtime/tbox_group/test/testcase/ucode:
  gcc-7/performance/x86_64-rhel-7.2/1500/100%/debian-x86_64-2018-04-03.cgz/300s/lkp-hsw-ep2/all_utime/reaim/0x3d

commit: 
  73a049a90f ("ACPICA: disassembler: disassemble OEMx tables as AML")
  4c1379d7bb ("ACPICA: Debug output: Add option to display method/object evaluation")

73a049a90fb241af 4c1379d7bb42fa664e07845392 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.07 ±  3%     +20.2%       0.09 ± 12%  reaim.child_systime
    308866            -2.8%     300200        reaim.jobs_per_min
      4289            -2.8%       4169        reaim.jobs_per_min_child
    314258            -2.7%     305679        reaim.max_jobs_per_min
     20.93            +2.9%      21.53        reaim.parent_time
      0.44 ±  9%     +84.5%       0.82 ±  6%  reaim.std_dev_percent
      0.09 ±  9%     +83.7%       0.17 ±  6%  reaim.std_dev_time
    321.38            -4.7%     306.29        reaim.time.elapsed_time
    321.38            -4.7%     306.29        reaim.time.elapsed_time.max
    127264 ±  2%     +40.6%     178887        reaim.time.involuntary_context_switches
     93024 ±  2%      -8.4%      85173        reaim.time.minor_page_faults
      6448            -2.6%       6282        reaim.time.percent_of_cpu_this_job_got
     20724            -7.2%      19241        reaim.time.user_time
      4929            -6.9%       4591        reaim.time.voluntary_context_switches
   1512000            -7.1%    1404000        reaim.workload
    442.00 ±158%    -100.0%       0.00        meminfo.Mlocked
      0.03 ±  2%      +0.1        0.11        mpstat.cpu.sys%
      1231           +56.9%       1932 ±  3%  vmstat.system.cs
     63.75 ±158%    -100.0%       0.00        numa-vmstat.node0.nr_mlock
     46.25 ±159%    -100.0%       0.00        numa-vmstat.node1.nr_mlock
      1.50 ± 15%      -0.3        1.24 ± 15%  perf-stat.overall.cache-miss-rate%
     50.13 ±  3%      +6.4       56.58 ±  5%  perf-stat.overall.iTLB-load-miss-rate%
    566199 ±  6%     -23.7%     432237 ±  7%  perf-stat.overall.instructions-per-iTLB-miss
     59.86 ±  3%     -23.1       36.79 ± 10%  perf-stat.overall.node-store-miss-rate%
   6653969 ±  8%     +25.4%    8340919 ± 14%  perf-stat.ps.cache-references
      1202           +56.7%       1884 ±  4%  perf-stat.ps.context-switches
    185718 ±  5%     +28.9%     239386 ±  7%  perf-stat.ps.iTLB-load-misses
      2755            -1.5%       2713        perf-stat.ps.minor-faults
     22055 ± 22%     -41.6%      12879 ± 11%  perf-stat.ps.node-store-misses
     14552 ± 12%     +53.3%      22311 ± 14%  perf-stat.ps.node-stores
      2755            -1.5%       2713        perf-stat.ps.page-faults
 3.414e+13            -7.1%  3.172e+13        perf-stat.total.instructions
     69181            -1.7%      67973        proc-vmstat.nr_active_anon
      6177            +4.6%       6461        proc-vmstat.nr_mapped
    109.50 ±159%    -100.0%       0.00        proc-vmstat.nr_mlock
     69181            -1.7%      67973        proc-vmstat.nr_zone_active_anon
     17125 ± 10%     -55.2%       7668 ± 20%  proc-vmstat.numa_hint_faults
      4847 ± 80%     -81.0%     920.75 ± 85%  proc-vmstat.numa_hint_faults_local
    799546            -5.8%     753399        proc-vmstat.numa_hit
    785482            -5.9%     739309        proc-vmstat.numa_local
     63641 ± 34%     -57.8%      26863 ± 93%  proc-vmstat.numa_pte_updates
    861155            -5.6%     812594        proc-vmstat.pgalloc_normal
    922986            -6.5%     863086        proc-vmstat.pgfault
    833697            -5.8%     785608        proc-vmstat.pgfree
     17.88 ±  5%      -4.2       13.72 ± 21%  perf-profile.calltrace.cycles-pp.scheduler_tick.update_process_times.tick_sched_handle.tick_sched_timer.__hrtimer_run_queues
      8.65 ± 10%      -2.8        5.83 ± 26%  perf-profile.calltrace.cycles-pp.clockevents_program_event.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
      3.81 ± 39%      -1.9        1.89 ± 30%  perf-profile.calltrace.cycles-pp.ktime_get.clockevents_program_event.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
      4.05 ±  6%      -1.3        2.72 ± 25%  perf-profile.calltrace.cycles-pp.ktime_get_update_offsets_now.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
      3.42 ±100%      -1.3        2.11 ±173%  perf-profile.calltrace.cycles-pp.irq_work_interrupt
      3.42 ±100%      -1.3        2.11 ±173%  perf-profile.calltrace.cycles-pp.smp_irq_work_interrupt.irq_work_interrupt
      3.40 ±100%      -1.3        2.11 ±173%  perf-profile.calltrace.cycles-pp.irq_work_run_list.irq_work_run.smp_irq_work_interrupt.irq_work_interrupt
      3.40 ±100%      -1.3        2.11 ±173%  perf-profile.calltrace.cycles-pp.irq_work_run.smp_irq_work_interrupt.irq_work_interrupt
      3.40 ±100%      -1.1        2.27 ±173%  perf-profile.calltrace.cycles-pp.printk.irq_work_run_list.irq_work_run.smp_irq_work_interrupt.irq_work_interrupt
      3.40 ±100%      -1.1        2.27 ±173%  perf-profile.calltrace.cycles-pp.vprintk_emit.printk.irq_work_run_list.irq_work_run.smp_irq_work_interrupt
      3.40 ±100%      -1.1        2.27 ±173%  perf-profile.calltrace.cycles-pp.console_unlock.vprintk_emit.printk.irq_work_run_list.irq_work_run
      3.38 ±100%      -1.1        2.30 ±173%  perf-profile.calltrace.cycles-pp.secondary_startup_64
      3.38 ±100%      -1.1        2.30 ±173%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
      3.38 ±100%      -1.1        2.30 ±173%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
      3.37 ±100%      -1.1        2.30 ±173%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
      3.33 ±100%      -1.0        2.29 ±173%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
      3.29 ±101%      -1.0        2.29 ±173%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
      2.59 ± 13%      -0.8        1.78 ± 20%  perf-profile.calltrace.cycles-pp.__update_load_avg_se.task_tick_fair.scheduler_tick.update_process_times.tick_sched_handle
      1.37 ± 11%      -0.7        0.65 ± 64%  perf-profile.calltrace.cycles-pp.timerqueue_del.__remove_hrtimer.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt
      2.93 ±101%      -0.7        2.21 ±173%  perf-profile.calltrace.cycles-pp.serial8250_console_write.console_unlock.vprintk_emit.printk.irq_work_run_list
      2.04 ± 13%      -0.7        1.35 ± 30%  perf-profile.calltrace.cycles-pp.update_curr.task_tick_fair.scheduler_tick.update_process_times.tick_sched_handle
      0.80 ± 16%      -0.5        0.30 ±100%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
      0.00        +5.5e+11%       5530 ±110%  sched_debug.cfs_rq:/.MIN_vruntime.avg
     12338           +45.3%      17924 ±  7%  sched_debug.cfs_rq:/.load.avg
     39464 ± 43%    +709.9%     319609 ± 46%  sched_debug.cfs_rq:/.load.max
      5132 ± 59%    +704.6%      41295 ± 41%  sched_debug.cfs_rq:/.load.stddev
    299.73 ± 28%     +60.1%     479.92 ± 19%  sched_debug.cfs_rq:/.load_avg.max
     52.90 ± 37%     +48.5%      78.55 ± 11%  sched_debug.cfs_rq:/.load_avg.stddev
      0.00        +5.5e+11%       5530 ±110%  sched_debug.cfs_rq:/.max_vruntime.avg
     56400 ± 32%    +258.5%     202169 ±  5%  sched_debug.cfs_rq:/.min_vruntime.stddev
      0.08 ± 13%     +26.3%       0.10 ± 11%  sched_debug.cfs_rq:/.nr_running.stddev
     11.16 ±  6%     +26.1%      14.08 ±  4%  sched_debug.cfs_rq:/.runnable_load_avg.avg
     34.70 ± 30%    +486.6%     203.54 ± 30%  sched_debug.cfs_rq:/.runnable_load_avg.max
      3.74 ± 31%    +599.1%      26.17 ± 30%  sched_debug.cfs_rq:/.runnable_load_avg.stddev
     12213           +46.1%      17837 ±  7%  sched_debug.cfs_rq:/.runnable_weight.avg
     35964 ± 43%    +788.0%     319358 ± 46%  sched_debug.cfs_rq:/.runnable_weight.max
      4732 ± 62%    +770.0%      41175 ± 42%  sched_debug.cfs_rq:/.runnable_weight.stddev
     40678 ±144%   +3754.8%    1568088 ±  2%  sched_debug.cfs_rq:/.spread0.avg
    127330 ± 55%   +1238.0%    1703696 ±  3%  sched_debug.cfs_rq:/.spread0.max
    -87205           -98.1%      -1677        sched_debug.cfs_rq:/.spread0.min
     56217 ± 32%    +259.5%     202080 ±  5%  sched_debug.cfs_rq:/.spread0.stddev
    698851 ±  8%     -16.4%     584422 ±  9%  sched_debug.cpu.avg_idle.min
    165933 ±  7%     +13.4%     188123        sched_debug.cpu.clock.avg
    165941 ±  7%     +13.4%     188129        sched_debug.cpu.clock.max
    165919 ±  7%     +13.4%     188115        sched_debug.cpu.clock.min
      5.35 ± 23%     -32.9%       3.59 ±  7%  sched_debug.cpu.clock.stddev
    165933 ±  7%     +13.4%     188123        sched_debug.cpu.clock_task.avg
    165941 ±  7%     +13.4%     188129        sched_debug.cpu.clock_task.max
    165919 ±  7%     +13.4%     188115        sched_debug.cpu.clock_task.min
      5.35 ± 23%     -32.9%       3.59 ±  7%  sched_debug.cpu.clock_task.stddev
     11.33 ±  4%     +19.3%      13.51 ±  9%  sched_debug.cpu.cpu_load[0].avg
     38.97 ± 28%    +418.2%     201.92 ± 30%  sched_debug.cpu.cpu_load[0].max
      4.46 ± 41%    +443.0%      24.23 ± 33%  sched_debug.cpu.cpu_load[0].stddev
     11.43 ±  4%     +17.6%      13.44 ±  8%  sched_debug.cpu.cpu_load[1].avg
     37.25 ± 26%    +418.9%     193.29 ± 30%  sched_debug.cpu.cpu_load[1].max
      4.08 ± 36%    +463.5%      23.00 ± 35%  sched_debug.cpu.cpu_load[1].stddev
     11.49 ±  4%     +16.2%      13.34 ±  7%  sched_debug.cpu.cpu_load[2].avg
     35.41 ± 21%    +398.6%     176.54 ± 30%  sched_debug.cpu.cpu_load[2].max
      3.74 ± 31%    +460.0%      20.96 ± 34%  sched_debug.cpu.cpu_load[2].stddev
     34.54 ± 14%    +336.3%     150.71 ± 21%  sched_debug.cpu.cpu_load[3].max
      3.51 ± 22%    +407.6%      17.84 ± 27%  sched_debug.cpu.cpu_load[3].stddev
     55.55          +164.1%     146.71 ± 15%  sched_debug.cpu.cpu_load[4].max
      5.57 ±  6%    +202.7%      16.85 ± 18%  sched_debug.cpu.cpu_load[4].stddev
      4396 ±  6%     +13.1%       4974 ±  7%  sched_debug.cpu.curr->pid.avg
      3517 ± 17%     -47.9%       1832 ± 34%  sched_debug.cpu.curr->pid.min
    305.14 ± 37%     +98.3%     604.98 ± 22%  sched_debug.cpu.curr->pid.stddev
     12364           +39.5%      17254 ± 14%  sched_debug.cpu.load.avg
     39007 ± 44%    +721.4%     320411 ± 46%  sched_debug.cpu.load.max
      5044 ± 63%    +681.3%      39414 ± 48%  sched_debug.cpu.load.stddev
    133158 ±  9%     +16.9%     155714        sched_debug.cpu.nr_load_updates.avg
    140928 ±  9%     +15.6%     162887        sched_debug.cpu.nr_load_updates.max
      1369 ± 37%     +66.8%       2283 ±  4%  sched_debug.cpu.nr_load_updates.stddev
      0.19 ±  3%     +22.7%       0.23 ± 15%  sched_debug.cpu.nr_running.stddev
      3143 ±  6%     +52.3%       4788 ±  3%  sched_debug.cpu.nr_switches.avg
     11472 ± 19%     +67.3%      19197 ±  3%  sched_debug.cpu.nr_switches.max
      2464 ± 19%     +73.5%       4275 ±  6%  sched_debug.cpu.nr_switches.stddev
    165920 ±  7%     +13.4%     188116        sched_debug.cpu_clk
    162590 ±  7%     +13.6%     184783        sched_debug.ktime
    167206 ±  6%     +13.6%     189868        sched_debug.sched_clk


                                                                                
                        ftq.time.involuntary_context_switches                   
                                                                                
  260000 +-+----------------------------------------------------------------+   
         |O                                                                 |   
  240000 +-+ O                                                              |   
  220000 O-+   O   O                                                        |   
         |  O   O O  O                                          O           |   
  200000 +-+                                            O                   |   
  180000 +-+                                 O OO OO OO  O OOO O  OO OO OO OO   
         |                  O         O  O                                  |   
  160000 +-+          O OOO  O  O  O O  O  O                                |   
  140000 +-+                   O  O         O                               |   
         |                                                                  |   
  120000 +-+                          +                                     |   
  100000 +-+ +.           +.     .+   :+                                    |   
         |+.+  ++.++.++.++  ++.++  +.+  ++.+++.++.++.++.++.+++.++           |   
   80000 +-+----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample

***************************************************************************************************
lkp-hsw-ep2: 72 threads Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz with 128G memory
=========================================================================================
compiler/kconfig/rootfs/tbox_group/testcase/type/ucode:
  gcc-7/x86_64-rhel-7.2/debian-x86_64-2018-04-03.cgz/lkp-hsw-ep2/perf_test/lkp/0x3d

commit: 
  73a049a90f ("ACPICA: disassembler: disassemble OEMx tables as AML")
  4c1379d7bb ("ACPICA: Debug output: Add option to display method/object evaluation")

73a049a90fb241af 4c1379d7bb42fa664e07845392 
---------------- -------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
           :8          100%           8:8     perf_test.msr_test.round1.cpu_util_zero.fail
         %stddev     %change         %stddev
             \          |                \  
      1.00          -100.0%       0.00        perf_test.msr_test.round1.cpu_util_zero.pass
      1432           +13.1%       1620 ±  3%  perf_test.time.elapsed_time
      1432           +13.1%       1620 ±  3%  perf_test.time.elapsed_time.max
     43385            -6.9%      40403 ±  2%  perf_test.time.involuntary_context_switches
     76.62 ±  2%     +16.3%      89.12 ±  2%  perf_test.time.percent_of_cpu_this_job_got
    295.24           +17.2%     346.06 ± 13%  perf_test.time.system_time
    810.50           +36.5%       1106 ±  2%  perf_test.time.user_time
    360596           +43.1%     515953 ± 17%  perf_test.time.voluntary_context_switches
     16568           +46.1%      24206        meminfo.Inactive(anon)
     30398           +16.8%      35515        meminfo.Mapped
      1843 ± 10%      +9.1%       2010 ±  7%  meminfo.Mlocked
     19473           +39.3%      27124        meminfo.Shmem
      2897           -11.6%       2561 ±  3%  meminfo.max_used_kB





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen

View attachment "config-4.20.0-rc3-00015-g4c1379d" of type "text/plain" (108428 bytes)

View attachment "job-script" of type "text/plain" (4411 bytes)

View attachment "job.yaml" of type "text/plain" (3801 bytes)

View attachment "reproduce" of type "text/plain" (314 bytes)
