Message-ID: <20200414085853.GO8179@shao2-debian>
Date:   Tue, 14 Apr 2020 16:58:53 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Andi Kleen <ak@...ux.intel.com>,
        Kan Liang <kan.liang@...ux.intel.com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: [perf/core] 90c91dfb86: fxmark.ssd_f2fs_DRBL_1_directio.works/sec
 18.2% improvement

Greetings,

FYI, we noticed an 18.2% improvement of fxmark.ssd_f2fs_DRBL_1_directio.works/sec due to commit:


commit: 90c91dfb86d0ff545bd329d3ddd72c147e2ae198 ("perf/core: Fix endless multiplex timer")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: fxmark
on test machine: 192 threads Intel(R) Xeon(R) CPU @ 2.20GHz with 192G memory
with following parameters:

	disk: 1SSD
	media: ssd
	test: DRBL
	fstype: f2fs
	directio: directio
	cpufreq_governor: performance
	ucode: 0x400002c






Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/directio/disk/fstype/kconfig/media/rootfs/tbox_group/test/testcase/ucode:
  gcc-7/performance/directio/1SSD/f2fs/x86_64-rhel-7.6/ssd/debian-x86_64-20191114.cgz/lkp-csl-2ap1/DRBL/fxmark/0x400002c

commit: 
  d8a7386897 ("x86/optprobe: Fix OPTPROBE vs UACCESS")
  90c91dfb86 ("perf/core: Fix endless multiplex timer")

d8a738689794c42c 90c91dfb86d0ff545bd329d3ddd 
---------------- --------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
           :4           50%           2:4     dmesg.WARNING:at#for_ip_swapgs_restore_regs_and_return_to_usermode/0x
           :4           50%           2:4     dmesg.WARNING:stack_recursion
          0:4            1%           0:4     perf-profile.children.cycles-pp.error_entry
         %stddev     %change         %stddev
             \          |                \  
     16.79           +27.6%      21.41        fxmark.ssd_f2fs_DRBL_1_directio.iowait_sec
     18.45 ±  2%     -24.0%      14.02 ± 22%  fxmark.ssd_f2fs_DRBL_1_directio.sys_util
      2.17 ±  2%     -24.9%       1.63 ± 14%  fxmark.ssd_f2fs_DRBL_1_directio.user_util
    736006           +18.2%     869786        fxmark.ssd_f2fs_DRBL_1_directio.works
     24533           +18.2%      28992        fxmark.ssd_f2fs_DRBL_1_directio.works/sec
  26010490 ±  2%      +6.0%   27559732        fxmark.time.file_system_inputs
   3219151 ±  2%      +6.0%    3413339        fxmark.time.voluntary_context_switches
  75000192           -10.2%   67344189        cpuidle.POLL.time
      7.24 ± 67%     -68.8%       2.26 ± 18%  iostat.nvme0n1.await.max
      7.27 ± 67%     -68.7%       2.28 ± 17%  iostat.nvme0n1.w_await.max
    196991 ± 50%     -74.1%      50950 ± 66%  numa-numastat.node3.local_node
    220900 ± 41%     -62.6%      82622 ± 41%  numa-numastat.node3.numa_hit
      7151 ±  5%      -7.1%       6640 ±  2%  slabinfo.anon_vma.active_objs
      3230 ±  3%      -8.5%       2954 ±  4%  slabinfo.files_cache.num_objs
   2518633           -32.3%    1706174 ±  2%  proc-vmstat.pgalloc_normal
      1684            +4.7%       1763 ±  4%  proc-vmstat.pgdeactivate
   2505517           -31.1%    1726169 ±  3%  proc-vmstat.pgfree
      1586 ±  3%     -96.2%      60.50 ± 66%  proc-vmstat.thp_fault_alloc
     71788 ± 12%     -47.7%      37550 ± 67%  sched_debug.cfs_rq:/.load.min
     72.38 ± 12%     -46.6%      38.67 ± 64%  sched_debug.cfs_rq:/.load_avg.min
     10.17 ±  7%     -29.9%       7.12 ± 50%  sched_debug.cfs_rq:/.nr_spread_over.min
    355.54 ±  9%     -14.7%     303.12 ±  7%  sched_debug.cfs_rq:/.runnable_load_avg.max
     29913 ± 70%     -74.8%       7537 ± 71%  numa-vmstat.node1.nr_active_anon
     29915 ± 70%     -74.8%       7538 ± 71%  numa-vmstat.node1.nr_anon_pages
     29913 ± 70%     -74.8%       7537 ± 71%  numa-vmstat.node1.nr_zone_active_anon
    385.50 ± 68%     -60.0%     154.25 ± 37%  numa-vmstat.node3.nr_page_table_pages
     11469 ± 11%     -12.1%      10080 ±  8%  numa-vmstat.node3.nr_slab_unreclaimable
    119643 ± 70%     -74.8%      30148 ± 71%  numa-meminfo.node1.Active
    119643 ± 70%     -74.8%      30148 ± 71%  numa-meminfo.node1.Active(anon)
     86988 ± 71%     -87.1%      11205 ±118%  numa-meminfo.node1.AnonHugePages
    119650 ± 70%     -74.8%      30154 ± 71%  numa-meminfo.node1.AnonPages
      1547 ± 67%     -60.0%     618.50 ± 37%  numa-meminfo.node3.PageTables
     45893 ± 11%     -12.2%      40313 ±  8%  numa-meminfo.node3.SUnreclaim
    261.00 ± 61%     -66.0%      88.75 ± 42%  interrupts.CPU100.31:PCI-MSI.524289-edge.eth0-TxRx-0
      6.25 ± 17%    +416.0%      32.25 ±128%  interrupts.CPU131.RES:Rescheduling_interrupts
      5.25 ± 49%    +709.5%      42.50 ±120%  interrupts.CPU166.RES:Rescheduling_interrupts
    109.75 ± 10%     -64.0%      39.50 ±105%  interrupts.CPU177.NMI:Non-maskable_interrupts
    109.75 ± 10%     -64.0%      39.50 ±105%  interrupts.CPU177.PMI:Performance_monitoring_interrupts
    109.75 ± 11%     -58.5%      45.50 ±100%  interrupts.CPU178.NMI:Non-maskable_interrupts
    109.75 ± 11%     -58.5%      45.50 ±100%  interrupts.CPU178.PMI:Performance_monitoring_interrupts
     86.00 ± 17%    +284.9%     331.00 ± 61%  interrupts.CPU2.NMI:Non-maskable_interrupts
     86.00 ± 17%    +284.9%     331.00 ± 61%  interrupts.CPU2.PMI:Performance_monitoring_interrupts
    105.25 ± 28%    +290.0%     410.50 ± 68%  interrupts.CPU3.NMI:Non-maskable_interrupts
    105.25 ± 28%    +290.0%     410.50 ± 68%  interrupts.CPU3.PMI:Performance_monitoring_interrupts
    403.75 ± 11%    +262.6%       1464 ±104%  interrupts.CPU3.RES:Rescheduling_interrupts
    142.00 ± 31%    +146.1%     349.50 ± 58%  interrupts.CPU4.NMI:Non-maskable_interrupts
    142.00 ± 31%    +146.1%     349.50 ± 58%  interrupts.CPU4.PMI:Performance_monitoring_interrupts
     76.00 ± 39%    +338.5%     333.25 ± 90%  interrupts.CPU5.NMI:Non-maskable_interrupts
     76.00 ± 39%    +338.5%     333.25 ± 90%  interrupts.CPU5.PMI:Performance_monitoring_interrupts
    131.50 ± 24%    +100.8%     264.00 ± 49%  interrupts.CPU6.RES:Rescheduling_interrupts
      1124 ±172%    -100.0%       0.25 ±173%  interrupts.CPU95.TLB:TLB_shootdowns
      4474 ±  8%     +61.9%       7245 ± 37%  interrupts.NMI:Non-maskable_interrupts
      4474 ±  8%     +61.9%       7245 ± 37%  interrupts.PMI:Performance_monitoring_interrupts
     37.99 ± 10%     -18.0       20.03 ± 61%  perf-profile.calltrace.cycles-pp.apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
     33.38 ± 10%     -16.4       16.98 ± 62%  perf-profile.calltrace.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.do_idle
     18.63 ±  7%     -11.0        7.66 ± 57%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
     11.82 ±  5%      -7.9        3.94 ± 53%  perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state
      3.81 ±  9%      -3.1        0.70 ± 69%  perf-profile.calltrace.cycles-pp.perf_mux_hrtimer_handler.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
      3.13 ± 16%      -1.8        1.33 ± 75%  perf-profile.calltrace.cycles-pp.serial8250_console_putchar.uart_console_write.serial8250_console_write.console_unlock.vprintk_emit
      3.27 ± 18%      -1.7        1.54 ± 64%  perf-profile.calltrace.cycles-pp.uart_console_write.serial8250_console_write.console_unlock.vprintk_emit.printk
      3.13 ± 16%      -1.7        1.46 ± 57%  perf-profile.calltrace.cycles-pp.wait_for_xmitr.serial8250_console_putchar.uart_console_write.serial8250_console_write.console_unlock
      1.72 ± 33%      -1.0        0.67 ± 74%  perf-profile.calltrace.cycles-pp.io_serial_in.wait_for_xmitr.serial8250_console_putchar.uart_console_write.serial8250_console_write
      1.12 ± 11%      -0.7        0.44 ±101%  perf-profile.calltrace.cycles-pp.lapic_next_deadline.clockevents_program_event.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
      1.55 ± 19%      -0.7        0.89 ± 58%  perf-profile.calltrace.cycles-pp.irq_enter.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
      0.97 ± 13%      -0.6        0.38 ±100%  perf-profile.calltrace.cycles-pp.native_write_msr.lapic_next_deadline.clockevents_program_event.hrtimer_interrupt.smp_apic_timer_interrupt
     36.02 ± 10%     -16.9       19.15 ± 58%  perf-profile.children.cycles-pp.apic_timer_interrupt
     33.51 ± 10%     -16.1       17.36 ± 59%  perf-profile.children.cycles-pp.smp_apic_timer_interrupt
     18.75 ±  7%     -10.6        8.20 ± 52%  perf-profile.children.cycles-pp.hrtimer_interrupt
     11.98 ±  5%      -7.5        4.45 ± 43%  perf-profile.children.cycles-pp.__hrtimer_run_queues
     66.76 ±  4%      -7.2       59.58 ± 11%  perf-profile.children.cycles-pp.cpuidle_enter_state
      4.08 ±  9%      -3.3        0.83 ± 50%  perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
      2.83 ±  7%      -1.8        1.02 ± 36%  perf-profile.children.cycles-pp.native_write_msr
      3.52 ± 17%      -1.5        2.00 ± 50%  perf-profile.children.cycles-pp.printk
      3.52 ± 17%      -1.5        2.00 ± 50%  perf-profile.children.cycles-pp.vprintk_emit
      1.79 ± 15%      -1.5        0.34 ± 67%  perf-profile.children.cycles-pp.__intel_pmu_enable_all
      3.51 ± 18%      -1.4        2.14 ± 38%  perf-profile.children.cycles-pp.console_unlock
      3.38 ± 17%      -1.3        2.04 ± 39%  perf-profile.children.cycles-pp.serial8250_console_write
      3.27 ± 19%      -1.3        1.96 ± 39%  perf-profile.children.cycles-pp.uart_console_write
      3.24 ± 16%      -1.3        1.95 ± 41%  perf-profile.children.cycles-pp.wait_for_xmitr
      3.13 ± 17%      -1.3        1.86 ± 41%  perf-profile.children.cycles-pp.serial8250_console_putchar
      1.33 ± 22%      -1.2        0.11 ± 69%  perf-profile.children.cycles-pp.enqueue_hrtimer
      1.25 ± 23%      -1.2        0.10 ± 76%  perf-profile.children.cycles-pp.timerqueue_add
      1.00 ± 29%      -0.9        0.11 ± 74%  perf-profile.children.cycles-pp.__remove_hrtimer
      0.88 ± 29%      -0.8        0.07 ±112%  perf-profile.children.cycles-pp.timerqueue_del
      0.65 ± 30%      -0.6        0.04 ±110%  perf-profile.children.cycles-pp.rb_erase
      1.23 ± 26%      -0.6        0.65 ± 22%  perf-profile.children.cycles-pp._raw_spin_lock
      1.17 ±  8%      -0.5        0.65 ± 46%  perf-profile.children.cycles-pp.lapic_next_deadline
      1.44 ± 10%      -0.5        0.94 ± 11%  perf-profile.children.cycles-pp.read_tsc
      1.56 ± 18%      -0.5        1.07 ± 38%  perf-profile.children.cycles-pp.irq_enter
      1.03 ± 12%      -0.5        0.54 ± 43%  perf-profile.children.cycles-pp.delay_tsc
      0.60 ± 22%      -0.4        0.19 ± 64%  perf-profile.children.cycles-pp._raw_spin_lock_irq
      1.04 ± 23%      -0.4        0.64 ± 39%  perf-profile.children.cycles-pp.page_fault
      0.95 ± 23%      -0.4        0.57 ± 40%  perf-profile.children.cycles-pp.do_page_fault
      1.36 ± 12%      -0.4        1.01 ± 11%  perf-profile.children.cycles-pp.native_irq_return_iret
      0.61 ± 10%      -0.3        0.27 ± 79%  perf-profile.children.cycles-pp.timekeeping_max_deferment
      0.68 ± 27%      -0.3        0.36 ± 51%  perf-profile.children.cycles-pp.tick_check_oneshot_broadcast_this_cpu
      0.83 ± 26%      -0.3        0.52 ± 41%  perf-profile.children.cycles-pp.__handle_mm_fault
      0.84 ± 27%      -0.3        0.53 ± 41%  perf-profile.children.cycles-pp.handle_mm_fault
      0.24 ± 43%      -0.2        0.07 ±100%  perf-profile.children.cycles-pp.mmap_region
      0.27 ± 19%      -0.2        0.09 ± 30%  perf-profile.children.cycles-pp.setlocale
      0.19 ± 26%      -0.1        0.04 ±113%  perf-profile.children.cycles-pp.pipe_read
      0.33 ± 14%      -0.1        0.19 ± 38%  perf-profile.children.cycles-pp.newidle_balance
      0.17 ± 16%      -0.1        0.05 ±116%  perf-profile.children.cycles-pp.rb_next
      0.16 ± 20%      -0.1        0.07 ± 61%  perf-profile.children.cycles-pp.update_blocked_averages
      0.12 ± 32%      -0.1        0.05 ±106%  perf-profile.children.cycles-pp.fbcon_putcs
      0.09 ± 17%      -0.1        0.03 ±100%  perf-profile.children.cycles-pp.trigger_load_balance
      0.11 ± 28%      -0.1        0.05 ±106%  perf-profile.children.cycles-pp.bit_putcs
      0.01 ±173%      +0.1        0.13 ± 59%  perf-profile.children.cycles-pp.__slab_free
      0.04 ±102%      +0.4        0.40 ± 80%  perf-profile.children.cycles-pp.update_load_avg
      0.09 ± 61%      +0.7        0.77 ± 85%  perf-profile.children.cycles-pp.schedule_idle
      0.09 ± 64%      +9.5        9.54 ±113%  perf-profile.children.cycles-pp.poll_idle
      2.81 ±  7%      -1.8        1.02 ± 36%  perf-profile.self.cycles-pp.native_write_msr
      0.69 ± 30%      -0.6        0.06 ±116%  perf-profile.self.cycles-pp.timerqueue_add
      1.23 ± 26%      -0.6        0.63 ± 22%  perf-profile.self.cycles-pp._raw_spin_lock
      0.63 ± 29%      -0.6        0.04 ±106%  perf-profile.self.cycles-pp.rb_erase
      1.41 ± 10%      -0.5        0.89 ± 12%  perf-profile.self.cycles-pp.read_tsc
      1.03 ± 12%      -0.5        0.54 ± 43%  perf-profile.self.cycles-pp.delay_tsc
      0.58 ± 25%      -0.4        0.19 ± 64%  perf-profile.self.cycles-pp._raw_spin_lock_irq
      0.55 ± 23%      -0.4        0.16 ± 61%  perf-profile.self.cycles-pp.perf_mux_hrtimer_handler
      0.61 ± 10%      -0.4        0.24 ± 89%  perf-profile.self.cycles-pp.timekeeping_max_deferment
      1.36 ± 12%      -0.4        1.01 ± 11%  perf-profile.self.cycles-pp.native_irq_return_iret
      0.68 ± 27%      -0.3        0.36 ± 51%  perf-profile.self.cycles-pp.tick_check_oneshot_broadcast_this_cpu
      0.31 ± 15%      -0.2        0.15 ± 73%  perf-profile.self.cycles-pp.__hrtimer_run_queues
      0.18 ± 43%      -0.1        0.05 ±110%  perf-profile.self.cycles-pp.clockevents_program_event
      0.14 ± 26%      -0.1        0.05 ±114%  perf-profile.self.cycles-pp.rb_next
      0.11 ± 30%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.__remove_hrtimer
      0.11 ± 13%      -0.1        0.04 ±103%  perf-profile.self.cycles-pp.__note_gp_changes
      0.01 ±173%      +0.1        0.09 ± 16%  perf-profile.self.cycles-pp.tick_nohz_get_sleep_length
      0.08 ± 73%      +0.1        0.16 ±  7%  perf-profile.self.cycles-pp.find_next_bit
      0.00            +0.1        0.08 ± 28%  perf-profile.self.cycles-pp.cpuidle_enter
      0.01 ±173%      +0.1        0.13 ± 59%  perf-profile.self.cycles-pp.__slab_free
      0.08 ± 61%      +8.7        8.77 ±113%  perf-profile.self.cycles-pp.poll_idle
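
For readers checking the table by hand, here is a minimal sketch of how the
%change column can be reproduced from the two mean columns. It assumes that
%change is the relative difference between the parent commit (left) and the
tested commit (right), and that the perf-profile rows, which show no percent
sign, are absolute differences in percentage points. The function names
pct_change and point_delta are hypothetical and are not part of lkp-tests.
Rows computed from the rounded means shown here can differ from the report in
the last digit.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent (assumed meaning of %change)."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference in percentage points (assumed for perf-profile rows)."""
    return new - base


# Mean values copied from the comparison table above.
works = pct_change(736006, 869786)            # fxmark ... works
works_per_sec = pct_change(24533, 28992)      # fxmark ... works/sec
pgalloc = pct_change(2518633, 1706174)        # proc-vmstat.pgalloc_normal
mux_handler = point_delta(3.81, 0.70)         # calltrace ... perf_mux_hrtimer_handler

print(f"works:               {works:+.1f}%")
print(f"works/sec:           {works_per_sec:+.1f}%")
print(f"pgalloc_normal:      {pgalloc:+.1f}%")
print(f"perf_mux_hrtimer...: {mux_handler:+.1f}")
```

The same two helpers can be applied to any other row, keeping in mind that
rows with a large %stddev are noisy and the computed change is less meaningful
there.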


                                                                                
                        fxmark.ssd_f2fs_DRBL_1_directio.works                   
                                                                                
  900000 +------------------------------------------------------------------+   
         |                       O                                O         |   
  880000 |-+           O    O  O            O  O    O            O          |   
  860000 |-OO OO   OO         O    OO OO O    O       OO OO O O     O O     |   
         |            O  O O               O     OO                         |   
  840000 |-+     O                                             O            |   
  820000 |-+                                                                |   
         |                                                                  |   
  800000 |-+                                                                |   
  780000 |-+                                                      +         |   
         |                                                        :+        |   
  760000 |.+         .++.+.  .+       +           +.+. +.  .+.++.+  +       |   
  740000 |-++.+ .+. +      ++  +.+.+ + +.+.+     +    +  ++          :      |   
         |     +   +                +       +.+ +                    : +.++.|   
  720000 +------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                      fxmark.ssd_f2fs_DRBL_1_directio.works_sec                 
                                                                                
  30000 +-------------------------------------------------------------------+   
        |                       O                                 O         |   
  29000 |-+            O   O   O            O  O    O            O          |   
        | OO OO   OO         O    O OO OO    O       O O OO OO      O O     |   
        |            O  O O               O     O O                         |   
  28000 |-+     O                                              O            |   
        |                                                                   |   
  27000 |-+                                                                 |   
        |                                                                   |   
  26000 |-+                                                       +         |   
        |                                                         :+        |   
        |.          .+.++.  .+.      +.          .+.+ .+.  .++.+.+  +       |   
  25000 |-++.+ .+. +      ++   ++.+. : ++.+.    +    +   ++          :      |   
        |     +   +                 +       ++. :                    : +.++.|   
  24000 +-------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                   fxmark.ssd_f2fs_DRBL_1_directio.iowait_sec                   
                                                                                
  22 +----------------------------------------------------------------------+   
     |                                   OO    O    O                       |   
  21 |-OO O OO   O OO O OO O OO O OO O O    O O  O O  O OO O OO O   OO      |   
     |         O                                                            |   
     |                                                                      |   
  20 |-+                                                                    |   
     |                                                                      |   
  19 |-+                                                                    |   
     |                                                                      |   
  18 |-+                                                                    |   
     |                                                         .+.+.+       |   
     |.         .+.++.+.++.+.      +.           .+.++.+.++.+.++     :       |   
  17 |-++.+.++.+             ++.+.+  +.+.++.+. +                     :.+.+ .|   
     |                                        +                      +    + |   
  16 +----------------------------------------------------------------------+   
                                                                                
                                                                                
[+] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.6.0-rc6-00081-g90c91dfb86d0f" of type "text/plain" (204871 bytes)

View attachment "job-script" of type "text/plain" (7761 bytes)

View attachment "job.yaml" of type "text/plain" (5428 bytes)

View attachment "reproduce" of type "text/plain" (254 bytes)
