Message-ID: <20200414085853.GO8179@shao2-debian>
Date: Tue, 14 Apr 2020 16:58:53 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Andi Kleen <ak@...ux.intel.com>,
Kan Liang <kan.liang@...ux.intel.com>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: [perf/core] 90c91dfb86: fxmark.ssd_f2fs_DRBL_1_directio.works/sec 18.2% improvement
Greetings,
FYI, we noticed an 18.2% improvement of fxmark.ssd_f2fs_DRBL_1_directio.works/sec due to commit:
commit: 90c91dfb86d0ff545bd329d3ddd72c147e2ae198 ("perf/core: Fix endless multiplex timer")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: fxmark
on test machine: 192 threads Intel(R) Xeon(R) CPU @ 2.20GHz with 192G memory
with the following parameters:
disk: 1SSD
media: ssd
test: DRBL
fstype: f2fs
directio: directio
cpufreq_governor: performance
ucode: 0x400002c
Details are below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/directio/disk/fstype/kconfig/media/rootfs/tbox_group/test/testcase/ucode:
gcc-7/performance/directio/1SSD/f2fs/x86_64-rhel-7.6/ssd/debian-x86_64-20191114.cgz/lkp-csl-2ap1/DRBL/fxmark/0x400002c
commit:
d8a7386897 ("x86/optprobe: Fix OPTPROBE vs UACCESS")
90c91dfb86 ("perf/core: Fix endless multiplex timer")
d8a738689794c42c 90c91dfb86d0ff545bd329d3ddd
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
            :4           50%           2:4     dmesg.WARNING:at#for_ip_swapgs_restore_regs_and_return_to_usermode/0x
            :4           50%           2:4     dmesg.WARNING:stack_recursion
           0:4            1%           0:4     perf-profile.children.cycles-pp.error_entry
         %stddev     %change         %stddev
             \          |                \
16.79 +27.6% 21.41 fxmark.ssd_f2fs_DRBL_1_directio.iowait_sec
18.45 ± 2% -24.0% 14.02 ± 22% fxmark.ssd_f2fs_DRBL_1_directio.sys_util
2.17 ± 2% -24.9% 1.63 ± 14% fxmark.ssd_f2fs_DRBL_1_directio.user_util
736006 +18.2% 869786 fxmark.ssd_f2fs_DRBL_1_directio.works
24533 +18.2% 28992 fxmark.ssd_f2fs_DRBL_1_directio.works/sec
26010490 ± 2% +6.0% 27559732 fxmark.time.file_system_inputs
3219151 ± 2% +6.0% 3413339 fxmark.time.voluntary_context_switches
75000192 -10.2% 67344189 cpuidle.POLL.time
7.24 ± 67% -68.8% 2.26 ± 18% iostat.nvme0n1.await.max
7.27 ± 67% -68.7% 2.28 ± 17% iostat.nvme0n1.w_await.max
196991 ± 50% -74.1% 50950 ± 66% numa-numastat.node3.local_node
220900 ± 41% -62.6% 82622 ± 41% numa-numastat.node3.numa_hit
7151 ± 5% -7.1% 6640 ± 2% slabinfo.anon_vma.active_objs
3230 ± 3% -8.5% 2954 ± 4% slabinfo.files_cache.num_objs
2518633 -32.3% 1706174 ± 2% proc-vmstat.pgalloc_normal
1684 +4.7% 1763 ± 4% proc-vmstat.pgdeactivate
2505517 -31.1% 1726169 ± 3% proc-vmstat.pgfree
1586 ± 3% -96.2% 60.50 ± 66% proc-vmstat.thp_fault_alloc
71788 ± 12% -47.7% 37550 ± 67% sched_debug.cfs_rq:/.load.min
72.38 ± 12% -46.6% 38.67 ± 64% sched_debug.cfs_rq:/.load_avg.min
10.17 ± 7% -29.9% 7.12 ± 50% sched_debug.cfs_rq:/.nr_spread_over.min
355.54 ± 9% -14.7% 303.12 ± 7% sched_debug.cfs_rq:/.runnable_load_avg.max
29913 ± 70% -74.8% 7537 ± 71% numa-vmstat.node1.nr_active_anon
29915 ± 70% -74.8% 7538 ± 71% numa-vmstat.node1.nr_anon_pages
29913 ± 70% -74.8% 7537 ± 71% numa-vmstat.node1.nr_zone_active_anon
385.50 ± 68% -60.0% 154.25 ± 37% numa-vmstat.node3.nr_page_table_pages
11469 ± 11% -12.1% 10080 ± 8% numa-vmstat.node3.nr_slab_unreclaimable
119643 ± 70% -74.8% 30148 ± 71% numa-meminfo.node1.Active
119643 ± 70% -74.8% 30148 ± 71% numa-meminfo.node1.Active(anon)
86988 ± 71% -87.1% 11205 ±118% numa-meminfo.node1.AnonHugePages
119650 ± 70% -74.8% 30154 ± 71% numa-meminfo.node1.AnonPages
1547 ± 67% -60.0% 618.50 ± 37% numa-meminfo.node3.PageTables
45893 ± 11% -12.2% 40313 ± 8% numa-meminfo.node3.SUnreclaim
261.00 ± 61% -66.0% 88.75 ± 42% interrupts.CPU100.31:PCI-MSI.524289-edge.eth0-TxRx-0
6.25 ± 17% +416.0% 32.25 ±128% interrupts.CPU131.RES:Rescheduling_interrupts
5.25 ± 49% +709.5% 42.50 ±120% interrupts.CPU166.RES:Rescheduling_interrupts
109.75 ± 10% -64.0% 39.50 ±105% interrupts.CPU177.NMI:Non-maskable_interrupts
109.75 ± 10% -64.0% 39.50 ±105% interrupts.CPU177.PMI:Performance_monitoring_interrupts
109.75 ± 11% -58.5% 45.50 ±100% interrupts.CPU178.NMI:Non-maskable_interrupts
109.75 ± 11% -58.5% 45.50 ±100% interrupts.CPU178.PMI:Performance_monitoring_interrupts
86.00 ± 17% +284.9% 331.00 ± 61% interrupts.CPU2.NMI:Non-maskable_interrupts
86.00 ± 17% +284.9% 331.00 ± 61% interrupts.CPU2.PMI:Performance_monitoring_interrupts
105.25 ± 28% +290.0% 410.50 ± 68% interrupts.CPU3.NMI:Non-maskable_interrupts
105.25 ± 28% +290.0% 410.50 ± 68% interrupts.CPU3.PMI:Performance_monitoring_interrupts
403.75 ± 11% +262.6% 1464 ±104% interrupts.CPU3.RES:Rescheduling_interrupts
142.00 ± 31% +146.1% 349.50 ± 58% interrupts.CPU4.NMI:Non-maskable_interrupts
142.00 ± 31% +146.1% 349.50 ± 58% interrupts.CPU4.PMI:Performance_monitoring_interrupts
76.00 ± 39% +338.5% 333.25 ± 90% interrupts.CPU5.NMI:Non-maskable_interrupts
76.00 ± 39% +338.5% 333.25 ± 90% interrupts.CPU5.PMI:Performance_monitoring_interrupts
131.50 ± 24% +100.8% 264.00 ± 49% interrupts.CPU6.RES:Rescheduling_interrupts
1124 ±172% -100.0% 0.25 ±173% interrupts.CPU95.TLB:TLB_shootdowns
4474 ± 8% +61.9% 7245 ± 37% interrupts.NMI:Non-maskable_interrupts
4474 ± 8% +61.9% 7245 ± 37% interrupts.PMI:Performance_monitoring_interrupts
37.99 ± 10% -18.0 20.03 ± 61% perf-profile.calltrace.cycles-pp.apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
33.38 ± 10% -16.4 16.98 ± 62% perf-profile.calltrace.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.do_idle
18.63 ± 7% -11.0 7.66 ± 57% perf-profile.calltrace.cycles-pp.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
11.82 ± 5% -7.9 3.94 ± 53% perf-profile.calltrace.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state
3.81 ± 9% -3.1 0.70 ± 69% perf-profile.calltrace.cycles-pp.perf_mux_hrtimer_handler.__hrtimer_run_queues.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
3.13 ± 16% -1.8 1.33 ± 75% perf-profile.calltrace.cycles-pp.serial8250_console_putchar.uart_console_write.serial8250_console_write.console_unlock.vprintk_emit
3.27 ± 18% -1.7 1.54 ± 64% perf-profile.calltrace.cycles-pp.uart_console_write.serial8250_console_write.console_unlock.vprintk_emit.printk
3.13 ± 16% -1.7 1.46 ± 57% perf-profile.calltrace.cycles-pp.wait_for_xmitr.serial8250_console_putchar.uart_console_write.serial8250_console_write.console_unlock
1.72 ± 33% -1.0 0.67 ± 74% perf-profile.calltrace.cycles-pp.io_serial_in.wait_for_xmitr.serial8250_console_putchar.uart_console_write.serial8250_console_write
1.12 ± 11% -0.7 0.44 ±101% perf-profile.calltrace.cycles-pp.lapic_next_deadline.clockevents_program_event.hrtimer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
1.55 ± 19% -0.7 0.89 ± 58% perf-profile.calltrace.cycles-pp.irq_enter.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
0.97 ± 13% -0.6 0.38 ±100% perf-profile.calltrace.cycles-pp.native_write_msr.lapic_next_deadline.clockevents_program_event.hrtimer_interrupt.smp_apic_timer_interrupt
36.02 ± 10% -16.9 19.15 ± 58% perf-profile.children.cycles-pp.apic_timer_interrupt
33.51 ± 10% -16.1 17.36 ± 59% perf-profile.children.cycles-pp.smp_apic_timer_interrupt
18.75 ± 7% -10.6 8.20 ± 52% perf-profile.children.cycles-pp.hrtimer_interrupt
11.98 ± 5% -7.5 4.45 ± 43% perf-profile.children.cycles-pp.__hrtimer_run_queues
66.76 ± 4% -7.2 59.58 ± 11% perf-profile.children.cycles-pp.cpuidle_enter_state
4.08 ± 9% -3.3 0.83 ± 50% perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
2.83 ± 7% -1.8 1.02 ± 36% perf-profile.children.cycles-pp.native_write_msr
3.52 ± 17% -1.5 2.00 ± 50% perf-profile.children.cycles-pp.printk
3.52 ± 17% -1.5 2.00 ± 50% perf-profile.children.cycles-pp.vprintk_emit
1.79 ± 15% -1.5 0.34 ± 67% perf-profile.children.cycles-pp.__intel_pmu_enable_all
3.51 ± 18% -1.4 2.14 ± 38% perf-profile.children.cycles-pp.console_unlock
3.38 ± 17% -1.3 2.04 ± 39% perf-profile.children.cycles-pp.serial8250_console_write
3.27 ± 19% -1.3 1.96 ± 39% perf-profile.children.cycles-pp.uart_console_write
3.24 ± 16% -1.3 1.95 ± 41% perf-profile.children.cycles-pp.wait_for_xmitr
3.13 ± 17% -1.3 1.86 ± 41% perf-profile.children.cycles-pp.serial8250_console_putchar
1.33 ± 22% -1.2 0.11 ± 69% perf-profile.children.cycles-pp.enqueue_hrtimer
1.25 ± 23% -1.2 0.10 ± 76% perf-profile.children.cycles-pp.timerqueue_add
1.00 ± 29% -0.9 0.11 ± 74% perf-profile.children.cycles-pp.__remove_hrtimer
0.88 ± 29% -0.8 0.07 ±112% perf-profile.children.cycles-pp.timerqueue_del
0.65 ± 30% -0.6 0.04 ±110% perf-profile.children.cycles-pp.rb_erase
1.23 ± 26% -0.6 0.65 ± 22% perf-profile.children.cycles-pp._raw_spin_lock
1.17 ± 8% -0.5 0.65 ± 46% perf-profile.children.cycles-pp.lapic_next_deadline
1.44 ± 10% -0.5 0.94 ± 11% perf-profile.children.cycles-pp.read_tsc
1.56 ± 18% -0.5 1.07 ± 38% perf-profile.children.cycles-pp.irq_enter
1.03 ± 12% -0.5 0.54 ± 43% perf-profile.children.cycles-pp.delay_tsc
0.60 ± 22% -0.4 0.19 ± 64% perf-profile.children.cycles-pp._raw_spin_lock_irq
1.04 ± 23% -0.4 0.64 ± 39% perf-profile.children.cycles-pp.page_fault
0.95 ± 23% -0.4 0.57 ± 40% perf-profile.children.cycles-pp.do_page_fault
1.36 ± 12% -0.4 1.01 ± 11% perf-profile.children.cycles-pp.native_irq_return_iret
0.61 ± 10% -0.3 0.27 ± 79% perf-profile.children.cycles-pp.timekeeping_max_deferment
0.68 ± 27% -0.3 0.36 ± 51% perf-profile.children.cycles-pp.tick_check_oneshot_broadcast_this_cpu
0.83 ± 26% -0.3 0.52 ± 41% perf-profile.children.cycles-pp.__handle_mm_fault
0.84 ± 27% -0.3 0.53 ± 41% perf-profile.children.cycles-pp.handle_mm_fault
0.24 ± 43% -0.2 0.07 ±100% perf-profile.children.cycles-pp.mmap_region
0.27 ± 19% -0.2 0.09 ± 30% perf-profile.children.cycles-pp.setlocale
0.19 ± 26% -0.1 0.04 ±113% perf-profile.children.cycles-pp.pipe_read
0.33 ± 14% -0.1 0.19 ± 38% perf-profile.children.cycles-pp.newidle_balance
0.17 ± 16% -0.1 0.05 ±116% perf-profile.children.cycles-pp.rb_next
0.16 ± 20% -0.1 0.07 ± 61% perf-profile.children.cycles-pp.update_blocked_averages
0.12 ± 32% -0.1 0.05 ±106% perf-profile.children.cycles-pp.fbcon_putcs
0.09 ± 17% -0.1 0.03 ±100% perf-profile.children.cycles-pp.trigger_load_balance
0.11 ± 28% -0.1 0.05 ±106% perf-profile.children.cycles-pp.bit_putcs
0.01 ±173% +0.1 0.13 ± 59% perf-profile.children.cycles-pp.__slab_free
0.04 ±102% +0.4 0.40 ± 80% perf-profile.children.cycles-pp.update_load_avg
0.09 ± 61% +0.7 0.77 ± 85% perf-profile.children.cycles-pp.schedule_idle
0.09 ± 64% +9.5 9.54 ±113% perf-profile.children.cycles-pp.poll_idle
2.81 ± 7% -1.8 1.02 ± 36% perf-profile.self.cycles-pp.native_write_msr
0.69 ± 30% -0.6 0.06 ±116% perf-profile.self.cycles-pp.timerqueue_add
1.23 ± 26% -0.6 0.63 ± 22% perf-profile.self.cycles-pp._raw_spin_lock
0.63 ± 29% -0.6 0.04 ±106% perf-profile.self.cycles-pp.rb_erase
1.41 ± 10% -0.5 0.89 ± 12% perf-profile.self.cycles-pp.read_tsc
1.03 ± 12% -0.5 0.54 ± 43% perf-profile.self.cycles-pp.delay_tsc
0.58 ± 25% -0.4 0.19 ± 64% perf-profile.self.cycles-pp._raw_spin_lock_irq
0.55 ± 23% -0.4 0.16 ± 61% perf-profile.self.cycles-pp.perf_mux_hrtimer_handler
0.61 ± 10% -0.4 0.24 ± 89% perf-profile.self.cycles-pp.timekeeping_max_deferment
1.36 ± 12% -0.4 1.01 ± 11% perf-profile.self.cycles-pp.native_irq_return_iret
0.68 ± 27% -0.3 0.36 ± 51% perf-profile.self.cycles-pp.tick_check_oneshot_broadcast_this_cpu
0.31 ± 15% -0.2 0.15 ± 73% perf-profile.self.cycles-pp.__hrtimer_run_queues
0.18 ± 43% -0.1 0.05 ±110% perf-profile.self.cycles-pp.clockevents_program_event
0.14 ± 26% -0.1 0.05 ±114% perf-profile.self.cycles-pp.rb_next
0.11 ± 30% -0.1 0.03 ±100% perf-profile.self.cycles-pp.__remove_hrtimer
0.11 ± 13% -0.1 0.04 ±103% perf-profile.self.cycles-pp.__note_gp_changes
0.01 ±173% +0.1 0.09 ± 16% perf-profile.self.cycles-pp.tick_nohz_get_sleep_length
0.08 ± 73% +0.1 0.16 ± 7% perf-profile.self.cycles-pp.find_next_bit
0.00 +0.1 0.08 ± 28% perf-profile.self.cycles-pp.cpuidle_enter
0.01 ±173% +0.1 0.13 ± 59% perf-profile.self.cycles-pp.__slab_free
0.08 ± 61% +8.7 8.77 ±113% perf-profile.self.cycles-pp.poll_idle
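The %change column above can be cross-checked from the two per-commit means. The sketch below is only an illustration, not part of the lkp tooling: the function names are hypothetical, and it assumes that ordinary rows report a relative change in percent while perf-profile rows report an absolute difference in percentage points, which is consistent with the numbers shown.

```python
# Sketch only: recompute the middle column of the comparison table.
# Assumption (not stated in the report): ordinary metrics use relative
# change in percent; perf-profile rows use an absolute point difference.

def relative_change_pct(base: float, patched: float) -> float:
    """Relative change from base to patched, in percent."""
    return (patched - base) / base * 100.0


def point_delta(base: float, patched: float) -> float:
    """Absolute difference, used for rows that are already percentages."""
    return patched - base


# Values copied from the table (d8a7386897 mean, 90c91dfb86 mean).
works_per_sec = (24533, 28992)
works = (736006, 869786)
apic_timer_calltrace = (37.99, 20.03)   # perf-profile.calltrace row
perf_mux_children = (4.08, 0.83)        # perf-profile.children row

print(f"works/sec: {relative_change_pct(*works_per_sec):+.1f}%")
print(f"works:     {relative_change_pct(*works):+.1f}%")
print(f"apic_timer_interrupt:     {point_delta(*apic_timer_calltrace):+.1f}")
print(f"perf_mux_hrtimer_handler: {point_delta(*perf_mux_children):+.1f}")
```

Recomputing from the rounded means can differ from the reported figure in the last digit. For example, the iowait_sec row gives about +27.5% from 16.79 and 21.41, while the report shows +27.6%, presumably because the report works from unrounded means.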
[ASCII plot: fxmark.ssd_f2fs_DRBL_1_directio.works; y-axis 720000 to 900000; O samples at roughly 840000 to 890000, + samples at roughly 730000 to 780000]
[ASCII plot: fxmark.ssd_f2fs_DRBL_1_directio.works_sec; y-axis 24000 to 30000; O samples at roughly 28000 to 29500, + samples at roughly 24500 to 26000]
[ASCII plot: fxmark.ssd_f2fs_DRBL_1_directio.iowait_sec; y-axis 16 to 22; O samples at roughly 20.5 to 21.5, + samples at roughly 16.5 to 17.5]
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.6.0-rc6-00081-g90c91dfb86d0f" of type "text/plain" (204871 bytes)
View attachment "job-script" of type "text/plain" (7761 bytes)
View attachment "job.yaml" of type "text/plain" (5428 bytes)
View attachment "reproduce" of type "text/plain" (254 bytes)