Message-ID: <202301301612.67c70c9-yujie.liu@intel.com>
Date: Mon, 30 Jan 2023 17:41:46 +0800
From: kernel test robot <yujie.liu@...el.com>
To: Kishon Vijay Abraham I <kvijayab@....com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
<linux-kernel@...r.kernel.org>, <x86@...nel.org>,
Borislav Petkov <bp@...en8.de>, Leo Duran <leo.duran@....com>,
Zhang Rui <rui.zhang@...el.com>,
"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
<linux-pm@...r.kernel.org>, <ying.huang@...el.com>,
<feng.tang@...el.com>, <zhengjun.xing@...ux.intel.com>,
<fengwei.yin@...el.com>
Subject: [tip:x86/boot] [x86/acpi/boot] e2869bd7af:
stress-ng.uprobe.ops_per_sec 29.4% improvement
Greetings,
FYI, we noticed a 29.4% improvement in stress-ng.uprobe.ops_per_sec due to commit:
commit: e2869bd7af608c343988429ceb1c2fe99644a01f ("x86/acpi/boot: Do not register processors that cannot be onlined for x2APIC")
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git x86/boot
in testcase: stress-ng
on test machine: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory
with the following parameters:
nr_threads: 100%
testtime: 60s
class: cpu
test: uprobe
cpufreq_governor: performance
Details are as below:
=========================================================================================
class/compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
cpu/gcc-11/performance/x86_64-rhel-8.3/100%/debian-11.1-x86_64-20220510.cgz/lkp-ivb-2ep1/uprobe/stress-ng/60s
commit:
5353fff29e ("scripts/head-object-list: Remove x86 from the list")
e2869bd7af ("x86/acpi/boot: Do not register processors that cannot be onlined for x2APIC")
5353fff29e42d0ef e2869bd7af608c343988429ceb1
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
2628 -1.1% 2598 stress-ng.time.system_time
217951 +29.4% 281951 stress-ng.uprobe.ops
3562 +29.4% 4611 stress-ng.uprobe.ops_per_sec
0.33 ± 6% +0.0 0.37 ± 2% mpstat.cpu.all.usr%
12814 -14.9% 10907 meminfo.KernelStack
77368 -72.7% 21099 meminfo.Percpu
188037 -19.0% 152278 meminfo.VmallocUsed
69774 ± 9% +961.8% 740875 ±149% numa-meminfo.node0.FilePages
66328 ± 10% +1012.1% 737664 ±149% numa-meminfo.node0.Unevictable
6155 ± 7% -17.8% 5058 ± 11% numa-meminfo.node1.KernelStack
17443 ± 9% +961.8% 185218 ±149% numa-vmstat.node0.nr_file_pages
16581 ± 10% +1012.2% 184416 ±149% numa-vmstat.node0.nr_unevictable
16581 ± 10% +1012.2% 184416 ±149% numa-vmstat.node0.nr_zone_unevictable
6154 ± 7% -17.8% 5058 ± 11% numa-vmstat.node1.nr_kernel_stack
12821 -14.9% 10911 proc-vmstat.nr_kernel_stack
23295 -3.3% 22529 proc-vmstat.nr_slab_reclaimable
26823 -6.5% 25084 proc-vmstat.nr_slab_unreclaimable
296402 -1.9% 290725 proc-vmstat.numa_hit
52723 -1.3% 52059 proc-vmstat.numa_other
271549 ± 2% -2.9% 263567 proc-vmstat.pgfault
0.45 ± 3% -15.9% 0.38 ± 12% sched_debug.cfs_rq:/.h_nr_running.stddev
20849 ± 13% -22.6% 16144 ± 6% sched_debug.cfs_rq:/.min_vruntime.stddev
0.43 ± 3% -17.4% 0.35 ± 19% sched_debug.cfs_rq:/.nr_running.stddev
20867 ± 13% -22.6% 16151 ± 6% sched_debug.cfs_rq:/.spread0.stddev
287.77 ± 2% -20.9% 227.68 ± 10% sched_debug.cfs_rq:/.util_est_enqueued.stddev
1554 ± 4% -17.3% 1285 ± 19% sched_debug.cpu.curr->pid.stddev
0.45 ± 2% -14.6% 0.38 ± 12% sched_debug.cpu.nr_running.stddev
12671977 +8.9% 13802380 perf-stat.i.branch-misses
2731612 ± 11% +16.3% 3176876 ± 4% perf-stat.i.cache-misses
23841254 ± 4% +16.4% 27741896 perf-stat.i.cache-references
71372 ± 6% -22.2% 55528 ± 5% perf-stat.i.cycles-between-cache-misses
340368 ± 7% +23.4% 419908 ± 10% perf-stat.i.dTLB-store-misses
4.889e+08 ± 3% +13.0% 5.527e+08 perf-stat.i.dTLB-stores
26279 ± 6% +15.0% 30234 ± 3% perf-stat.i.iTLB-loads
240619 ± 3% -7.8% 221882 ± 5% perf-stat.i.instructions-per-iTLB-miss
521.14 ± 3% +18.0% 615.09 ± 2% perf-stat.i.metric.K/sec
885932 ± 17% +33.7% 1184229 ± 4% perf-stat.i.node-load-misses
997827 ± 17% +33.1% 1328416 ± 4% perf-stat.i.node-loads
474784 ± 10% +32.0% 626649 ± 5% perf-stat.i.node-store-misses
651001 ± 11% +28.9% 839098 ± 4% perf-stat.i.node-stores
0.35 ± 6% +14.1% 0.40 ± 2% perf-stat.overall.MPKI
94.34 -1.0 93.35 perf-stat.overall.iTLB-load-miss-rate%
12473442 +8.8% 13570687 perf-stat.ps.branch-misses
2690267 ± 11% +16.2% 3125339 ± 4% perf-stat.ps.cache-misses
23486253 ± 4% +16.3% 27310650 perf-stat.ps.cache-references
335215 ± 7% +23.3% 413395 ± 10% perf-stat.ps.dTLB-store-misses
4.817e+08 ± 3% +13.0% 5.442e+08 perf-stat.ps.dTLB-stores
25873 ± 6% +15.0% 29754 ± 3% perf-stat.ps.iTLB-loads
873032 ± 17% +33.5% 1165539 ± 4% perf-stat.ps.node-load-misses
983400 ± 17% +33.0% 1307445 ± 4% perf-stat.ps.node-loads
467618 ± 10% +31.9% 616759 ± 5% perf-stat.ps.node-store-misses
641166 ± 11% +28.8% 825704 ± 4% perf-stat.ps.node-stores
1.44 -0.2 1.23 perf-profile.calltrace.cycles-pp.trace_find_next_entry_inc.tracing_read_pipe.vfs_read.ksys_read.do_syscall_64
1.41 -0.2 1.22 perf-profile.calltrace.cycles-pp.__find_next_entry.trace_find_next_entry_inc.tracing_read_pipe.vfs_read.ksys_read
0.94 +0.1 1.02 ± 2% perf-profile.calltrace.cycles-pp.ring_buffer_empty_cpu.__find_next_entry.trace_find_next_entry_inc.tracing_read_pipe.vfs_read
0.00 +0.6 0.55 ± 6% perf-profile.calltrace.cycles-pp.tracing_wait_pipe.tracing_read_pipe.vfs_read.ksys_read.do_syscall_64
0.00 +0.6 0.57 perf-profile.calltrace.cycles-pp.trace_print_context.print_trace_fmt.tracing_read_pipe.vfs_read.ksys_read
0.00 +0.6 0.59 ± 2% perf-profile.calltrace.cycles-pp.print_trace_fmt.tracing_read_pipe.vfs_read.ksys_read.do_syscall_64
0.30 ± 2% -0.2 0.08 perf-profile.children.cycles-pp._find_next_bit
1.44 -0.2 1.23 perf-profile.children.cycles-pp.trace_find_next_entry_inc
1.44 -0.2 1.23 perf-profile.children.cycles-pp.__find_next_entry
0.07 ± 5% +0.0 0.10 ± 5% perf-profile.children.cycles-pp.memcpy_erms
0.08 ± 5% +0.0 0.11 ± 6% perf-profile.children.cycles-pp.ring_buffer_empty
0.10 ± 5% +0.0 0.12 ± 4% perf-profile.children.cycles-pp.trace_print_lat_fmt
0.11 ± 7% +0.0 0.14 ± 6% perf-profile.children.cycles-pp.number
0.04 ± 57% +0.0 0.07 perf-profile.children.cycles-pp.trace_event_buffer_reserve
0.02 ±100% +0.0 0.06 perf-profile.children.cycles-pp.trace_event_buffer_lock_reserve
0.15 ± 5% +0.0 0.18 ± 6% perf-profile.children.cycles-pp.print_uprobe_event
0.02 ±100% +0.0 0.06 ± 6% perf-profile.children.cycles-pp.ring_buffer_peek
0.04 ± 58% +0.0 0.08 ± 8% perf-profile.children.cycles-pp.peek_next_entry
0.09 ± 7% +0.0 0.13 ± 12% perf-profile.children.cycles-pp.finish_wait
0.01 ±173% +0.0 0.06 ± 7% perf-profile.children.cycles-pp.__select
0.15 ± 4% +0.0 0.20 ± 2% perf-profile.children.cycles-pp.format_decode
0.10 ± 4% +0.0 0.14 ± 7% perf-profile.children.cycles-pp.__uprobe_trace_func
0.00 +0.1 0.05 perf-profile.children.cycles-pp.ring_buffer_lock_reserve
0.12 ± 5% +0.1 0.17 ± 8% perf-profile.children.cycles-pp.prepare_to_wait
0.00 +0.1 0.05 ± 8% perf-profile.children.cycles-pp.rb_buffer_peek
0.00 +0.1 0.05 ± 8% perf-profile.children.cycles-pp.trace_event_buffer_commit
0.14 ± 3% +0.1 0.20 ± 8% perf-profile.children.cycles-pp.handler_chain
0.14 ± 3% +0.1 0.20 ± 6% perf-profile.children.cycles-pp.uprobe_dispatcher
0.16 ± 3% +0.1 0.22 ± 6% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
0.16 ± 3% +0.1 0.22 ± 5% perf-profile.children.cycles-pp.irqentry_exit_to_user_mode
0.15 +0.1 0.21 ± 7% perf-profile.children.cycles-pp.exit_to_user_mode_loop
0.15 ± 2% +0.1 0.22 ± 6% perf-profile.children.cycles-pp.asm_exc_int3
0.14 ± 3% +0.1 0.21 ± 7% perf-profile.children.cycles-pp.uprobe_notify_resume
0.18 ± 8% +0.1 0.26 ± 3% perf-profile.children.cycles-pp.trace_empty
0.14 ± 3% +0.1 0.21 ± 6% perf-profile.children.cycles-pp.rb_set_head_page
0.16 ± 3% +0.1 0.23 ± 5% perf-profile.children.cycles-pp.__getpid
0.22 ± 6% +0.1 0.31 ± 8% perf-profile.children.cycles-pp._raw_spin_lock_irqsave
0.29 ± 3% +0.1 0.41 ± 6% perf-profile.children.cycles-pp.ring_buffer_wait
0.30 ± 3% +0.1 0.42 ± 4% perf-profile.children.cycles-pp.rb_per_cpu_empty
0.44 ± 2% +0.1 0.57 ± 2% perf-profile.children.cycles-pp.trace_print_context
0.48 ± 3% +0.1 0.61 perf-profile.children.cycles-pp.vsnprintf
0.46 ± 3% +0.1 0.59 ± 2% perf-profile.children.cycles-pp.print_trace_fmt
1.13 +0.1 1.26 perf-profile.children.cycles-pp.ring_buffer_empty_cpu
0.48 ± 3% +0.1 0.62 perf-profile.children.cycles-pp.seq_buf_vprintf
0.33 ± 5% +0.1 0.47 ± 7% perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
0.51 ± 2% +0.1 0.66 perf-profile.children.cycles-pp.trace_seq_printf
0.40 ± 3% +0.2 0.55 ± 6% perf-profile.children.cycles-pp.tracing_wait_pipe
0.51 ± 3% +0.2 0.68 perf-profile.children.cycles-pp._raw_spin_lock
0.29 ± 2% -0.2 0.08 ± 5% perf-profile.self.cycles-pp._find_next_bit
0.39 ± 3% -0.1 0.28 perf-profile.self.cycles-pp.ring_buffer_empty_cpu
0.16 ± 2% -0.1 0.07 ± 11% perf-profile.self.cycles-pp.__find_next_entry
0.07 ± 5% +0.0 0.10 ± 5% perf-profile.self.cycles-pp.memcpy_erms
0.12 ± 3% +0.0 0.14 ± 3% perf-profile.self.cycles-pp.vsnprintf
0.09 ± 8% +0.0 0.12 ± 3% perf-profile.self.cycles-pp.number
0.14 ± 3% +0.0 0.17 ± 2% perf-profile.self.cycles-pp.format_decode
0.17 ± 4% +0.0 0.22 ± 3% perf-profile.self.cycles-pp.rb_per_cpu_empty
0.13 ± 3% +0.1 0.20 ± 8% perf-profile.self.cycles-pp.rb_set_head_page
0.39 ± 3% +0.1 0.51 perf-profile.self.cycles-pp._raw_spin_lock
0.33 ± 6% +0.1 0.47 ± 7% perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
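As a sanity check on the table above, the %change column can be recomputed from the two value columns. The snippet below is a minimal sketch and not part of the lkp-tests tooling: the helper names percent_change and point_delta are hypothetical, and it assumes that rows for ordinary counters report a relative change in percent, while rows for metrics that are already percentages (mpstat, perf-profile, miss rates) report an absolute difference in percentage points.

```python
def percent_change(base: float, patched: float) -> float:
    """Relative change of the patched value versus the base value, in percent."""
    return (patched - base) / base * 100.0


def point_delta(base: float, patched: float) -> float:
    """Absolute difference, assumed to be used for rows that are already percentages."""
    return patched - base


# Values copied from the comparison table (base commit, patched commit).
rows = {
    "stress-ng.uprobe.ops": (217951, 281951),
    "stress-ng.uprobe.ops_per_sec": (3562, 4611),
    "meminfo.KernelStack": (12814, 10907),
    "meminfo.Percpu": (77368, 21099),
}

for name, (base, patched) in rows.items():
    print(f"{percent_change(base, patched):+6.1f}%  {name}")

# perf-profile rows are percentages of cycles, so the middle column is a delta.
print(f"{point_delta(0.30, 0.08):+.1f}  perf-profile.children.cycles-pp._find_next_bit")
```

Under these assumptions the recomputed figures agree with the reported ones, for example +29.4% for stress-ng.uprobe.ops and -72.7% for meminfo.Percpu.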
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
View attachment "config-6.2.0-rc3-00003-ge2869bd7af60" of type "text/plain" (166944 bytes)
View attachment "job-script" of type "text/plain" (8055 bytes)
View attachment "job.yaml" of type "text/plain" (5554 bytes)
View attachment "reproduce" of type "text/plain" (339 bytes)