Message-ID: <202504241621.f27743ec-lkp@intel.com>
Date: Thu, 24 Apr 2025 16:39:43 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Kairui Song <kasong@...cent.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>, Chris Li <chrisl@...nel.org>,
"Huang, Ying" <ying.huang@...ux.alibaba.com>, Baoquan He <bhe@...hat.com>,
Barry Song <v-songbaohua@...o.com>, Hugh Dickins <hughd@...gle.com>,
"Johannes Weiner" <hannes@...xchg.org>, Kalesh Singh
<kaleshsingh@...gle.com>, Nhat Pham <nphamcs@...il.com>, Ryan Roberts
<ryan.roberts@....com>, Yosry Ahmed <yosryahmed@...gle.com>,
<linux-mm@...ck.org>, <oliver.sang@...el.com>
Subject: [linus:master] [mm, swap] 7277433096: swapin.throughput 33.0%
regression
Hello,

Note: per the commit message, this regression is expected. We are still
sending out this report FYI, to document the possible impact of this change.

The details below are provided for your information only.
kernel test robot noticed a 33.0% regression of swapin.throughput on:
commit: 7277433096f6ce4a84a1620529ac4ba3e1041ee1 ("mm, swap: remove old allocation path for HDD")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[still regression on linus/master bc3372351d0c8b2726b7d4229b878342e3e6b0e8]
[still regression on linux-next/master 6ac908f24cd7ddae52c496bbc888e97ee7b033ac]
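For context on why an HDD setup regresses: the commit drops the legacy
lowest-free-slot allocation path that rotational disks benefited from, leaving
only the cluster allocator. The toy C program below is an illustrative sketch
only (not kernel code; the 16-slot cluster size and layout are invented for
the example). It shows how per-CPU cluster allocation can interleave
concurrent swapouts across distant slots, where a plain sequential scan would
have kept them adjacent and the later swapin I/O sequential:

    /* Sketch only -- NOT the kernel's swap allocator. */
    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    #define NR_SLOTS 64
    static bool slot_used[NR_SLOTS];

    /* Old HDD path (conceptually): scan for the lowest free slot, so
     * consecutive pages land in consecutive slots on disk. */
    static int alloc_sequential(void)
    {
        for (size_t i = 0; i < NR_SLOTS; i++)
            if (!slot_used[i]) { slot_used[i] = true; return (int)i; }
        return -1;
    }

    /* Cluster-style (simplified): each CPU allocates from its own chunk,
     * so two tasks swapping at once scatter slots across the disk. */
    static int alloc_clustered(int cpu)
    {
        size_t base = (size_t)cpu * 16;   /* hypothetical 16-slot cluster */
        for (size_t i = base; i < base + 16 && i < NR_SLOTS; i++)
            if (!slot_used[i]) { slot_used[i] = true; return (int)i; }
        return alloc_sequential();        /* fall back when cluster is full */
    }

    int main(void)
    {
        for (int i = 0; i < 8; i++)
            printf("cpu%d -> slot %d\n", i & 1, alloc_clustered(i & 1));
        return 0;
    }

Running it prints cpu0 taking slots 0-3 interleaved with cpu1 taking 16-19:
the kind of on-disk layout that turns HDD swapin readahead into seeks.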
testcase: swapin
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 16 threads 1 sockets Intel(R) Xeon(R) E-2278G CPU @ 3.40GHz (Coffee Lake-E) with 32G memory
parameters:
disk: 1HDD
size: 8G
nr_task: 8
cpufreq_governor: performance
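For a rough picture of what the testcase measures, here is a minimal sketch of
a swapin-style microbenchmark (an assumption about the benchmark's general
shape, not the lkp-tests source; the size is scaled down from the 8G above):
populate an anonymous mapping, hint it out to swap with MADV_PAGEOUT (Linux
5.4+), then time a pass that touches one byte per page so each access faults
the page back in.

    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <time.h>
    #include <unistd.h>

    int main(void)
    {
        size_t sz = 1UL << 30;            /* 1G here; the report used 8G */
        long page = sysconf(_SC_PAGESIZE);
        char *buf = mmap(NULL, sz, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED) { perror("mmap"); return 1; }

        memset(buf, 1, sz);               /* populate the working set */
        if (madvise(buf, sz, MADV_PAGEOUT))   /* push pages out to swap */
            perror("madvise");

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        volatile unsigned char sink = 0;
        for (size_t off = 0; off < sz; off += (size_t)page)
            sink += (unsigned char)buf[off];  /* each read may fault a page in */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double secs = (t1.tv_sec - t0.tv_sec)
                    + (t1.tv_nsec - t0.tv_nsec) / 1e9;
        printf("touched %zu MB in %.2fs (%.1f MB/s)\n",
               sz >> 20, secs, (double)(sz >> 20) / secs);
        return 0;
    }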
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202504241621.f27743ec-lkp@intel.com
Details are as below:
-------------------------------------------------------------------------------------------------->
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250424/202504241621.f27743ec-lkp@intel.com
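The archive should include the job file and a reproduce script. With
lkp-tests, reproduction usually follows the standard 0-Day workflow, roughly
as below (file names come from the archive and may differ):

    git clone https://github.com/intel/lkp-tests.git
    cd lkp-tests
    sudo bin/lkp install job.yaml   # install dependencies for the job file
    sudo bin/lkp run job.yaml       # run the swapin job configured as above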
=========================================================================================
compiler/cpufreq_governor/disk/kconfig/nr_task/rootfs/size/tbox_group/testcase:
gcc-12/performance/1HDD/x86_64-rhel-9.4/8/debian-12-x86_64-20240206.cgz/8G/lkp-cfl-e1/swapin
commit:
e027ec414f ("mm, swap: fold swap_info_get_cont in the only caller")
7277433096 ("mm, swap: remove old allocation path for HDD")
e027ec414fe8f540 7277433096f6ce4a84a1620529a
---------------- ---------------------------
%stddev %change %stddev
\ | \
5.409e+09 +51.8% 8.212e+09 ± 5% cpuidle..time
1037299 +18.5% 1228926 ± 12% cpuidle..usage
68.75 +9.2% 75.07 ± 2% iostat.cpu.idle
30.53 ± 2% -19.8% 24.47 ± 8% iostat.cpu.iowait
797.42 ± 4% +22.0% 972.57 ± 7% uptime.boot
9703 ± 4% +26.4% 12265 ± 8% uptime.idle
20243 ± 57% +248.6% 70559 ± 30% meminfo.Inactive
20238 ± 57% +248.6% 70559 ± 30% meminfo.Inactive(anon)
24668 +11.3% 27444 meminfo.Shmem
0.01 ± 14% +33.3% 0.02 ± 5% perf-sched.sch_delay.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
7023 ± 3% -13.1% 6105 ± 8% perf-sched.total_wait_and_delay.count.ms
4615 ± 4% -18.8% 3747 ± 12% perf-sched.wait_and_delay.count.io_schedule.folio_wait_bit_common.__folio_lock_or_retry.do_swap_page
68.63 +6.4 75.02 ± 2% mpstat.cpu.all.idle%
30.66 ± 2% -6.1 24.54 ± 8% mpstat.cpu.all.iowait%
0.09 -0.0 0.06 ± 4% mpstat.cpu.all.irq%
0.05 -0.0 0.04 ± 3% mpstat.cpu.all.soft%
0.28 -0.1 0.20 ± 5% mpstat.cpu.all.sys%
0.29 ± 21% -0.1 0.15 ± 4% mpstat.cpu.all.usr%
0.09 ± 12% -8.7% 0.08 swapin.free_time
3435 ± 2% -33.7% 2277 ± 5% swapin.median
47.13 ± 6% +8.4 55.50 ± 7% swapin.stddev%
26855 ± 2% -33.0% 18001 ± 5% swapin.throughput
339.27 +51.6% 514.32 ± 5% swapin.time.elapsed_time
339.27 +51.6% 514.32 ± 5% swapin.time.elapsed_time.max
12785 +51.4% 19352 ± 5% swapin.time.minor_page_faults
68.81 +9.1% 75.06 ± 2% vmstat.cpu.id
30.57 ± 2% -19.6% 24.56 ± 8% vmstat.cpu.wa
24857 -33.8% 16465 ± 5% vmstat.io.bi
1.30 ± 5% -7.1% 1.21 ± 5% vmstat.procs.r
24857 -33.8% 16465 ± 5% vmstat.swap.si
7.12 ± 7% -31.6% 4.87 ± 14% vmstat.swap.so
2401 -28.3% 1721 ± 3% vmstat.system.cs
2399 -24.5% 1811 ± 2% vmstat.system.in
936464 +6.4% 996818 proc-vmstat.nr_active_anon
922127 +8.3% 998395 proc-vmstat.nr_anon_pages
612143 -1.2% 605002 proc-vmstat.nr_dirty_background_threshold
1225784 -1.2% 1211485 proc-vmstat.nr_dirty_threshold
1803299 +3.7% 1869638 proc-vmstat.nr_file_pages
6195937 -1.2% 6124422 proc-vmstat.nr_free_pages
5059 ± 57% +248.6% 17639 ± 30% proc-vmstat.nr_inactive_anon
5486 -1.4% 5410 proc-vmstat.nr_page_table_pages
6172 +11.2% 6866 proc-vmstat.nr_shmem
14607 +5.3% 15384 proc-vmstat.nr_slab_unreclaimable
920671 +7.1% 986314 ± 2% proc-vmstat.nr_swapcached
4413040 +1.8% 4493858 proc-vmstat.nr_vmscan_write
936464 +6.4% 996818 proc-vmstat.nr_zone_active_anon
5059 ± 57% +248.6% 17639 ± 30% proc-vmstat.nr_zone_inactive_anon
2535302 +5.8% 2682756 proc-vmstat.numa_hit
2535395 +5.8% 2681644 proc-vmstat.numa_local
2543613 +5.9% 2694193 proc-vmstat.pgalloc_normal
2629484 +8.0% 2838539 proc-vmstat.pgfault
2520197 +6.0% 2672439 proc-vmstat.pgfree
293008 -6.8% 273047 proc-vmstat.pgmajfault
25587 ± 2% +41.3% 36143 ± 5% proc-vmstat.pgreuse
933820 +8.8% 1015820 proc-vmstat.swap_ra
919094 +9.8% 1008935 proc-vmstat.swap_ra_hit
16587 -4.3% 15880 proc-vmstat.workingset_nodes
1088601 +7.4% 1169575 proc-vmstat.workingset_refault_anon
2.328e+08 ± 24% -43.2% 1.323e+08 ± 30% sched_debug.cfs_rq:/.avg_vruntime.max
62177217 ± 36% -43.0% 35460373 ± 30% sched_debug.cfs_rq:/.avg_vruntime.stddev
2.328e+08 ± 24% -43.2% 1.323e+08 ± 30% sched_debug.cfs_rq:/.min_vruntime.max
62177217 ± 36% -43.0% 35460373 ± 30% sched_debug.cfs_rq:/.min_vruntime.stddev
210.52 ± 63% +130.6% 485.44 ± 32% sched_debug.cfs_rq:/.removed.load_avg.max
4.68 ± 79% +185.3% 13.36 ± 37% sched_debug.cfs_rq:/.removed.runnable_avg.avg
69.33 ± 72% +195.8% 205.12 ± 38% sched_debug.cfs_rq:/.removed.runnable_avg.max
17.15 ± 74% +192.5% 50.18 ± 38% sched_debug.cfs_rq:/.removed.runnable_avg.stddev
4.58 ± 84% +188.0% 13.19 ± 37% sched_debug.cfs_rq:/.removed.util_avg.avg
67.72 ± 77% +200.5% 203.49 ± 37% sched_debug.cfs_rq:/.removed.util_avg.max
16.76 ± 79% +196.5% 49.70 ± 37% sched_debug.cfs_rq:/.removed.util_avg.stddev
35.31 ± 24% -40.4% 21.03 ± 31% sched_debug.cfs_rq:/.runnable_avg.min
167.49 ± 2% -23.7% 127.74 ± 12% sched_debug.cfs_rq:/.util_avg.avg
602113 ± 5% +16.6% 702121 ± 8% sched_debug.cpu.clock.avg
602114 ± 5% +16.6% 702121 ± 8% sched_debug.cpu.clock.max
602112 ± 5% +16.6% 702120 ± 8% sched_debug.cpu.clock.min
601358 ± 5% +16.6% 701428 ± 8% sched_debug.cpu.clock_task.avg
601962 ± 5% +16.6% 701955 ± 8% sched_debug.cpu.clock_task.max
597107 ± 5% +16.9% 698032 ± 8% sched_debug.cpu.clock_task.min
1227 ± 5% -16.4% 1026 ± 10% sched_debug.cpu.clock_task.stddev
4030 ± 3% +33.1% 5364 ± 2% sched_debug.cpu.curr->pid.max
1037 ± 5% +28.9% 1336 ± 3% sched_debug.cpu.curr->pid.stddev
46279 ± 5% -19.9% 37088 ± 2% sched_debug.cpu.nr_switches.avg
602113 ± 5% +16.6% 702121 ± 8% sched_debug.cpu_clk
601390 ± 5% +16.6% 701418 ± 8% sched_debug.ktime
602514 ± 5% +16.6% 702513 ± 8% sched_debug.sched_clk
50.92 ± 3% +13.3% 57.67 ± 2% perf-stat.i.MPKI
96852867 ± 12% -41.7% 56470262 ± 4% perf-stat.i.branch-instructions
3.59 ± 2% -0.2 3.39 perf-stat.i.branch-miss-rate%
4517791 ± 19% -46.8% 2401426 ± 4% perf-stat.i.branch-misses
45.08 +3.6 48.70 perf-stat.i.cache-miss-rate%
11444503 -25.3% 8552014 ± 5% perf-stat.i.cache-misses
26646448 -30.4% 18535433 ± 3% perf-stat.i.cache-references
2380 -28.5% 1703 ± 3% perf-stat.i.context-switches
1.99 ± 2% +4.4% 2.08 perf-stat.i.cpi
6.453e+08 ± 7% -36.1% 4.124e+08 ± 4% perf-stat.i.cpu-cycles
37.44 ± 3% -29.0% 26.60 ± 3% perf-stat.i.cpu-migrations
74.34 ± 4% -18.6% 60.51 ± 4% perf-stat.i.cycles-between-cache-misses
4.335e+08 ± 14% -42.5% 2.493e+08 ± 4% perf-stat.i.instructions
0.56 ± 3% -7.2% 0.52 perf-stat.i.ipc
865.19 -38.4% 532.92 ± 5% perf-stat.i.major-faults
6605 -27.4% 4795 ± 3% perf-stat.i.minor-faults
7470 -28.7% 5328 ± 4% perf-stat.i.page-faults
4.62 ± 6% -1.1 3.55 ± 44% perf-stat.overall.branch-miss-rate%
56.43 ± 6% -28.5% 40.36 ± 44% perf-stat.overall.cycles-between-cache-misses
0.67 ± 6% -24.7% 0.50 ± 44% perf-stat.overall.ipc
96637819 ± 12% -51.6% 46743598 ± 45% perf-stat.ps.branch-instructions
4506778 ± 19% -55.9% 1988844 ± 45% perf-stat.ps.branch-misses
11410417 -38.1% 7059671 ± 45% perf-stat.ps.cache-misses
26570647 -42.3% 15341807 ± 44% perf-stat.ps.cache-references
2373 -40.5% 1411 ± 44% perf-stat.ps.context-switches
6.442e+08 ± 7% -47.0% 3.417e+08 ± 44% perf-stat.ps.cpu-cycles
37.34 ± 3% -41.0% 22.03 ± 44% perf-stat.ps.cpu-migrations
4.326e+08 ± 14% -52.3% 2.064e+08 ± 45% perf-stat.ps.instructions
862.62 -49.0% 440.19 ± 45% perf-stat.ps.major-faults
6586 -39.8% 3967 ± 44% perf-stat.ps.minor-faults
7449 -40.8% 4407 ± 44% perf-stat.ps.page-faults
6.51 ± 7% -1.2 5.26 ± 5% perf-profile.calltrace.cycles-pp.do_access
5.54 ± 6% -1.0 4.54 ± 5% perf-profile.calltrace.cycles-pp.asm_exc_page_fault.do_access
4.94 ± 5% -0.9 4.03 ± 6% perf-profile.calltrace.cycles-pp.exc_page_fault.asm_exc_page_fault.do_access
4.90 ± 5% -0.9 4.02 ± 6% perf-profile.calltrace.cycles-pp.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access
4.62 ± 6% -0.9 3.74 ± 7% perf-profile.calltrace.cycles-pp.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault.do_access
4.46 ± 6% -0.8 3.62 ± 7% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
4.33 ± 7% -0.8 3.54 ± 8% perf-profile.calltrace.cycles-pp.do_swap_page.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault
2.25 ± 11% -0.4 1.86 ± 10% perf-profile.calltrace.cycles-pp.handle_edge_irq.__common_interrupt.common_interrupt.asm_common_interrupt.cpuidle_enter_state
2.26 ± 11% -0.4 1.87 ± 10% perf-profile.calltrace.cycles-pp.__common_interrupt.common_interrupt.asm_common_interrupt.cpuidle_enter_state.cpuidle_enter
2.20 ± 12% -0.4 1.83 ± 10% perf-profile.calltrace.cycles-pp.handle_irq_event.handle_edge_irq.__common_interrupt.common_interrupt.asm_common_interrupt
0.82 ± 13% -0.1 0.69 ± 7% perf-profile.calltrace.cycles-pp.ahci_handle_port_intr.ahci_single_level_irq_intr.__handle_irq_event_percpu.handle_irq_event.handle_edge_irq
4.14 ± 3% +0.2 4.39 perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.rest_init.start_kernel
4.22 ± 3% +0.3 4.49 perf-profile.calltrace.cycles-pp.cpu_startup_entry.rest_init.start_kernel.x86_64_start_reservations.x86_64_start_kernel
4.22 ± 3% +0.3 4.49 perf-profile.calltrace.cycles-pp.rest_init.start_kernel.x86_64_start_reservations.x86_64_start_kernel.common_startup_64
4.22 ± 3% +0.3 4.49 perf-profile.calltrace.cycles-pp.start_kernel.x86_64_start_reservations.x86_64_start_kernel.common_startup_64
4.22 ± 3% +0.3 4.49 perf-profile.calltrace.cycles-pp.x86_64_start_kernel.common_startup_64
4.22 ± 3% +0.3 4.49 perf-profile.calltrace.cycles-pp.x86_64_start_reservations.x86_64_start_kernel.common_startup_64
4.22 ± 3% +0.3 4.49 perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.rest_init.start_kernel.x86_64_start_reservations
3.73 ± 5% +0.3 4.05 ± 2% perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.rest_init
83.48 +1.2 84.68 perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.common_startup_64
83.50 +1.2 84.74 perf-profile.calltrace.cycles-pp.start_secondary.common_startup_64
83.50 +1.2 84.74 perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.common_startup_64
87.73 +1.5 89.22 perf-profile.calltrace.cycles-pp.common_startup_64
7.69 ± 7% -1.4 6.28 ± 6% perf-profile.children.cycles-pp.do_access
6.07 ± 5% -1.0 5.08 ± 3% perf-profile.children.cycles-pp.asm_exc_page_fault
5.40 ± 5% -0.9 4.52 ± 3% perf-profile.children.cycles-pp.exc_page_fault
5.37 ± 5% -0.9 4.50 ± 3% perf-profile.children.cycles-pp.do_user_addr_fault
4.44 ± 7% -0.8 3.60 ± 7% perf-profile.children.cycles-pp.do_swap_page
5.05 ± 6% -0.8 4.22 ± 4% perf-profile.children.cycles-pp.handle_mm_fault
4.88 ± 6% -0.8 4.07 ± 5% perf-profile.children.cycles-pp.__handle_mm_fault
3.56 ± 6% -0.6 2.95 ± 7% perf-profile.children.cycles-pp.common_interrupt
3.56 ± 6% -0.6 2.96 ± 7% perf-profile.children.cycles-pp.asm_common_interrupt
2.48 ± 11% -0.5 1.97 ± 10% perf-profile.children.cycles-pp.handle_edge_irq
2.49 ± 11% -0.5 1.98 ± 10% perf-profile.children.cycles-pp.__common_interrupt
2.42 ± 11% -0.5 1.93 ± 10% perf-profile.children.cycles-pp.handle_irq_event
2.39 ± 12% -0.5 1.91 ± 9% perf-profile.children.cycles-pp.__handle_irq_event_percpu
2.37 ± 13% -0.5 1.90 ± 9% perf-profile.children.cycles-pp.ahci_single_level_irq_intr
1.17 ± 12% -0.3 0.90 ± 11% perf-profile.children.cycles-pp.__schedule
0.74 ± 8% -0.2 0.49 ± 14% perf-profile.children.cycles-pp.__folio_lock_or_retry
0.78 ± 13% -0.2 0.55 ± 12% perf-profile.children.cycles-pp.schedule
0.64 ± 9% -0.2 0.43 ± 15% perf-profile.children.cycles-pp.io_schedule
0.68 ± 7% -0.2 0.47 ± 15% perf-profile.children.cycles-pp.folio_wait_bit_common
0.93 ± 12% -0.2 0.74 ± 5% perf-profile.children.cycles-pp.ahci_handle_port_intr
0.89 ± 9% -0.2 0.74 ± 7% perf-profile.children.cycles-pp.scsi_end_request
0.89 ± 9% -0.1 0.74 ± 8% perf-profile.children.cycles-pp.scsi_io_completion
0.44 ± 21% -0.1 0.30 ± 13% perf-profile.children.cycles-pp.blk_mq_dispatch_rq_list
0.20 ± 20% -0.1 0.09 ± 33% perf-profile.children.cycles-pp.xas_load
0.20 ± 24% -0.1 0.10 ± 29% perf-profile.children.cycles-pp.filemap_get_entry
0.25 ± 14% -0.1 0.16 ± 19% perf-profile.children.cycles-pp.ahci_handle_port_interrupt
0.24 ± 12% -0.1 0.15 ± 17% perf-profile.children.cycles-pp.sata_async_notification
0.23 ± 12% -0.1 0.14 ± 17% perf-profile.children.cycles-pp.ahci_scr_read
0.20 ± 21% -0.1 0.11 ± 25% perf-profile.children.cycles-pp.ata_scsi_queuecmd
0.20 ± 24% -0.1 0.12 ± 23% perf-profile.children.cycles-pp.scsi_dispatch_cmd
0.22 ± 22% -0.1 0.14 ± 27% perf-profile.children.cycles-pp.seq_read_iter
0.21 ± 11% +0.1 0.29 ± 19% perf-profile.children.cycles-pp.__mmput
0.17 ± 15% +0.1 0.26 ± 15% perf-profile.children.cycles-pp.do_pte_missing
0.14 ± 20% +0.1 0.23 ± 20% perf-profile.children.cycles-pp.filemap_map_pages
0.20 ± 13% +0.1 0.29 ± 19% perf-profile.children.cycles-pp.exit_mmap
0.20 ± 18% +0.1 0.29 ± 14% perf-profile.children.cycles-pp.leave_mm
0.15 ± 18% +0.1 0.24 ± 23% perf-profile.children.cycles-pp.do_read_fault
0.24 ± 25% +0.1 0.39 ± 19% perf-profile.children.cycles-pp.rb_next
0.40 ± 35% +0.2 0.57 ± 22% perf-profile.children.cycles-pp.ret_from_fork_asm
0.63 ± 11% +0.2 0.86 ± 12% perf-profile.children.cycles-pp.__hrtimer_next_event_base
4.22 ± 3% +0.3 4.49 perf-profile.children.cycles-pp.rest_init
4.22 ± 3% +0.3 4.49 perf-profile.children.cycles-pp.start_kernel
4.22 ± 3% +0.3 4.49 perf-profile.children.cycles-pp.x86_64_start_kernel
4.22 ± 3% +0.3 4.49 perf-profile.children.cycles-pp.x86_64_start_reservations
83.50 +1.2 84.74 perf-profile.children.cycles-pp.start_secondary
85.97 +1.4 87.32 perf-profile.children.cycles-pp.cpuidle_idle_call
87.73 +1.5 89.22 perf-profile.children.cycles-pp.common_startup_64
87.73 +1.5 89.22 perf-profile.children.cycles-pp.cpu_startup_entry
87.72 +1.5 89.22 perf-profile.children.cycles-pp.do_idle
2.06 ± 6% -0.3 1.71 ± 12% perf-profile.self.cycles-pp.do_access
0.23 ± 12% -0.1 0.14 ± 17% perf-profile.self.cycles-pp.ahci_scr_read
0.12 ± 29% -0.1 0.04 ±101% perf-profile.self.cycles-pp.xas_load
0.22 ± 17% +0.1 0.28 ± 12% perf-profile.self.cycles-pp.update_rq_clock_task
0.24 ± 26% +0.1 0.38 ± 19% perf-profile.self.cycles-pp.rb_next
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki