Message-ID: <202508251419.d8117ed7-lkp@intel.com>
Date: Mon, 25 Aug 2025 15:11:32 +0800
From: kernel test robot <oliver.sang@...el.com>
To: "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Christian Loehle <christian.loehle@....com>, <linux-pm@...r.kernel.org>,
	<oliver.sang@...el.com>
Subject: [linus:master] [cpuidle]  779b1a1cb1:
 perf-bench-sched-pipe.ops_per_sec 12.9% regression



Hello,

kernel test robot noticed a 12.9% regression of perf-bench-sched-pipe.ops_per_sec on:


commit: 779b1a1cb13ae17028aeddb2fbbdba97357a1e15 ("cpuidle: governors: menu: Avoid selecting states with too much latency")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[still regression on      linus/master 1b237f190eb3d36f52dffe07a40b5eb210280e00]
[still regression on linux-next/master 0f4c93f7eb861acab537dbe94441817a270537bf]

testcase: perf-bench-sched-pipe
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 192 threads 4 sockets Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz (Cascade Lake) with 176G memory
parameters:

	loops: 10000000ops
	mode: threads
	cpufreq_governor: performance


In addition, the commit also has a significant impact on the following tests:

+------------------+-----------------------------------------------------------------------------+
| testcase: change | qperf: qperf.udp.latency 8.1% regression                                    |
| test machine     | 8 threads Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz (Skylake) with 16G memory |
| test parameters  | cluster=cs-localhost                                                        |
|                  | cpufreq_governor=performance                                                |
|                  | runtime=600s                                                                |
+------------------+-----------------------------------------------------------------------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202508251419.d8117ed7-lkp@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250825/202508251419.d8117ed7-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/loops/mode/rootfs/tbox_group/testcase:
  gcc-12/performance/x86_64-rhel-9.4/10000000ops/threads/debian-12-x86_64-20240206.cgz/lkp-csl-2sp10/perf-bench-sched-pipe

commit: 
  v6.17-rc2
  779b1a1cb1 ("cpuidle: governors: menu: Avoid selecting states with too much latency")

       v6.17-rc2 779b1a1cb13ae17028aeddb2fbb 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 9.124e+09           +14.6%  1.045e+10        cpuidle..time
     32.00 ± 23%     +37.5%      44.00 ± 10%  perf-c2c.DRAM.remote
      0.01 ± 22%     +55.3%       0.01 ± 18%  perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
    805462           -12.0%     708615        vmstat.system.cs
     47.14 ±  4%      +8.9%      51.35 ±  5%  boot-time.boot
      8181 ±  5%      +9.4%       8955 ±  5%  boot-time.idle
     43305 ±  9%     +17.8%      51027 ±  6%  meminfo.AnonHugePages
    284445           +20.6%     343101        meminfo.Shmem
      0.50            -0.1        0.43        mpstat.cpu.all.sys%
      0.45 ±  2%      -0.1        0.40 ±  2%  mpstat.cpu.all.usr%
     70114 ± 40%     -74.6%      17817 ±146%  numa-vmstat.node3.nr_anon_pages
     68448 ±  3%     +22.0%      83539        numa-vmstat.node3.nr_shmem
     96.76 ±  2%     +11.5%     107.92 ±  2%  uptime.boot
     17588 ±  2%     +12.0%      19695 ±  3%  uptime.idle
    280480 ± 40%     -74.6%      71234 ±146%  numa-meminfo.node3.AnonPages
    323960 ± 34%     -69.7%      98297 ±102%  numa-meminfo.node3.AnonPages.max
     49659 ± 55%     -35.6%      32001 ±  3%  numa-meminfo.node3.Mapped
    273725 ±  3%     +21.4%     332416        numa-meminfo.node3.Shmem
    217049           -12.9%     189123        perf-bench-sched-pipe.ops_per_sec
     46.09           +14.8%      52.90        perf-bench-sched-pipe.time.elapsed_time
     46.09           +14.8%      52.90        perf-bench-sched-pipe.time.elapsed_time.max
     91.83           -12.2%      80.67        perf-bench-sched-pipe.time.percent_of_cpu_this_job_got
    243570            +5.6%     257111        proc-vmstat.nr_active_anon
    956368            +1.5%     970579        proc-vmstat.nr_file_pages
     28172            -5.2%      26711        proc-vmstat.nr_mapped
     71358           +19.9%      85579        proc-vmstat.nr_shmem
    243570            +5.6%     257111        proc-vmstat.nr_zone_active_anon
    454916            +4.5%     475605        proc-vmstat.pgfault
     49587 ±  4%      +8.5%      53826 ±  4%  sched_debug.cpu.clock.avg
     49598 ±  4%      +8.5%      53836 ±  4%  sched_debug.cpu.clock.max
     49568 ±  4%      +8.6%      53812 ±  4%  sched_debug.cpu.clock.min
     49456 ±  4%      +8.6%      53686 ±  4%  sched_debug.cpu.clock_task.avg
     49580 ±  4%      +8.5%      53811 ±  4%  sched_debug.cpu.clock_task.max
     49577 ±  4%      +8.5%      53816 ±  4%  sched_debug.cpu_clk
     48580 ±  4%      +8.7%      52811 ±  4%  sched_debug.ktime
     50297 ±  4%      +9.3%      54960 ±  5%  sched_debug.sched_clk
 2.211e+09           -15.9%  1.859e+09 ±  3%  perf-stat.i.branch-instructions
  60344331 ±  2%     -13.6%   52146391 ±  2%  perf-stat.i.branch-misses
   4974491 ±  5%     -15.9%    4185197 ±  7%  perf-stat.i.cache-misses
  80956268 ±  3%     -12.0%   71234171 ±  7%  perf-stat.i.cache-references
    854948           -12.6%     747062        perf-stat.i.context-switches
      1.19            +3.9%       1.24 ±  2%  perf-stat.i.cpi
 1.148e+10 ±  2%     -13.9%  9.886e+09        perf-stat.i.cpu-cycles
 1.031e+10           -14.7%   8.79e+09 ±  3%  perf-stat.i.instructions
      4.45           -12.7%       3.89        perf-stat.i.metric.K/sec
      7605 ±  2%      -7.4%       7042 ±  2%  perf-stat.i.minor-faults
      7605 ±  2%      -7.4%       7042 ±  2%  perf-stat.i.page-faults
      2.73            +0.1        2.80        perf-stat.overall.branch-miss-rate%
 2.162e+09           -15.7%  1.823e+09 ±  3%  perf-stat.ps.branch-instructions
  58974968 ±  2%     -13.3%   51107382 ±  2%  perf-stat.ps.branch-misses
   4857873 ±  5%     -15.6%    4100205 ±  7%  perf-stat.ps.cache-misses
  79195784 ±  3%     -11.8%   69883377 ±  7%  perf-stat.ps.cache-references
    836709           -12.4%     733164        perf-stat.ps.context-switches
 1.123e+10 ±  2%     -13.7%  9.698e+09        perf-stat.ps.cpu-cycles
 1.008e+10           -14.5%   8.62e+09 ±  3%  perf-stat.ps.instructions
      7359 ±  2%      -7.1%       6839 ±  2%  perf-stat.ps.minor-faults
      7359 ±  2%      -7.1%       6839 ±  2%  perf-stat.ps.page-faults
      3.97 ±107%      -3.2        0.75 ±223%  perf-profile.calltrace.cycles-pp.console_flush_all.console_unlock.vprintk_emit.devkmsg_emit.devkmsg_write
      3.97 ±107%      -3.2        0.75 ±223%  perf-profile.calltrace.cycles-pp.console_unlock.vprintk_emit.devkmsg_emit.devkmsg_write.vfs_write
      3.97 ±107%      -3.2        0.75 ±223%  perf-profile.calltrace.cycles-pp.devkmsg_emit.devkmsg_write.vfs_write.ksys_write.do_syscall_64
      3.97 ±107%      -3.2        0.75 ±223%  perf-profile.calltrace.cycles-pp.devkmsg_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
      3.97 ±107%      -3.2        0.75 ±223%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
      3.97 ±107%      -3.2        0.75 ±223%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
      3.97 ±107%      -3.2        0.75 ±223%  perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
      3.97 ±107%      -3.2        0.75 ±223%  perf-profile.calltrace.cycles-pp.vprintk_emit.devkmsg_emit.devkmsg_write.vfs_write.ksys_write
      3.97 ±107%      -3.2        0.75 ±223%  perf-profile.calltrace.cycles-pp.write
      3.61 ±106%      -3.0        0.62 ±223%  perf-profile.calltrace.cycles-pp.serial8250_console_write.console_flush_all.console_unlock.vprintk_emit.devkmsg_emit
      3.45 ±105%      -2.9        0.56 ±223%  perf-profile.calltrace.cycles-pp.wait_for_lsr.serial8250_console_write.console_flush_all.console_unlock.vprintk_emit
      1.80 ± 53%      +1.0        2.84 ± 44%  perf-profile.calltrace.cycles-pp.__x64_sys_exit_group.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.80 ± 53%      +1.0        2.84 ± 44%  perf-profile.calltrace.cycles-pp.do_exit.do_group_exit.__x64_sys_exit_group.x64_sys_call.do_syscall_64
      1.80 ± 53%      +1.0        2.84 ± 44%  perf-profile.calltrace.cycles-pp.do_group_exit.__x64_sys_exit_group.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.80 ± 53%      +1.0        2.84 ± 44%  perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
      3.97 ±107%      -3.2        0.75 ±223%  perf-profile.children.cycles-pp.console_flush_all
      3.97 ±107%      -3.2        0.75 ±223%  perf-profile.children.cycles-pp.console_unlock
      3.97 ±107%      -3.2        0.75 ±223%  perf-profile.children.cycles-pp.devkmsg_emit
      3.97 ±107%      -3.2        0.75 ±223%  perf-profile.children.cycles-pp.devkmsg_write
      3.97 ±107%      -3.2        0.75 ±223%  perf-profile.children.cycles-pp.vprintk_emit
      3.61 ±106%      -3.0        0.62 ±223%  perf-profile.children.cycles-pp.serial8250_console_write
      3.45 ±105%      -2.9        0.56 ±223%  perf-profile.children.cycles-pp.wait_for_lsr
      1.80 ± 53%      +1.0        2.84 ± 44%  perf-profile.children.cycles-pp.__x64_sys_exit_group
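
As a quick check on how to read the %change column, the figures above are consistent with a plain relative change, (new - base) / base. Below is a minimal Python sketch using values copied from the table; the helper name `pct_change` is illustrative and is not part of lkp-tests.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# (metric, v6.17-rc2 value, 779b1a1cb1 value), copied from the table above
rows = [
    ("perf-bench-sched-pipe.ops_per_sec", 217049, 189123),
    ("perf-bench-sched-pipe.time.elapsed_time", 46.09, 52.90),
    ("vmstat.system.cs", 805462, 708615),
]

for metric, base, new in rows:
    # Sign convention: negative means the value went down after the commit.
    print(f"{pct_change(base, new):+6.1f}%  {metric}")
```

Rounded to one decimal, this reproduces the -12.9%, +14.8% and -12.0% entries reported for those three rows. Whether a given sign is a regression depends on the metric: lower ops_per_sec is worse, while lower elapsed_time would be better.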


***************************************************************************************************
lkp-skl-d07: 8 threads Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz (Skylake) with 16G memory
=========================================================================================
cluster/compiler/cpufreq_governor/kconfig/rootfs/runtime/tbox_group/testcase:
  cs-localhost/gcc-12/performance/x86_64-rhel-9.4/debian-12-x86_64-20240206.cgz/600s/lkp-skl-d07/qperf

commit: 
  v6.17-rc2
  779b1a1cb1 ("cpuidle: governors: menu: Avoid selecting states with too much latency")

       v6.17-rc2 779b1a1cb13ae17028aeddb2fbb 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    861.00 ± 26%     -29.1%     610.40 ± 19%  perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    280319            -4.7%     267006        vmstat.system.cs
      4510            +4.3%       4706        qperf.tcp.latency
     13138 ± 17%     -28.1%       9451 ± 17%  qperf.time.involuntary_context_switches
  34417677            -4.8%   32782157        qperf.time.voluntary_context_switches
      3362            +8.1%       3634        qperf.udp.latency
 8.011e+08            -4.6%  7.642e+08        perf-stat.i.branch-instructions
  12882352            -3.6%   12424168        perf-stat.i.branch-misses
    281595            -4.7%     268383        perf-stat.i.context-switches
      1.54            -5.3%       1.46        perf-stat.i.cpi
 5.769e+09           -10.5%  5.161e+09        perf-stat.i.cpu-cycles
     44.97 ±  2%      -8.6%      41.11 ±  3%  perf-stat.i.cpu-migrations
     16208 ±  2%     -13.4%      14038 ±  3%  perf-stat.i.cycles-between-cache-misses
 4.094e+09            -4.1%  3.927e+09        perf-stat.i.instructions
      0.68            +7.6%       0.73        perf-stat.i.ipc
     35.20            -4.7%      33.55        perf-stat.i.metric.K/sec
      1.41           -16.0%       1.18 ± 33%  perf-stat.overall.cpi
 7.999e+08           -13.9%  6.888e+08 ± 33%  perf-stat.ps.branch-instructions
  12865498           -13.0%   11190853 ± 33%  perf-stat.ps.branch-misses
    281062           -14.2%     241224 ± 33%  perf-stat.ps.context-switches
  5.76e+09           -19.2%  4.652e+09 ± 33%  perf-stat.ps.cpu-cycles
     44.89 ±  2%     -17.5%      37.01 ± 33%  perf-stat.ps.cpu-migrations
 4.088e+09           -13.4%  3.539e+09 ± 33%  perf-stat.ps.instructions
 2.458e+12           -13.4%  2.128e+12 ± 33%  perf-stat.total.instructions
     24.74 ±  5%     -24.7        0.00        perf-profile.calltrace.cycles-pp.poll_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      5.80 ±  4%      -0.5        5.30 ±  3%  perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp
      1.01 ± 11%      +0.3        1.27 ± 12%  perf-profile.calltrace.cycles-pp.entry_SYSRETQ_unsafe_stack.read
      0.22 ±122%      +0.4        0.66 ±  8%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      0.16 ±152%      +0.5        0.62 ±  8%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
      6.05 ±  3%      +0.6        6.64 ±  4%  perf-profile.calltrace.cycles-pp.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
      5.92 ±  3%      +0.6        6.51 ±  4%  perf-profile.calltrace.cycles-pp.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
      6.73 ±  3%      +0.8        7.51 ±  4%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.read
      6.69 ±  3%      +0.8        7.48 ±  4%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.read
     22.08 ±  5%     +19.4       41.52 ±  7%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
     24.80 ±  6%     -24.8        0.00        perf-profile.children.cycles-pp.poll_idle
      0.38 ±  9%      -0.2        0.19 ± 19%  perf-profile.children.cycles-pp.local_clock_noinstr
      0.93 ±  6%      -0.2        0.76 ±  5%  perf-profile.children.cycles-pp.native_sched_clock
      0.35 ± 10%      -0.1        0.28 ±  7%  perf-profile.children.cycles-pp.cpuidle_governor_latency_req
      0.14 ± 11%      -0.0        0.09 ± 14%  perf-profile.children.cycles-pp.call_function_single_prep_ipi
      0.22 ± 12%      -0.0        0.17 ± 12%  perf-profile.children.cycles-pp.get_cpu_device
      0.19 ±  8%      +0.1        0.25 ±  9%  perf-profile.children.cycles-pp.rseq_update_cpu_node_id
      0.48 ±  6%      +0.1        0.55 ±  4%  perf-profile.children.cycles-pp.hrtimer_interrupt
      0.49 ±  7%      +0.1        0.56 ±  4%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      0.16 ± 15%      +0.1        0.24 ±  9%  perf-profile.children.cycles-pp.run_client_udp_lat
      0.28 ± 12%      +0.1        0.36 ±  9%  perf-profile.children.cycles-pp.__softirqentry_text_end
      0.00            +0.1        0.09 ± 23%  perf-profile.children.cycles-pp.write@plt
      0.25 ± 13%      +0.1        0.34 ± 13%  perf-profile.children.cycles-pp.__netif_rx
      0.46 ±  7%      +0.1        0.56 ±  8%  perf-profile.children.cycles-pp.check_heap_object
      0.24 ± 14%      +0.1        0.35 ± 12%  perf-profile.children.cycles-pp.move_addr_to_user
      0.09 ± 14%      +0.1        0.19 ± 18%  perf-profile.children.cycles-pp.siphash_3u32
      0.19 ± 19%      +0.1        0.32 ± 11%  perf-profile.children.cycles-pp.__ip_select_ident
      0.59 ±  7%      +0.2        0.76 ±  7%  perf-profile.children.cycles-pp.__rseq_handle_notify_resume
      0.87 ±  9%      +0.2        1.04 ±  6%  perf-profile.children.cycles-pp.__check_object_size
      0.59 ± 11%      +0.2        0.78 ±  9%  perf-profile.children.cycles-pp.__ip_make_skb
      0.88 ±  9%      +0.2        1.09 ±  9%  perf-profile.children.cycles-pp.exit_to_user_mode_loop
      3.18 ±  3%      +0.3        3.43 ±  5%  perf-profile.children.cycles-pp.schedule_idle
      3.59 ±  6%      +0.5        4.09 ±  7%  perf-profile.children.cycles-pp.__flush_smp_call_function_queue
      3.83 ±  6%      +0.5        4.37 ±  6%  perf-profile.children.cycles-pp.flush_smp_call_function_queue
      5.92 ±  3%      +0.6        6.51 ±  4%  perf-profile.children.cycles-pp.vfs_read
      6.05 ±  3%      +0.6        6.65 ±  4%  perf-profile.children.cycles-pp.ksys_read
     22.08 ±  5%     +19.5       41.53 ±  7%  perf-profile.children.cycles-pp.intel_idle
     24.57 ±  6%     -24.6        0.00        perf-profile.self.cycles-pp.poll_idle
      0.90 ±  7%      -0.2        0.73 ±  5%  perf-profile.self.cycles-pp.native_sched_clock
      0.13 ± 11%      -0.0        0.08 ± 18%  perf-profile.self.cycles-pp.call_function_single_prep_ipi
      0.21 ± 12%      -0.0        0.17 ± 11%  perf-profile.self.cycles-pp.get_cpu_device
      0.08 ± 25%      +0.0        0.12 ± 10%  perf-profile.self.cycles-pp.write
      0.19 ±  7%      +0.1        0.25 ±  9%  perf-profile.self.cycles-pp.rseq_update_cpu_node_id
      0.11 ± 18%      +0.1        0.18 ± 10%  perf-profile.self.cycles-pp.run_client_udp_lat
      0.11 ± 12%      +0.1        0.19 ± 13%  perf-profile.self.cycles-pp.recvfrom
      0.12 ± 17%      +0.1        0.20 ± 14%  perf-profile.self.cycles-pp.run_server_udp_lat
      0.21 ± 15%      +0.1        0.29 ± 17%  perf-profile.self.cycles-pp.enqueue_entity
      0.00            +0.1        0.08 ± 22%  perf-profile.self.cycles-pp.write@plt
      0.25 ±  9%      +0.1        0.33 ±  7%  perf-profile.self.cycles-pp.__softirqentry_text_end
      0.10 ± 18%      +0.1        0.19 ± 11%  perf-profile.self.cycles-pp.read
      0.09 ± 13%      +0.1        0.19 ± 19%  perf-profile.self.cycles-pp.siphash_3u32
      0.22 ± 15%      +0.1        0.34 ± 12%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.47 ± 13%      +0.1        0.60 ± 11%  perf-profile.self.cycles-pp.do_syscall_64
      0.29 ± 14%      +0.2        0.45 ± 12%  perf-profile.self.cycles-pp.schedule_timeout
      0.31 ± 10%      +0.3        0.61 ± 12%  perf-profile.self.cycles-pp.__skb_recv_udp
     22.08 ±  5%     +19.4       41.52 ±  7%  perf-profile.self.cycles-pp.intel_idle
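
The comparison rows mix two kinds of change: entries with a trailing % are relative changes, while the perf-profile entries without it appear to be absolute differences in percentage points (24.74 to 0.00 is shown as -24.7). Below is a minimal Python sketch of a parser that keeps the two apart. The regular expression and the names `Row` and `parse_row` are assumptions made for illustration and are not taken from lkp-tests.

```python
import re
from typing import NamedTuple, Optional


class Row(NamedTuple):
    metric: str
    base: float
    new: float
    change: float
    relative: bool  # True: change is in percent; False: percentage points


# base [± sd%]  change[%]  new [± sd%]  metric
ROW_RE = re.compile(
    r"^\s*(?P<base>[0-9.e+]+)(?:\s*±\s*\d+%)?"
    r"\s+(?P<change>[+-][0-9.]+)(?P<pct>%?)"
    r"\s+(?P<new>[0-9.e+]+)(?:\s*±\s*\d+%)?"
    r"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line: str) -> Optional[Row]:
    """Parse one comparison row; return None for headers and separators."""
    m = ROW_RE.match(line)
    if m is None:
        return None
    return Row(m["metric"], float(m["base"]), float(m["new"]),
               float(m["change"]), m["pct"] == "%")


# Three rows copied from the qperf table above
lines = [
    "       3362            +8.1%       3634        qperf.udp.latency",
    "      24.57 ±  6%     -24.6        0.00        perf-profile.self.cycles-pp.poll_idle",
    "      22.08 ±  5%     +19.4       41.52 ±  7%  perf-profile.self.cycles-pp.intel_idle",
]

for row in filter(None, map(parse_row, lines)):
    unit = "%" if row.relative else " pp"
    print(f"{row.change:+.1f}{unit}  {row.metric}")
```

Read this way, the cycles previously attributed to poll_idle drop to zero while intel_idle gains roughly 19.4 points. That would be consistent with the governor no longer selecting the polling state on this machine, but this reading is an inference from the profile rows, not something the report itself states.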





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

