Message-ID: <202505191021.9e9f0ba2-lkp@intel.com>
Date: Mon, 19 May 2025 12:56:22 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Dave Hansen <dave.hansen@...ux.intel.com>, Josh Poimboeuf
	<jpoimboe@...nel.org>, Alexandre Chartre <alexandre.chartre@...cle.com>,
	<linux-doc@...r.kernel.org>, <oliver.sang@...el.com>
Subject: [linus:master] [x86/its]  f4818881c4:  aim7.jobs-per-min 1.2%
 regression



Hello,

kernel test robot noticed a 1.2% regression of aim7.jobs-per-min on:


commit: f4818881c47fd91fcb6d62373c57c7844e3de1c0 ("x86/its: Enable Indirect Target Selection mitigation")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

[still regression on linus/master      fee3e843b309444f48157e2188efa6818bae85cf]
[still regression on linux-next/master 484803582c77061b470ac64a634f25f89715be3f]

testcase: aim7
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
parameters:

	disk: 4BRD_12G
	md: RAID1
	fs: xfs
	test: disk_src
	load: 3000
	cpufreq_governor: performance


In addition, the commit has a significant impact on the following test:

+------------------+--------------------------------------------------------------------------------------------+
| testcase: change | netperf: netperf.Throughput_Mbps  1.8% regression                                          |
| test machine     | 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory |
| test parameters  | cluster=cs-localhost                                                                       |
|                  | cpufreq_governor=performance                                                               |
|                  | ip=ipv4                                                                                    |
|                  | nr_threads=200%                                                                            |
|                  | runtime=300s                                                                               |
|                  | test=UDP_STREAM                                                                            |
+------------------+--------------------------------------------------------------------------------------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), please add the following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202505191021.9e9f0ba2-lkp@intel.com


Details are below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250519/202505191021.9e9f0ba2-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/disk/fs/kconfig/load/md/rootfs/tbox_group/test/testcase:
  gcc-12/performance/4BRD_12G/xfs/x86_64-rhel-9.4/3000/RAID1/debian-12-x86_64-20240206.cgz/lkp-csl-2sp3/disk_src/aim7

commit: 
  a75bf27fe4 ("x86/its: Add support for ITS-safe return thunk")
  f4818881c4 ("x86/its: Enable Indirect Target Selection mitigation")

a75bf27fe41abe65 f4818881c47fd91fcb6d62373c5 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     75730            -1.2%      74795        aim7.jobs-per-min
    494.88            +2.2%     505.76        aim7.time.system_time
    170491            -1.5%     167881        proc-vmstat.nr_shmem
   1436502            +1.8%    1463032        proc-vmstat.pgfree
      0.01 ± 26%     -54.5%       0.01 ± 38%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.getname_flags.part.0
      3221 ± 10%     +24.6%       4014 ± 13%  perf-sched.wait_and_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      9.35 ± 54%     -52.2%       4.47 ± 61%  perf-sched.wait_time.avg.ms.__cond_resched.__kmalloc_noprof.security_inode_init_security.xfs_generic_create.lookup_open
      8.92 ± 36%     -48.6%       4.58 ± 54%  perf-sched.wait_time.avg.ms.schedule_preempt_disabled.rwsem_down_read_slowpath.down_read.xlog_cil_commit
      3221 ± 10%     +24.6%       4014 ± 13%  perf-sched.wait_time.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
 1.542e+09            +4.1%  1.606e+09        perf-stat.i.branch-instructions
      1.87            -0.1        1.77        perf-stat.i.branch-miss-rate%
     31.54            -0.4       31.16        perf-stat.i.cache-miss-rate%
      1.63            +1.3%       1.65        perf-stat.i.cpi
      0.63            -1.4%       0.62        perf-stat.i.ipc
      1.94            -0.1        1.82        perf-stat.overall.branch-miss-rate%
     31.70            -0.3       31.45        perf-stat.overall.cache-miss-rate%
      1.59            +1.1%       1.61        perf-stat.overall.cpi
      0.63            -1.1%       0.62        perf-stat.overall.ipc
 1.536e+09            +4.1%  1.599e+09        perf-stat.ps.branch-instructions
    987.24            -1.0%     977.35        perf-stat.ps.cpu-migrations
      8224 ± 10%     +28.5%      10566        sched_debug.cfs_rq:/.avg_vruntime.min
      8224 ± 10%     +28.5%      10566        sched_debug.cfs_rq:/.min_vruntime.min
    144919 ±  7%     +17.0%     169577        sched_debug.cpu.clock.avg
    144939 ±  7%     +17.0%     169596        sched_debug.cpu.clock.max
    144898 ±  7%     +17.0%     169556        sched_debug.cpu.clock.min
    144346 ±  7%     +17.0%     168892        sched_debug.cpu.clock_task.avg
    144557 ±  7%     +17.0%     169113        sched_debug.cpu.clock_task.max
    136847 ±  7%     +17.8%     161242        sched_debug.cpu.clock_task.min
     13899 ±  6%     +13.9%      15831 ±  5%  sched_debug.cpu.nr_switches.stddev
    144899 ±  7%     +17.0%     169556        sched_debug.cpu_clk
    144339 ±  7%     +17.1%     168996        sched_debug.ktime
    145461 ±  7%     +17.0%     170148        sched_debug.sched_clk
     56.39            -0.8       55.60        perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
     46.66            -0.7       45.97        perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
      5.72            -0.3        5.46 ±  2%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.intel_idle_irq.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
      0.64 ±  3%      -0.0        0.60 ±  2%  perf-profile.calltrace.cycles-pp.xfs_buf_item_release.xlog_cil_commit.__xfs_trans_commit.xfs_trans_commit.xfs_create
      1.10 ±  2%      +0.1        1.17        perf-profile.calltrace.cycles-pp.enqueue_task_fair.enqueue_task.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue
      1.22 ±  3%      +0.1        1.29        perf-profile.calltrace.cycles-pp.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue.flush_smp_call_function_queue.do_idle
      1.18 ±  2%      +0.1        1.26        perf-profile.calltrace.cycles-pp.enqueue_task.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue.flush_smp_call_function_queue
      4.92            -0.2        4.71 ±  2%  perf-profile.children.cycles-pp.intel_idle_irq
     10.59            -0.2       10.41        perf-profile.children.cycles-pp.__xfs_trans_commit
     10.65            -0.2       10.47        perf-profile.children.cycles-pp.xfs_trans_commit
      9.36            -0.1        9.23        perf-profile.children.cycles-pp.xlog_cil_commit
      0.22 ±  7%      -0.0        0.18 ±  6%  perf-profile.children.cycles-pp.xlog_ticket_alloc
      0.15 ±  3%      -0.0        0.13 ±  2%  perf-profile.children.cycles-pp.xfs_buf_rele_cached
      0.24 ±  5%      +0.0        0.28 ±  9%  perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
      1.13            +0.0        1.18 ±  2%  perf-profile.children.cycles-pp.try_to_block_task
      1.40 ±  3%      +0.1        1.48        perf-profile.children.cycles-pp.enqueue_task_fair
      1.66 ±  2%      +0.1        1.75        perf-profile.children.cycles-pp.sched_ttwu_pending
      0.00            +0.7        0.70 ±  2%  perf-profile.children.cycles-pp.its_return_thunk
      4.37            -0.2        4.13 ±  2%  perf-profile.self.cycles-pp.intel_idle_irq
      0.14 ±  5%      -0.0        0.12 ±  6%  perf-profile.self.cycles-pp.xfs_trans_precommit_sort
      0.06 ±  7%      +0.0        0.08 ±  6%  perf-profile.self.cycles-pp.__update_blocked_fair
      0.09 ±  4%      +0.0        0.11 ±  8%  perf-profile.self.cycles-pp.enqueue_task_fair
      0.70 ±  4%      +0.0        0.74 ±  2%  perf-profile.self.cycles-pp.xlog_cil_alloc_shadow_bufs
      0.00            +0.6        0.56 ±  2%  perf-profile.self.cycles-pp.its_return_thunk
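
The %change column above can be sanity-checked by hand. The sketch below is not part of the lkp-tests tooling; the helper names are hypothetical. It assumes that rows whose change ends in "%" report a relative change against the base commit, and that rows for metrics already expressed as a rate (for example branch-miss-rate%) report a plain difference in percentage points.

```python
# Hypothetical helpers for checking the comparison table by hand.
# Assumption: "%"-suffixed changes are relative to the base commit;
# changes without "%" are absolute differences (percentage points).

def percent_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Absolute difference, used for metrics that are already rates."""
    return new - base


if __name__ == "__main__":
    # aim7.jobs-per-min: 75730 on a75bf27fe4, 74795 on f4818881c4
    print(f"{percent_change(75730, 74795):+.1f}%")    # -1.2%
    # aim7.time.system_time: 494.88 vs 505.76
    print(f"{percent_change(494.88, 505.76):+.1f}%")  # +2.2%
    # perf-stat.i.branch-miss-rate%: 1.87 vs 1.77
    print(f"{point_delta(1.87, 1.77):+.1f}")          # -0.1
```

The three printed values match the corresponding rows of the table, which supports the reading of the two kinds of change column stated above.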


***************************************************************************************************
lkp-icl-2sp2: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
  cs-localhost/gcc-12/performance/ipv4/x86_64-rhel-9.4/200%/debian-12-x86_64-20240206.cgz/300s/lkp-icl-2sp2/UDP_STREAM/netperf

commit: 
  a75bf27fe4 ("x86/its: Add support for ITS-safe return thunk")
  f4818881c4 ("x86/its: Enable Indirect Target Selection mitigation")

a75bf27fe41abe65 f4818881c47fd91fcb6d62373c5 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      2058 ±  4%     +11.7%       2298 ±  6%  perf-c2c.HITM.local
   7436735            -3.0%    7213042        vmstat.system.cs
      5.72 ± 49%   +3259.9%     192.19 ±185%  perf-sched.sch_delay.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
      5.72 ± 49%   +3259.9%     192.19 ±185%  perf-sched.wait_time.max.ms.irqentry_exit_to_user_mode.asm_sysvec_reschedule_ipi.[unknown]
 3.223e+09            -1.7%   3.17e+09        proc-vmstat.numa_hit
 3.222e+09            -1.6%  3.169e+09        proc-vmstat.numa_local
 2.574e+10            -1.6%  2.532e+10        proc-vmstat.pgalloc_normal
 2.574e+10            -1.6%  2.532e+10        proc-vmstat.pgfree
     29701            -2.1%      29079        netperf.ThroughputBoth_Mbps
   7574004            -2.0%    7425040        netperf.ThroughputBoth_total_Mbps
      8150            -2.9%       7916        netperf.ThroughputRecv_Mbps
     21551            -1.8%      21162        netperf.Throughput_Mbps
   5495563            -1.7%    5403562        netperf.Throughput_total_Mbps
 1.142e+09            -3.1%  1.107e+09        netperf.time.involuntary_context_switches
 4.336e+09            -2.0%  4.251e+09        netperf.workload
  2.52e+10            +3.4%  2.605e+10        perf-stat.i.branch-instructions
      0.88            -0.0        0.83        perf-stat.i.branch-miss-rate%
 2.196e+08            -2.2%  2.148e+08        perf-stat.i.branch-misses
   7497258            -3.1%    7265561        perf-stat.i.context-switches
     58.57            -3.1%      56.76        perf-stat.i.metric.K/sec
      0.87            -0.0        0.82        perf-stat.overall.branch-miss-rate%
      2.19            +1.1%       2.21        perf-stat.overall.cpi
 2.511e+10            +3.4%  2.596e+10        perf-stat.ps.branch-instructions
 2.189e+08            -2.2%  2.141e+08        perf-stat.ps.branch-misses
   7471654            -3.1%    7240223        perf-stat.ps.context-switches
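
For scripting against reports like this one, each comparison row can be split into its fields: base value, optional base %stddev, change, new value, optional new %stddev, and metric name. The following is a minimal sketch under that assumed layout; the regular expression and the parse_row helper are illustrative and are not taken from lkp-tests.

```python
import re

# Assumed row layout (inferred from the tables in this report):
#   <base> [± N%]  <change>[%]  <new> [± N%]  <metric>
ROW = re.compile(
    r"^\s*(?P<base>\S+)"
    r"(?:\s*±\s*(?P<base_sd>\d+)%)?"
    r"\s+(?P<change>[+-]\d+(?:\.\d+)?%?)"
    r"\s+(?P<new>\S+)"
    r"(?:\s*±\s*(?P<new_sd>\d+)%)?"
    r"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line: str):
    """Return the fields of one comparison row, or None if it does not match."""
    m = ROW.match(line)
    if m is None:
        return None
    return {
        "base": float(m["base"]),
        "base_stddev_pct": int(m["base_sd"]) if m["base_sd"] else None,
        "change": m["change"],  # kept as text: "-1.8%" or "-0.1"
        "new": float(m["new"]),
        "new_stddev_pct": int(m["new_sd"]) if m["new_sd"] else None,
        "metric": m["metric"],
    }


if __name__ == "__main__":
    row = parse_row("     21551            -1.8%      21162        netperf.Throughput_Mbps")
    print(row["metric"], row["base"], row["change"], row["new"])
```

Header and separator lines do not match the pattern, so parse_row returns None for them and they can be skipped by the caller.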





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

