Message-ID: <202509261113.a87577ce-lkp@intel.com>
Date: Fri, 26 Sep 2025 12:56:49 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Fernand Sieber <sieberf@...zon.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	<aubrey.li@...ux.intel.com>, <yu.c.chen@...el.com>, <peterz@...radead.org>,
	<bsegall@...gle.com>, <dietmar.eggemann@....com>, <dwmw@...zon.co.uk>,
	<graf@...zon.com>, <jschoenh@...zon.de>, <juri.lelli@...hat.com>,
	<mingo@...hat.com>, <sieberf@...zon.com>, <tanghui20@...wei.com>,
	<vincent.guittot@...aro.org>, <vineethr@...ux.ibm.com>,
	<wangtao554@...wei.com>, <zhangqiao22@...wei.com>, <oliver.sang@...el.com>
Subject: Re: [PATCH v3] sched/fair: Forfeit vruntime on yield


Hello,


We previously reported "a 55.9% improvement of stress-ng.wait.ops_per_sec"
in https://lore.kernel.org/all/202509241501.f14b210a-lkp@intel.com/

We have now also noticed a regression in our tests, so we are reporting again FYI.

One thing we want to mention: "stress-ng.sockpair.MB_written_per_sec" appears under
the "miscellaneous metrics" of this stress-ng test. The major metric,
"stress-ng.sockpair.ops_per_sec", shows only a small difference.

0d4eaf8caf8cd633 15bf8c7b35e31295b26241425c0 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    551.38           -90.5%      52.18        stress-ng.sockpair.MB_written_per_sec
    781743            -2.3%     764106        stress-ng.sockpair.ops_per_sec
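
For readers unfamiliar with this stressor: the sockpair workload is essentially a
producer/consumer pair exchanging data over a Unix-domain socketpair, with the
MB-written figure tracking bulk write throughput. Below is a minimal, simplified
sketch of that pattern; it is not the actual stress-ng source, and the block size,
run time and structure are arbitrary illustration choices.

/*
 * Minimal socketpair producer/consumer sketch (illustrative only, not the
 * stress-ng sockpair stressor itself): the parent writes fixed-size blocks
 * into one end of a socketpair while the child drains the other end, and
 * the parent reports an approximate MB/s figure at the end.
 */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/wait.h>
#include <unistd.h>

#define BLOCK_SIZE	4096
#define RUN_SECONDS	10

static double now_sec(void)
{
	struct timeval tv;

	gettimeofday(&tv, NULL);
	return tv.tv_sec + tv.tv_usec / 1e6;
}

int main(void)
{
	int fds[2];
	char buf[BLOCK_SIZE];
	pid_t pid;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) < 0) {
		perror("socketpair");
		return 1;
	}

	pid = fork();
	if (pid < 0) {
		perror("fork");
		return 1;
	}

	if (pid == 0) {
		/* Child: drain the socket until the writer closes its end. */
		close(fds[0]);
		while (read(fds[1], buf, sizeof(buf)) > 0)
			;
		_exit(0);
	}

	/* Parent: write blocks for a fixed wall-clock interval. */
	close(fds[1]);
	memset(buf, 0xa5, sizeof(buf));

	double start = now_sec();
	unsigned long long bytes = 0;

	while (now_sec() - start < RUN_SECONDS) {
		ssize_t n = write(fds[0], buf, sizeof(buf));

		if (n <= 0)
			break;
		bytes += n;
	}
	close(fds[0]);
	waitpid(pid, NULL, 0);

	double elapsed = now_sec() - start;
	printf("wrote %.1f MB/s\n", bytes / elapsed / (1024.0 * 1024.0));
	return 0;
}

Throughput in this pattern depends on how quickly the scheduler runs the peer once
the socket buffer fills or drains, so a change in yield/wakeup scheduling behaviour
can show up strongly in the MB-written figure even when the bogo-ops count barely
moves, as in the numbers above.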


Below is a test example for 15bf8c7b35:

2025-09-25 15:48:21 stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --oom-avoid --sockpair 192
stress-ng: info:  [8371] setting to a 1 min run per stressor
stress-ng: info:  [8371] dispatching hogs: 192 sockpair
stress-ng: info:  [8371] note: /proc/sys/kernel/sched_autogroup_enabled is 1 and this can impact scheduling throughput for processes not attached to a tty. Setting this to 0 may improve performance metrics
stress-ng: metrc: [8371] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s CPU used per       RSS Max
stress-ng: metrc: [8371]                           (secs)    (secs)    (secs)   (real time) (usr+sys time) instance (%)          (KB)
stress-ng: metrc: [8371] sockpair       49874197     65.44     72.08  12219.54    762108.28        4057.58        97.82          3132
stress-ng: metrc: [8371] miscellaneous metrics:
stress-ng: metrc: [8371] sockpair           27717.04 socketpair calls sec (harmonic mean of 192 instances)
stress-ng: metrc: [8371] sockpair              53.01 MB written per sec (harmonic mean of 192 instances)
stress-ng: info:  [8371] for a 66.13s run time:
stress-ng: info:  [8371]   12696.46s available CPU time
stress-ng: info:  [8371]      72.07s user time   (  0.57%)
stress-ng: info:  [8371]   12219.63s system time ( 96.24%)
stress-ng: info:  [8371]   12291.70s total time  ( 96.81%)
stress-ng: info:  [8371] load average: 190.99 57.46 19.94
stress-ng: info:  [8371] skipped: 0
stress-ng: info:  [8371] passed: 192: sockpair (192)
stress-ng: info:  [8371] failed: 0
stress-ng: info:  [8371] metrics untrustworthy: 0
stress-ng: info:  [8371] successful run completed in 1 min, 6.13 secs


Below is an example from 0d4eaf8caf:

2025-09-25 18:04:37 stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --oom-avoid --sockpair 192
stress-ng: info:  [8360] setting to a 1 min run per stressor
stress-ng: info:  [8360] dispatching hogs: 192 sockpair
stress-ng: info:  [8360] note: /proc/sys/kernel/sched_autogroup_enabled is 1 and this can impact scheduling throughput for processes not attached to a tty. Setting this to 0 may improve performance metrics
stress-ng: metrc: [8360] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s CPU used per       RSS Max
stress-ng: metrc: [8360]                           (secs)    (secs)    (secs)   (real time) (usr+sys time) instance (%)          (KB)
stress-ng: metrc: [8360] sockpair       51705787     65.08     56.75  12254.39    794448.25        4199.92        98.52          5160
stress-ng: metrc: [8360] miscellaneous metrics:
stress-ng: metrc: [8360] sockpair           28156.62 socketpair calls sec (harmonic mean of 192 instances)
stress-ng: metrc: [8360] sockpair             562.18 MB written per sec (harmonic mean of 192 instances)
stress-ng: info:  [8360] for a 65.40s run time:
stress-ng: info:  [8360]   12556.08s available CPU time
stress-ng: info:  [8360]      56.75s user time   (  0.45%)
stress-ng: info:  [8360]   12254.48s system time ( 97.60%)
stress-ng: info:  [8360]   12311.23s total time  ( 98.05%)
stress-ng: info:  [8360] load average: 239.81 72.31 25.10
stress-ng: info:  [8360] skipped: 0
stress-ng: info:  [8360] passed: 192: sockpair (192)
stress-ng: info:  [8360] failed: 0
stress-ng: info:  [8360] metrics untrustworthy: 0
stress-ng: info:  [8360] successful run completed in 1 min, 5.40 secs


Below is the full report.


kernel test robot noticed a 90.5% regression of stress-ng.sockpair.MB_written_per_sec on:


commit: 15bf8c7b35e31295b26241425c0a61102e92109f ("[PATCH v3] sched/fair: Forfeit vruntime on yield")
url: https://github.com/intel-lab-lkp/linux/commits/Fernand-Sieber/sched-fair-Forfeit-vruntime-on-yield/20250918-231320
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git 0d4eaf8caf8cd633b23e949e2996b420052c2d45
patch link: https://lore.kernel.org/all/20250918150528.292620-1-sieberf@amazon.com/
patch subject: [PATCH v3] sched/fair: Forfeit vruntime on yield

testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-14
test machine: 192 threads 2 sockets Intel(R) Xeon(R) 6740E CPU @ 2.4GHz (Sierra Forest) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: sockpair
	cpufreq_governor: performance
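
For reference, these parameters appear to correspond to the stress-ng invocation
visible in the run logs above (one stressor instance per hardware thread on the
192-thread machine):

  stress-ng --timeout 60 --times --verify --metrics --no-rand-seed --oom-avoid --sockpair 192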



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202509261113.a87577ce-lkp@intel.com


Details are as follows:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250926/202509261113.a87577ce-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-14/performance/x86_64-rhel-9.4/100%/debian-13-x86_64-20250902.cgz/lkp-srf-2sp3/sockpair/stress-ng/60s

commit: 
  0d4eaf8caf ("sched/fair: Do not balance task to a throttled cfs_rq")
  15bf8c7b35 ("sched/fair: Forfeit vruntime on yield")

0d4eaf8caf8cd633 15bf8c7b35e31295b26241425c0 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.78 ±  2%      +0.2        1.02        mpstat.cpu.all.usr%
     19.57           -36.8%      12.36 ± 70%  turbostat.RAMWatt
 4.073e+08 ±  6%     +23.1%  5.013e+08 ±  5%  cpuidle..time
    266261 ±  9%     +46.4%     389733 ±  9%  cpuidle..usage
    451887 ± 77%    +160.9%    1178929 ± 33%  numa-vmstat.node0.nr_file_pages
    192819 ± 30%    +101.3%     388191 ± 43%  numa-vmstat.node1.nr_shmem
   1807416 ± 77%    +161.0%    4716665 ± 33%  numa-meminfo.node0.FilePages
   8980121            -9.0%    8174177        numa-meminfo.node0.SUnreclaim
  25356157 ±  8%     -22.0%   19772595 ±  9%  numa-meminfo.node1.MemUsed
    771480 ± 30%    +101.4%    1553932 ± 43%  numa-meminfo.node1.Shmem
    551.38           -90.5%      52.18        stress-ng.sockpair.MB_written_per_sec
  51092272            -2.2%   49968621        stress-ng.sockpair.ops
    781743            -2.3%     764106        stress-ng.sockpair.ops_per_sec
  21418332 ±  4%     +69.2%   36232510        stress-ng.time.involuntary_context_switches
     56.36           +27.4%      71.81        stress-ng.time.user_time
    150809 ± 21%  +17217.1%   26115838 ±  3%  stress-ng.time.voluntary_context_switches
   2165914 ±  7%     +92.3%    4165197 ±  4%  meminfo.Active
   2165898 ±  7%     +92.3%    4165181 ±  4%  meminfo.Active(anon)
   4926568           +39.6%    6875228        meminfo.Cached
   6826363           +28.1%    8744371        meminfo.Committed_AS
    513281 ±  8%     +98.7%    1019681 ±  6%  meminfo.Mapped
  48472806 ±  2%     -14.8%   41314088        meminfo.Memused
   1276164          +152.7%    3224818 ±  3%  meminfo.Shmem
  53022761 ±  2%     -15.7%   44672632        meminfo.max_used_kB
      0.53           -81.0%       0.10 ±  4%  perf-sched.sch_delay.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
      0.53           -81.0%       0.10 ±  4%  perf-sched.total_sch_delay.average.ms
      2.03           -68.4%       0.64 ±  4%  perf-sched.total_wait_and_delay.average.ms
   1811449          +200.9%    5449776 ±  4%  perf-sched.total_wait_and_delay.count.ms
      1.50           -64.0%       0.54 ±  4%  perf-sched.total_wait_time.average.ms
      2.03           -68.4%       0.64 ±  4%  perf-sched.wait_and_delay.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
   1811449          +200.9%    5449776 ±  4%  perf-sched.wait_and_delay.count.[unknown].[unknown].[unknown].[unknown].[unknown]
      1.50           -64.0%       0.54 ±  4%  perf-sched.wait_time.avg.ms.[unknown].[unknown].[unknown].[unknown].[unknown]
    541937 ±  7%     +92.5%    1043389 ±  4%  proc-vmstat.nr_active_anon
   5242293            +3.5%    5423918        proc-vmstat.nr_dirty_background_threshold
  10497404            +3.5%   10861099        proc-vmstat.nr_dirty_threshold
   1232280           +39.7%    1721251        proc-vmstat.nr_file_pages
  52782357            +3.4%   54601330        proc-vmstat.nr_free_pages
  52117733            +3.8%   54073313        proc-vmstat.nr_free_pages_blocks
    128259 ±  8%    +100.8%     257594 ±  6%  proc-vmstat.nr_mapped
    319681          +153.0%     808650 ±  3%  proc-vmstat.nr_shmem
   4489133            -8.9%    4089704        proc-vmstat.nr_slab_unreclaimable
    541937 ±  7%     +92.5%    1043389 ±  4%  proc-vmstat.nr_zone_active_anon
  77303955            +2.5%   79201972        proc-vmstat.pgalloc_normal
    519724            +5.2%     546556        proc-vmstat.pgfault
  76456707            +1.7%   77739095        proc-vmstat.pgfree
  12794131 ±  6%     -27.4%    9288185        sched_debug.cfs_rq:/.avg_vruntime.max
   4610143 ±  8%     -14.9%    3923890 ±  5%  sched_debug.cfs_rq:/.avg_vruntime.min
      1.03           -20.1%       0.83 ±  2%  sched_debug.cfs_rq:/.h_nr_queued.avg
      1.03           -20.8%       0.82 ±  2%  sched_debug.cfs_rq:/.h_nr_runnable.avg
    895.00 ± 70%     +89.0%       1691 ±  2%  sched_debug.cfs_rq:/.load.min
      0.67 ± 55%    +125.0%       1.50        sched_debug.cfs_rq:/.load_avg.min
  12794131 ±  6%     -27.4%    9288185        sched_debug.cfs_rq:/.min_vruntime.max
   4610143 ±  8%     -14.9%    3923896 ±  5%  sched_debug.cfs_rq:/.min_vruntime.min
      1103           -20.2%     880.86        sched_debug.cfs_rq:/.runnable_avg.avg
    428.26 ±  6%     -63.4%     156.94 ± 22%  sched_debug.cfs_rq:/.util_est.avg
      1775 ±  6%     -39.3%       1077 ± 15%  sched_debug.cfs_rq:/.util_est.max
    396.33 ±  6%     -50.0%     198.03 ± 17%  sched_debug.cfs_rq:/.util_est.stddev
     50422 ±  6%     -34.7%      32915 ± 18%  sched_debug.cpu.avg_idle.min
    456725 ± 10%     +39.4%     636811 ±  4%  sched_debug.cpu.avg_idle.stddev
    611566 ±  5%     +25.0%     764424 ±  2%  sched_debug.cpu.max_idle_balance_cost.avg
    190657 ± 12%     +36.1%     259410 ±  5%  sched_debug.cpu.max_idle_balance_cost.stddev
      1.04           -20.4%       0.82 ±  2%  sched_debug.cpu.nr_running.avg
     57214 ±  4%    +183.5%     162228 ±  2%  sched_debug.cpu.nr_switches.avg
    253314 ±  4%     +39.3%     352777 ±  4%  sched_debug.cpu.nr_switches.max
     59410 ±  6%     +31.6%      78186 ± 10%  sched_debug.cpu.nr_switches.stddev
      3.33           -27.9%       2.40        perf-stat.i.MPKI
 1.207e+10           +11.3%  1.344e+10        perf-stat.i.branch-instructions
      0.21 ±  7%      +0.0        0.24 ±  5%  perf-stat.i.branch-miss-rate%
  23462655 ±  6%     +27.4%   29896517 ±  3%  perf-stat.i.branch-misses
     75.74            -4.4       71.33        perf-stat.i.cache-miss-rate%
 1.861e+08           -21.5%  1.462e+08        perf-stat.i.cache-misses
 2.435e+08           -17.1%  2.017e+08        perf-stat.i.cache-references
    323065 ±  5%    +191.4%     941425 ±  2%  perf-stat.i.context-switches
     10.73            -9.7%       9.69        perf-stat.i.cpi
    353.45           +39.0%     491.13 ±  4%  perf-stat.i.cpu-migrations
      3589           +30.5%       4685        perf-stat.i.cycles-between-cache-misses
 5.645e+10           +12.0%  6.323e+10        perf-stat.i.instructions
      0.09           +12.1%       0.11        perf-stat.i.ipc
      1.66 ±  5%    +193.9%       4.89 ±  2%  perf-stat.i.metric.K/sec
      6247            +5.7%       6603 ±  2%  perf-stat.i.minor-faults
      6248            +5.7%       6604 ±  2%  perf-stat.i.page-faults
      3.33           -29.7%       2.34        perf-stat.overall.MPKI
      0.20 ±  7%      +0.0        0.23 ±  4%  perf-stat.overall.branch-miss-rate%
     76.67            -3.9       72.79        perf-stat.overall.cache-miss-rate%
     10.54           -11.1%       9.37        perf-stat.overall.cpi
      3168           +26.5%       4007        perf-stat.overall.cycles-between-cache-misses
      0.09           +12.5%       0.11        perf-stat.overall.ipc
 1.204e+10           +11.1%  1.337e+10        perf-stat.ps.branch-instructions
  23586580 ±  7%     +29.7%   30600100 ±  4%  perf-stat.ps.branch-misses
 1.873e+08           -21.4%  1.471e+08        perf-stat.ps.cache-misses
 2.443e+08           -17.3%  2.021e+08        perf-stat.ps.cache-references
    324828 ±  5%    +187.0%     932274 ±  2%  perf-stat.ps.context-switches
    335.13 ±  2%     +41.7%     474.95 ±  5%  perf-stat.ps.cpu-migrations
 5.632e+10           +11.7%  6.293e+10        perf-stat.ps.instructions
      6282            +6.5%       6690 ±  2%  perf-stat.ps.minor-faults
      6284            +6.5%       6692 ±  2%  perf-stat.ps.page-faults
 3.764e+12           +12.2%  4.224e+12        perf-stat.total.instructions



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

