Date:   Mon, 24 Apr 2023 14:59:47 +0800
From:   kernel test robot <yujie.liu@...el.com>
To:     Chen Yu <yu.c.chen@...el.com>
CC:     Peter Zijlstra <peterz@...radead.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Ingo Molnar <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Tim Chen <tim.c.chen@...el.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        "Steven Rostedt" <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>,
        "K Prateek Nayak" <kprateek.nayak@....com>,
        Abel Wu <wuyun.abel@...edance.com>,
        "Yicong Yang" <yangyicong@...ilicon.com>,
        "Gautham R . Shenoy" <gautham.shenoy@....com>,
        Honglei Wang <wanghonglei@...ichuxing.com>,
        "Len Brown" <len.brown@...el.com>,
        Chen Yu <yu.chen.surf@...il.com>,
        Tianchen Ding <dtcccc@...ux.alibaba.com>,
        Joel Fernandes <joel@...lfernandes.org>,
        Josh Don <joshdon@...gle.com>, Hillf Danton <hdanton@...a.com>,
        kernel test robot <yujie.liu@...el.com>,
        Arjan Van De Ven <arjan.van.de.ven@...el.com>,
        "Aaron Lu" <aaron.lu@...el.com>, <linux-kernel@...r.kernel.org>,
        Chen Yu <yu.c.chen@...el.com>
Subject: Re: [PATCH v7 0/2] sched/fair: Introduce SIS_CURRENT to wake up
 short task on current CPU

Hello,

kernel test robot noticed a 2250.1% improvement in stress-ng.switch.ops_per_sec on:

patch: "sched/fair: Introduce SIS_CURRENT to wake up short task on current CPU"

testcase: stress-ng
test machine: 224 threads 2 sockets (Sapphire Rapids) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	sc_pid_max: 4194304
	class: scheduler
	test: switch
	cpufreq_governor: performance


Details are below:

=========================================================================================
class/compiler/cpufreq_governor/kconfig/nr_threads/rootfs/sc_pid_max/tbox_group/test/testcase/testtime:
  scheduler/gcc-11/performance/x86_64-rhel-8.3/100%/debian-11.1-x86_64-20220510.cgz/4194304/lkp-spr-r02/switch/stress-ng/60s
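
The two lines above pair a slash-separated list of parameter names with a
slash-separated list of values. As a reading aid, the pairing can be sketched
as follows. The helper name parse_params is hypothetical and is not part of
lkp-tests; the sketch also assumes that no individual value contains a "/".

```python
# Key and value lines copied verbatim from the report header.
KEY_LINE = ("class/compiler/cpufreq_governor/kconfig/nr_threads/rootfs/"
            "sc_pid_max/tbox_group/test/testcase/testtime:")
VALUE_LINE = ("scheduler/gcc-11/performance/x86_64-rhel-8.3/100%/"
              "debian-11.1-x86_64-20220510.cgz/4194304/lkp-spr-r02/"
              "switch/stress-ng/60s")


def parse_params(key_line: str, value_line: str) -> dict:
    """Pair each parameter name with its value (hypothetical helper).

    Assumes values never contain '/', which holds for this report.
    """
    keys = key_line.strip().rstrip(":").split("/")
    values = value_line.strip().split("/")
    if len(keys) != len(values):
        raise ValueError("key and value counts differ")
    return dict(zip(keys, values))


params = parse_params(KEY_LINE, VALUE_LINE)
for name, value in params.items():
    print(f"{name:>18}: {value}")
```

Read this way, the run used the "switch" stressor of stress-ng on the
lkp-spr-r02 test box, built with gcc-11 and the x86_64-rhel-8.3 kconfig.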

commit: 
  dac54350b7 ("sched/fair: Record the average duration of a task")
  f153e964b7 ("sched/fair: Introduce SIS_CURRENT to wake up short task on current CPU")

dac54350b7363c69 f153e964b7d24fd0375f0efad66 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  1.49e+08 ±  6%   +2250.2%  3.503e+09        stress-ng.switch.ops
   2483795 ±  6%   +2250.1%   58370891        stress-ng.switch.ops_per_sec
     69985 ±  4%  +4.4e+06%  3.088e+09        stress-ng.time.involuntary_context_switches
     31001            -5.1%      29406        stress-ng.time.minor_page_faults
     12849           +60.9%      20680        stress-ng.time.percent_of_cpu_this_job_got
      7325           +47.5%      10802        stress-ng.time.system_time
    672.80          +209.1%       2079        stress-ng.time.user_time
 2.836e+08 ±  6%   +1273.7%  3.895e+09        stress-ng.time.voluntary_context_switches
     14182           -14.6%      12109        uptime.idle
  2.93e+09           -64.8%   1.03e+09        cpuidle..time
 1.946e+08 ±  5%     -82.0%   34997812 ± 41%  cpuidle..usage
     52479 ± 14%     +90.9%     100199 ±  2%  meminfo.Active
     52368 ± 14%     +91.1%     100095 ±  2%  meminfo.Active(anon)
     51616 ± 15%     +90.0%      98071 ±  2%  numa-meminfo.node1.Active
     51523 ± 15%     +90.3%      98037 ±  2%  numa-meminfo.node1.Active(anon)
     12896 ± 15%     +90.7%      24589 ±  2%  numa-vmstat.node1.nr_active_anon
     12896 ± 15%     +90.7%      24589 ±  2%  numa-vmstat.node1.nr_zone_active_anon
     29.58           -21.7        7.89 ±  2%  mpstat.cpu.all.idle%
      6.86            -5.6        1.28 ±  3%  mpstat.cpu.all.irq%
      0.41 ±  6%      -0.4        0.05 ±  5%  mpstat.cpu.all.soft%
     57.53           +17.9       75.48        mpstat.cpu.all.sys%
      5.62            +9.7       15.30        mpstat.cpu.all.usr%
     31.00           -68.8%       9.67 ±  4%  vmstat.cpu.id
      5.00          +183.3%      14.17 ±  2%  vmstat.cpu.us
    185.00           +59.0%     294.17        vmstat.procs.r
   7520613 ±  6%   +1324.1%  1.071e+08        vmstat.system.cs
    819272 ±  4%     -35.1%     531594 ± 20%  vmstat.system.in
     13105 ± 14%     +91.1%      25040 ±  2%  proc-vmstat.nr_active_anon
    135920            -6.0%     127705        proc-vmstat.nr_inactive_anon
     57438 ±  3%      +6.0%      60898        proc-vmstat.nr_shmem
     13105 ± 14%     +91.1%      25040 ±  2%  proc-vmstat.nr_zone_active_anon
    135920            -6.0%     127705        proc-vmstat.nr_zone_inactive_anon
    905140            +1.4%     917932        proc-vmstat.numa_hit
    702189            +1.8%     714949        proc-vmstat.numa_local
      2252 ± 13%     +91.7%       4317 ±  2%  proc-vmstat.pgactivate
    987449            +1.2%     998848        proc-vmstat.pgalloc_normal
      2327           +16.0%       2700        turbostat.Avg_MHz
     81.46           +12.1       93.60        turbostat.Busy%
  23636974 ± 25%     -87.4%    2970987 ± 18%  turbostat.C1
      1.60 ± 20%      -1.4        0.15 ± 14%  turbostat.C1%
 1.678e+08 ±  3%     -93.0%   11676240 ±  5%  turbostat.C1E
     15.38 ±  2%     -12.1        3.25 ±  2%  turbostat.C1E%
     18.52           -65.5%       6.38 ±  2%  turbostat.CPU%c1
      0.06 ±  7%    +773.7%       0.55        turbostat.IPC
  53133659 ±  4%     -34.6%   34758470 ± 21%  turbostat.IRQ
   2707361 ±  2%    +633.0%   19845296 ± 69%  turbostat.POLL
    551.77           +21.5%     670.20        turbostat.PkgWatt
     17.13            +8.1%      18.52        turbostat.RAMWatt
    606383 ± 18%     -97.7%      13786 ± 44%  sched_debug.cfs_rq:/.MIN_vruntime.avg
   1289912 ±  8%     -84.0%     205876 ± 44%  sched_debug.cfs_rq:/.MIN_vruntime.stddev
      0.55 ±  4%     +57.7%       0.87        sched_debug.cfs_rq:/.h_nr_running.avg
      3.83 ± 22%     -47.8%       2.00 ± 14%  sched_debug.cfs_rq:/.h_nr_running.max
      0.60 ±  8%     -41.3%       0.36 ±  4%  sched_debug.cfs_rq:/.h_nr_running.stddev
    606383 ± 18%     -97.7%      13786 ± 44%  sched_debug.cfs_rq:/.max_vruntime.avg
   1289912 ±  8%     -84.0%     205876 ± 44%  sched_debug.cfs_rq:/.max_vruntime.stddev
   3441744           +73.7%    5979755        sched_debug.cfs_rq:/.min_vruntime.avg
   3673914           +70.4%    6258914        sched_debug.cfs_rq:/.min_vruntime.max
   2245166           +65.6%    3719103        sched_debug.cfs_rq:/.min_vruntime.min
     88019 ±  2%    +106.3%     181555 ±  7%  sched_debug.cfs_rq:/.min_vruntime.stddev
      0.38 ±  4%     +40.3%       0.53        sched_debug.cfs_rq:/.nr_running.avg
      0.34 ±  3%     -61.9%       0.13 ± 26%  sched_debug.cfs_rq:/.nr_running.stddev
    622.22 ±  2%     +33.3%     829.67        sched_debug.cfs_rq:/.runnable_avg.avg
      2083 ± 13%     -10.9%       1855 ±  5%  sched_debug.cfs_rq:/.runnable_avg.max
    320.17 ±  5%     -45.6%     174.13 ± 10%  sched_debug.cfs_rq:/.runnable_avg.stddev
  -1218600           +84.3%   -2246378        sched_debug.cfs_rq:/.spread0.min
     87973 ±  2%    +107.2%     182295 ±  7%  sched_debug.cfs_rq:/.spread0.stddev
    400.48 ±  2%     +45.1%     581.11        sched_debug.cfs_rq:/.util_avg.avg
    188.48 ±  6%     -20.9%     149.09 ±  9%  sched_debug.cfs_rq:/.util_avg.stddev
     75.30 ±  8%    +411.6%     385.20 ±  2%  sched_debug.cfs_rq:/.util_est_enqueued.avg
    106.05 ±  7%     +53.1%     162.35 ±  6%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
    648805 ±  6%     +55.8%    1010843 ± 13%  sched_debug.cpu.avg_idle.max
     86024 ±  6%     +44.6%     124370 ± 11%  sched_debug.cpu.avg_idle.stddev
      2332 ±  7%     +81.6%       4235        sched_debug.cpu.curr->pid.avg
      2566 ±  2%     -77.1%     587.02 ±  7%  sched_debug.cpu.curr->pid.stddev
      0.47 ±  6%     +65.4%       0.77        sched_debug.cpu.nr_running.avg
      0.56 ±  2%     -34.3%       0.37 ±  4%  sched_debug.cpu.nr_running.stddev
   1031696 ±  6%   +1343.5%   14892376        sched_debug.cpu.nr_switches.avg
   1140186 ±  4%   +1348.9%   16520256        sched_debug.cpu.nr_switches.max
    678987 ±  6%   +1130.3%    8353556 ± 15%  sched_debug.cpu.nr_switches.min
     54009 ± 29%   +1497.3%     862684 ± 11%  sched_debug.cpu.nr_switches.stddev
  58611259          +100.0%  1.172e+08        sched_debug.sysctl_sched.sysctl_sched_features
     15.59           -90.5%       1.47 ±  4%  perf-stat.i.MPKI
 1.127e+10 ±  5%    +912.3%  1.141e+11        perf-stat.i.branch-instructions
      1.43            -0.5        0.95        perf-stat.i.branch-miss-rate%
 1.572e+08 ±  6%    +529.5%  9.894e+08        perf-stat.i.branch-misses
      1.27 ± 14%     +29.3       30.61 ±  6%  perf-stat.i.cache-miss-rate%
   5226056 ±  7%    +840.6%   49157630 ±  5%  perf-stat.i.cache-misses
 9.044e+08 ±  6%     -76.3%  2.145e+08 ±  2%  perf-stat.i.cache-references
   7735617 ±  6%   +1348.7%  1.121e+08        perf-stat.i.context-switches
      8.96 ±  5%     -82.5%       1.56 ±  4%  perf-stat.i.cpi
 5.207e+11           +17.6%  6.122e+11        perf-stat.i.cpu-cycles
   2826254 ±  6%     -92.6%     210370 ±  5%  perf-stat.i.cpu-migrations
    105045 ±  7%     -82.7%      18181 ±  6%  perf-stat.i.cycles-between-cache-misses
      0.37 ±  7%      -0.3        0.03 ±  7%  perf-stat.i.dTLB-load-miss-rate%
  59575414 ±  9%     -92.4%    4518176 ±  8%  perf-stat.i.dTLB-load-misses
 1.541e+10 ±  5%    +969.1%  1.647e+11        perf-stat.i.dTLB-loads
      0.08            -0.1        0.01 ±  7%  perf-stat.i.dTLB-store-miss-rate%
   6651717 ±  5%     -89.8%     680779 ±  5%  perf-stat.i.dTLB-store-misses
 8.526e+09 ±  6%   +1076.1%  1.003e+11        perf-stat.i.dTLB-stores
 5.657e+10 ±  5%    +905.0%  5.686e+11        perf-stat.i.instructions
      0.14 ±  7%    +536.8%       0.91        perf-stat.i.ipc
      2.32           +17.7%       2.73        perf-stat.i.metric.GHz
     94.15 ±  4%   +1221.0%       1243        perf-stat.i.metric.K/sec
    160.83 ±  5%    +951.6%       1691        perf-stat.i.metric.M/sec
     96.30            +2.7       99.03        perf-stat.i.node-load-miss-rate%
   2066362 ±  8%   +1017.1%   23083138 ±  5%  perf-stat.i.node-load-misses
     78248 ± 11%     -45.9%      42316 ± 11%  perf-stat.i.node-loads
     16.04           -97.7%       0.37        perf-stat.overall.MPKI
      1.39            -0.5        0.87        perf-stat.overall.branch-miss-rate%
      0.57 ± 12%     +22.7       23.25 ±  7%  perf-stat.overall.cache-miss-rate%
      9.27 ±  5%     -88.4%       1.07        perf-stat.overall.cpi
    102426 ±  7%     -87.8%      12467 ±  5%  perf-stat.overall.cycles-between-cache-misses
      0.39 ±  6%      -0.4        0.00 ±  9%  perf-stat.overall.dTLB-load-miss-rate%
      0.08            -0.1        0.00 ±  6%  perf-stat.overall.dTLB-store-miss-rate%
      0.11 ±  5%    +759.8%       0.93        perf-stat.overall.ipc
     96.35            +3.5       99.82        perf-stat.overall.node-load-miss-rate%
 1.094e+10 ±  5%    +927.6%  1.124e+11        perf-stat.ps.branch-instructions
 1.522e+08 ±  6%    +540.5%  9.746e+08        perf-stat.ps.branch-misses
   4975869 ±  6%    +873.2%   48427022 ±  5%  perf-stat.ps.cache-misses
 8.813e+08 ±  6%     -76.3%  2.086e+08 ±  2%  perf-stat.ps.cache-references
   7529020 ±  6%   +1367.2%  1.105e+08        perf-stat.ps.context-switches
    216929            +1.3%     219851        perf-stat.ps.cpu-clock
 5.072e+11           +18.7%  6.019e+11        perf-stat.ps.cpu-cycles
   2754210 ±  6%     -92.8%     198251 ±  5%  perf-stat.ps.cpu-migrations
  58032406 ±  9%     -92.6%    4269838 ±  8%  perf-stat.ps.dTLB-load-misses
 1.496e+10 ±  5%    +984.7%  1.623e+11        perf-stat.ps.dTLB-loads
   6474992 ±  5%     -90.0%     647675 ±  5%  perf-stat.ps.dTLB-store-misses
 8.286e+09 ±  6%   +1092.6%  9.882e+10        perf-stat.ps.dTLB-stores
 5.491e+10 ±  5%    +920.3%  5.602e+11        perf-stat.ps.instructions
   1984455 ±  7%   +1046.4%   22749741 ±  5%  perf-stat.ps.node-load-misses
     74637 ± 12%     -44.7%      41249 ± 11%  perf-stat.ps.node-loads
    216929            +1.3%     219851        perf-stat.ps.task-clock
 3.397e+12 ±  6%    +921.9%  3.471e+13        perf-stat.total.instructions
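
For readers cross-checking the table, the %change column appears to follow two
conventions: most rows show the relative change between the two commits, while
rows whose metric is itself a percentage (for example mpstat.cpu.all.idle%)
show the absolute difference in percentage points, printed without a "%" sign.
This is inferred from the numbers above rather than taken from lkp-tests
documentation, so treat the sketch below as an assumption. The function names
are illustrative only.

```python
def percent_change(base: float, patched: float) -> float:
    """Relative change from base to patched, in percent."""
    return (patched - base) / base * 100.0


def point_delta(base: float, patched: float) -> float:
    """Absolute difference, used here for metrics already expressed in %."""
    return patched - base


# stress-ng.switch.ops_per_sec: 2483795 (dac54350b7) vs 58370891 (f153e964b7)
ops_change = percent_change(2483795, 58370891)

# mpstat.cpu.all.idle%: 29.58 vs 7.89
idle_delta = point_delta(29.58, 7.89)

print(f"stress-ng.switch.ops_per_sec: {ops_change:+.1f}%")  # +2250.1%
print(f"mpstat.cpu.all.idle%:         {idle_delta:+.1f}")   # -21.7
```

Both results match the values reported in the table. Rows where one side is
printed in rounded scientific notation (for example 1.071e+08) can only be
reproduced approximately from the figures shown.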



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

View attachment "config-6.3.0-rc3-00015-gf153e964b7d2" of type "text/plain" (157282 bytes)

View attachment "job-script" of type "text/plain" (8275 bytes)

View attachment "job.yaml" of type "text/plain" (5926 bytes)

View attachment "reproduce" of type "text/plain" (384 bytes)
