Message-ID: <202502191317.d0050992-lkp@intel.com>
Date: Wed, 19 Feb 2025 13:46:21 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Mateusz Guzik <mjguzik@...il.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, Christian Brauner
	<brauner@...nel.org>, Oleg Nesterov <oleg@...hat.com>, "Liam R. Howlett"
	<Liam.Howlett@...cle.com>, <linux-kernel@...r.kernel.org>,
	<oliver.sang@...el.com>
Subject: [linux-next:master] [pid]  7903f907a2: stress-ng.pthread.ops_per_sec
 23.4% improvement



Hello,

kernel test robot noticed a 23.4% improvement of stress-ng.pthread.ops_per_sec on:


commit: 7903f907a226058ed99f86e9924e082aea57fc45 ("pid: perform free_pid() calls outside of tasklist_lock")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
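
For context on where the win comes from: tasklist_lock is write-held on both the clone() and exit() paths, so any work done inside the critical section directly extends the window other CPUs spin in queued_write_lock_slowpath. Below is a minimal userspace analogue of the pattern the commit applies (illustrative names only, not the kernel code; the actual change defers the free_pid() calls made while a task's pids are detached until after the lock is dropped):

#include <pthread.h>
#include <stdlib.h>

static pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;

/* old pattern: the allocator runs while the write lock is held */
static void release_locked(void *obj)
{
	pthread_rwlock_wrlock(&lock);
	/* ... unlink obj from the shared structure ... */
	free(obj);			/* lengthens the critical section */
	pthread_rwlock_unlock(&lock);
}

/* new pattern: unlink under the lock, free once it is dropped */
static void release_then_free(void *obj)
{
	pthread_rwlock_wrlock(&lock);
	/* ... unlink obj from the shared structure ... */
	pthread_rwlock_unlock(&lock);
	free(obj);			/* off the contended path */
}

int main(void)
{
	release_locked(malloc(64));
	release_then_free(malloc(64));
	return 0;
}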


testcase: stress-ng
config: x86_64-rhel-9.4
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 256G memory
parameters:

	nr_threads: 100%
	testtime: 60s
	test: pthread
	cpufreq_governor: performance
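
For reference, each stress-ng pthread "op" is roughly one thread create/join round trip, so ops_per_sec is dominated by the clone()/exit() cost, which is exactly where tasklist_lock is taken. A rough sketch of such a measurement loop under the 60s testtime configured above (illustrative, not stress-ng's actual source):

#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

static void *worker(void *arg)
{
	return arg;			/* thread exits immediately */
}

int main(void)
{
	uint64_t ops = 0;
	time_t end = time(NULL) + 60;	/* testtime: 60s */

	while (time(NULL) < end) {
		pthread_t t;

		if (pthread_create(&t, NULL, worker, NULL))
			break;
		pthread_join(t, NULL);	/* reap: exercises the exit path */
		ops++;
	}
	printf("ops_per_sec: %.1f\n", ops / 60.0);
	return 0;
}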


The commit also has a significant impact on the following test:

+------------------+---------------------------------------------------------------------------------------------+
| testcase: change | stress-ng: stress-ng.vfork.ops_per_sec 28.7% improvement                                    |
| test machine     | 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory |
| test parameters  | cpufreq_governor=performance                                                                |
|                  | nr_threads=100%                                                                             |
|                  | test=vfork                                                                                  |
|                  | testtime=60s                                                                                |
+------------------+---------------------------------------------------------------------------------------------+
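
The vfork stressor hits the patched path even more directly: each op is a full process create/reap, i.e. one pid allocated on clone and one freed on reap per iteration. A rough sketch of the measured loop (again illustrative, not stress-ng source):

#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
	for (int i = 0; i < 100000; i++) {
		pid_t pid = vfork();

		if (pid == 0)
			_exit(0);		/* child: only _exit/exec are safe after vfork */
		else if (pid > 0)
			waitpid(pid, NULL, 0);	/* parent reaps; the pid is freed here */
	}
	return 0;
}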




Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250219/202502191317.d0050992-lkp@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-r02/pthread/stress-ng/60s

commit: 
  74198dc206 ("pid: sprinkle tasklist_lock asserts")
  7903f907a2 ("pid: perform free_pid() calls outside of tasklist_lock")

74198dc2067b2aa1 7903f907a226058ed99f86e9924 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 5.953e+08 ±  9%     +82.9%  1.089e+09 ±  3%  cpuidle..time
   3067781 ± 17%    +281.8%   11714061 ±  4%  cpuidle..usage
   3156621 ±  7%     -11.8%    2783051 ±  7%  numa-meminfo.node0.AnonPages
    315502 ±  4%     -11.0%     280901 ±  4%  numa-meminfo.node1.PageTables
      2119 ±  4%     -59.4%     861.38 ±  3%  vmstat.procs.r
    695158           +37.7%     957064        vmstat.system.cs
    786439           +58.8%    1248633        vmstat.system.in
    918265           -31.9%     625741 ± 31%  meminfo.AnonHugePages
   9498433 ±  3%     +13.6%   10786868 ±  3%  meminfo.Cached
 1.188e+09           -11.7%  1.049e+09        meminfo.Committed_AS
   5970512 ±  6%     +21.6%    7258946 ±  4%  meminfo.Shmem
      4.38 ± 11%      +3.8        8.20 ±  3%  mpstat.cpu.all.idle%
      0.47            +0.2        0.67        mpstat.cpu.all.irq%
      0.37 ±  6%      +0.4        0.76 ±  5%  mpstat.cpu.all.soft%
      1.47            +0.3        1.82        mpstat.cpu.all.usr%
  39409396           +21.1%   47737561 ±  2%  numa-numastat.node0.local_node
  39517687           +21.1%   47862366 ±  2%  numa-numastat.node0.numa_hit
  39678016           +22.2%   48499008 ±  2%  numa-numastat.node1.local_node
  39806349           +22.1%   48619579 ±  2%  numa-numastat.node1.numa_hit
     11111 ± 20%     +86.8%      20750 ± 10%  perf-c2c.DRAM.local
      8594 ± 16%     +25.6%      10797 ±  7%  perf-c2c.DRAM.remote
     14151 ± 18%    +100.2%      28336 ±  9%  perf-c2c.HITM.local
      3853 ± 16%     +40.3%       5404 ±  7%  perf-c2c.HITM.remote
     18004 ± 18%     +87.4%      33740 ±  9%  perf-c2c.HITM.total
    785387 ±  8%     -10.5%     702556 ±  7%  numa-vmstat.node0.nr_anon_pages
  39519842           +20.9%   47789798 ±  2%  numa-vmstat.node0.numa_hit
  39411551           +20.9%   47665001 ±  2%  numa-vmstat.node0.numa_local
     78603 ±  3%      -9.8%      70878 ±  5%  numa-vmstat.node1.nr_page_table_pages
  39804028           +22.0%   48541084 ±  2%  numa-vmstat.node1.numa_hit
  39675696           +22.0%   48420524 ±  2%  numa-vmstat.node1.numa_local
    304344 ±  7%     -66.2%     102730 ±  5%  stress-ng.pthread.nanosecs_to_start_a_pthread
  10003318           +23.2%   12323193        stress-ng.pthread.ops
    166143           +23.4%     204943        stress-ng.pthread.ops_per_sec
   4793153           +19.3%    5716581        stress-ng.time.involuntary_context_switches
  21587233           +23.1%   26564025        stress-ng.time.minor_page_faults
     13184           +11.2%      14659        stress-ng.time.percent_of_cpu_this_job_got
      7880           +10.4%       8702        stress-ng.time.system_time
    105.74           +51.1%     159.78        stress-ng.time.user_time
  23363531           +24.5%   29091883        stress-ng.time.voluntary_context_switches
   3104817 ±  2%      +7.0%    3322678 ±  2%  proc-vmstat.nr_active_anon
   1610889            -6.3%    1509476 ±  3%  proc-vmstat.nr_anon_pages
    447.53           -31.7%     305.57 ± 31%  proc-vmstat.nr_anon_transparent_hugepages
   2380189 ±  3%     +13.4%    2699415 ±  3%  proc-vmstat.nr_file_pages
   1794253            -3.7%    1727492        proc-vmstat.nr_kernel_stack
    154819            -9.1%     140710        proc-vmstat.nr_page_table_pages
   1498207 ±  5%     +21.3%    1817432 ±  4%  proc-vmstat.nr_shmem
     47516            +2.5%      48728        proc-vmstat.nr_slab_reclaimable
   3104817 ±  2%      +7.0%    3322678 ±  2%  proc-vmstat.nr_zone_active_anon
    550885 ± 15%     +69.4%     932960 ± 11%  proc-vmstat.numa_hint_faults
    293967 ± 27%     +95.8%     575443 ± 19%  proc-vmstat.numa_hint_faults_local
  79375488           +21.6%   96482937        proc-vmstat.numa_hit
  79138861           +21.6%   96237560        proc-vmstat.numa_local
    330580 ±  9%     +27.1%     420192 ±  5%  proc-vmstat.numa_pages_migrated
    808808 ± 11%     +43.0%    1156712 ±  9%  proc-vmstat.numa_pte_updates
  83384617           +26.0%   1.05e+08        proc-vmstat.pgalloc_normal
  22326472           +22.9%   27448052        proc-vmstat.pgfault
  80530234           +26.2%  1.017e+08        proc-vmstat.pgfree
    330580 ±  9%     +27.1%     420192 ±  5%  proc-vmstat.pgmigrate_success
    261994 ±  8%     +39.8%     366207 ±  7%  proc-vmstat.pgreuse
   4612194 ±  2%     +62.7%    7503881        sched_debug.cfs_rq:/.avg_vruntime.avg
   5440180 ± 13%     +85.6%   10099394 ±  2%  sched_debug.cfs_rq:/.avg_vruntime.max
    501155 ± 64%    +329.5%    2152678 ±  6%  sched_debug.cfs_rq:/.avg_vruntime.stddev
      2.13 ±  9%     -47.3%       1.12 ± 18%  sched_debug.cfs_rq:/.h_nr_queued.avg
     44.33 ± 10%     -55.6%      19.67 ± 47%  sched_debug.cfs_rq:/.h_nr_queued.max
      5.09 ±  5%     -53.8%       2.35 ± 26%  sched_debug.cfs_rq:/.h_nr_queued.stddev
      2.09 ±  9%     -47.9%       1.09 ± 19%  sched_debug.cfs_rq:/.h_nr_runnable.avg
     44.25 ± 10%     -55.7%      19.58 ± 47%  sched_debug.cfs_rq:/.h_nr_runnable.max
      5.05 ±  5%     -54.2%       2.31 ± 27%  sched_debug.cfs_rq:/.h_nr_runnable.stddev
   5340703 ± 12%     +85.8%    9925031 ±  2%  sched_debug.cfs_rq:/.left_deadline.max
   2202572 ±  2%     +55.2%    3417743 ±  9%  sched_debug.cfs_rq:/.left_deadline.stddev
   5340659 ± 12%     +85.8%    9924585 ±  2%  sched_debug.cfs_rq:/.left_vruntime.max
   2202531 ±  2%     +55.2%    3417686 ±  9%  sched_debug.cfs_rq:/.left_vruntime.stddev
    313473 ±  6%     -24.8%     235882 ± 22%  sched_debug.cfs_rq:/.load.avg
   4612199 ±  2%     +62.7%    7503887        sched_debug.cfs_rq:/.min_vruntime.avg
   5440184 ± 13%     +85.6%   10099394 ±  2%  sched_debug.cfs_rq:/.min_vruntime.max
    501154 ± 64%    +329.5%    2152680 ±  6%  sched_debug.cfs_rq:/.min_vruntime.stddev
      0.60 ±  6%     -19.5%       0.49 ± 13%  sched_debug.cfs_rq:/.nr_queued.avg
   5340667 ± 12%     +85.8%    9924585 ±  2%  sched_debug.cfs_rq:/.right_vruntime.max
   2202534 ±  2%     +55.2%    3417691 ±  9%  sched_debug.cfs_rq:/.right_vruntime.stddev
    364.26 ±  3%     +16.6%     424.72 ±  2%  sched_debug.cfs_rq:/.util_avg.avg
      1206 ± 23%     +53.8%       1856 ± 26%  sched_debug.cfs_rq:/.util_est.max
    209.57 ±  9%     +27.9%     268.09 ± 11%  sched_debug.cfs_rq:/.util_est.stddev
    360185 ±  5%     +68.1%     605388 ± 15%  sched_debug.cpu.curr->pid.avg
    401600 ±  3%    +120.0%     883327 ±  5%  sched_debug.cpu.curr->pid.stddev
      2.13 ± 10%     -47.0%       1.13 ± 18%  sched_debug.cpu.nr_running.avg
     44.25 ± 10%     -55.6%      19.67 ± 47%  sched_debug.cpu.nr_running.max
      5.08 ±  5%     -53.8%       2.35 ± 25%  sched_debug.cpu.nr_running.stddev
     98005           +37.5%     134753        sched_debug.cpu.nr_switches.avg
    178454 ±  8%    +106.9%     369189 ±  4%  sched_debug.cpu.nr_switches.max
     16050 ± 34%    +376.0%      76393 ±  3%  sched_debug.cpu.nr_switches.stddev
      3.76           +13.7%       4.27        perf-stat.i.MPKI
 1.873e+10            +6.2%  1.989e+10        perf-stat.i.branch-instructions
      0.61            +0.1        0.69        perf-stat.i.branch-miss-rate%
 1.096e+08           +21.8%  1.335e+08        perf-stat.i.branch-misses
     40.32            -2.7       37.62        perf-stat.i.cache-miss-rate%
 3.087e+08           +22.7%  3.787e+08        perf-stat.i.cache-misses
 7.635e+08           +31.5%  1.004e+09        perf-stat.i.cache-references
    712864           +38.1%     984398        perf-stat.i.context-switches
      7.63           -10.6%       6.82        perf-stat.i.cpi
 6.279e+11            -3.7%  6.047e+11        perf-stat.i.cpu-cycles
      2027           -21.4%       1593        perf-stat.i.cycles-between-cache-misses
 8.232e+10            +7.9%  8.881e+10        perf-stat.i.instructions
      0.14           +10.8%       0.15        perf-stat.i.ipc
      8.13           +26.5%      10.29        perf-stat.i.metric.K/sec
    369735           +22.0%     450981        perf-stat.i.minor-faults
    532034           +22.5%     651748        perf-stat.i.page-faults
      3.76           +13.3%       4.26        perf-stat.overall.MPKI
      0.58            +0.1        0.67        perf-stat.overall.branch-miss-rate%
     40.43            -2.7       37.76        perf-stat.overall.cache-miss-rate%
      7.66           -11.4%       6.79        perf-stat.overall.cpi
      2038           -21.8%       1594        perf-stat.overall.cycles-between-cache-misses
      0.13           +12.8%       0.15        perf-stat.overall.ipc
 1.821e+10            +7.3%  1.954e+10        perf-stat.ps.branch-instructions
 1.057e+08           +23.2%  1.302e+08        perf-stat.ps.branch-misses
 3.007e+08           +23.6%  3.717e+08        perf-stat.ps.cache-misses
 7.438e+08           +32.4%  9.845e+08        perf-stat.ps.cache-references
    696299           +38.7%     965478        perf-stat.ps.context-switches
 6.131e+11            -3.4%  5.925e+11        perf-stat.ps.cpu-cycles
     8e+10            +9.0%  8.724e+10        perf-stat.ps.instructions
    356195           +23.6%     440270        perf-stat.ps.minor-faults
    514755           +23.8%     637135        perf-stat.ps.page-faults
 4.867e+12            +9.3%  5.319e+12        perf-stat.total.instructions
     74.42 ± 44%     -60.3       14.16 ±223%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
     74.41 ± 44%     -60.3       14.16 ±223%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
     46.44 ± 44%     -41.7        4.73 ±223%  perf-profile.calltrace.cycles-pp.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
     46.44 ± 44%     -41.7        4.73 ±223%  perf-profile.calltrace.cycles-pp.__x64_sys_exit.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
     46.43 ± 44%     -41.7        4.72 ±223%  perf-profile.calltrace.cycles-pp.do_exit.__x64_sys_exit.x64_sys_call.do_syscall_64.entry_SYSCALL_64_after_hwframe
     45.72 ± 44%     -41.2        4.50 ±223%  perf-profile.calltrace.cycles-pp.exit_notify.do_exit.__x64_sys_exit.x64_sys_call.do_syscall_64
     23.46 ± 44%     -23.5        0.00        perf-profile.calltrace.cycles-pp.queued_write_lock_slowpath.exit_notify.do_exit.__x64_sys_exit.x64_sys_call
     23.34 ± 44%     -23.3        0.00        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath.queued_write_lock_slowpath.exit_notify.do_exit.__x64_sys_exit
     23.33 ± 45%     -23.3        0.00        perf-profile.calltrace.cycles-pp.queued_write_lock_slowpath.copy_process.kernel_clone.__do_sys_clone3.do_syscall_64
     23.24 ± 45%     -23.2        0.00        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath.queued_write_lock_slowpath.copy_process.kernel_clone.__do_sys_clone3
     21.68 ± 44%     -21.7        0.00        perf-profile.calltrace.cycles-pp.queued_write_lock_slowpath.release_task.exit_notify.do_exit.__x64_sys_exit
     21.54 ± 44%     -21.5        0.00        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath.queued_write_lock_slowpath.release_task.exit_notify.do_exit
     27.26 ± 45%     -18.0        9.26 ±223%  perf-profile.calltrace.cycles-pp.__do_sys_clone3.do_syscall_64.entry_SYSCALL_64_after_hwframe
     27.26 ± 45%     -18.0        9.26 ±223%  perf-profile.calltrace.cycles-pp.kernel_clone.__do_sys_clone3.do_syscall_64.entry_SYSCALL_64_after_hwframe
     22.09 ± 44%     -17.6        4.45 ±223%  perf-profile.calltrace.cycles-pp.release_task.exit_notify.do_exit.__x64_sys_exit.x64_sys_call
     26.16 ± 45%     -17.2        8.99 ±223%  perf-profile.calltrace.cycles-pp.copy_process.kernel_clone.__do_sys_clone3.do_syscall_64.entry_SYSCALL_64_after_hwframe
      5.24 ± 47%      -3.9        1.38 ±223%  perf-profile.calltrace.cycles-pp.__madvise
      5.24 ± 47%      -3.9        1.38 ±223%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
      5.24 ± 47%      -3.9        1.38 ±223%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__madvise
      5.24 ± 47%      -3.9        1.38 ±223%  perf-profile.calltrace.cycles-pp.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
      5.24 ± 47%      -3.9        1.38 ±223%  perf-profile.calltrace.cycles-pp.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe.__madvise
      5.18 ± 47%      -3.8        1.37 ±223%  perf-profile.calltrace.cycles-pp.madvise_vma_behavior.do_madvise.__x64_sys_madvise.do_syscall_64.entry_SYSCALL_64_after_hwframe
      5.18 ± 47%      -3.8        1.36 ±223%  perf-profile.calltrace.cycles-pp.zap_page_range_single.madvise_vma_behavior.do_madvise.__x64_sys_madvise.do_syscall_64
      5.08 ± 47%      -3.7        1.34 ±223%  perf-profile.calltrace.cycles-pp.flush_tlb_mm_range.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior.do_madvise
      5.08 ± 47%      -3.7        1.34 ±223%  perf-profile.calltrace.cycles-pp.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior.do_madvise.__x64_sys_madvise
      5.07 ± 47%      -3.7        1.34 ±223%  perf-profile.calltrace.cycles-pp.on_each_cpu_cond_mask.flush_tlb_mm_range.tlb_finish_mmu.zap_page_range_single.madvise_vma_behavior
      5.06 ± 47%      -3.7        1.33 ±223%  perf-profile.calltrace.cycles-pp.smp_call_function_many_cond.on_each_cpu_cond_mask.flush_tlb_mm_range.tlb_finish_mmu.zap_page_range_single
     68.48 ± 44%     -68.4        0.09 ±223%  perf-profile.children.cycles-pp.queued_write_lock_slowpath
     81.41 ± 44%     -65.4       16.02 ±223%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     81.40 ± 44%     -65.4       16.01 ±223%  perf-profile.children.cycles-pp.do_syscall_64
     70.40 ± 44%     -57.1       13.32 ±223%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     46.45 ± 44%     -41.7        4.73 ±223%  perf-profile.children.cycles-pp.x64_sys_call
     46.44 ± 44%     -41.7        4.73 ±223%  perf-profile.children.cycles-pp.do_exit
     46.44 ± 44%     -41.7        4.73 ±223%  perf-profile.children.cycles-pp.__x64_sys_exit
     45.74 ± 44%     -41.2        4.50 ±223%  perf-profile.children.cycles-pp.exit_notify
     27.26 ± 45%     -18.0        9.26 ±223%  perf-profile.children.cycles-pp.__do_sys_clone3
     27.26 ± 45%     -18.0        9.26 ±223%  perf-profile.children.cycles-pp.kernel_clone
     22.11 ± 44%     -17.7        4.45 ±223%  perf-profile.children.cycles-pp.release_task
     26.18 ± 45%     -17.2        8.99 ±223%  perf-profile.children.cycles-pp.copy_process
      5.38 ± 47%      -4.0        1.38 ±223%  perf-profile.children.cycles-pp.tlb_finish_mmu
      5.30 ± 47%      -3.9        1.36 ±223%  perf-profile.children.cycles-pp.on_each_cpu_cond_mask
      5.30 ± 47%      -3.9        1.36 ±223%  perf-profile.children.cycles-pp.smp_call_function_many_cond
      5.30 ± 47%      -3.9        1.37 ±223%  perf-profile.children.cycles-pp.flush_tlb_mm_range
      5.25 ± 47%      -3.9        1.38 ±223%  perf-profile.children.cycles-pp.__madvise
      5.24 ± 47%      -3.9        1.38 ±223%  perf-profile.children.cycles-pp.__x64_sys_madvise
      5.24 ± 47%      -3.9        1.38 ±223%  perf-profile.children.cycles-pp.do_madvise
      5.18 ± 47%      -3.8        1.37 ±223%  perf-profile.children.cycles-pp.madvise_vma_behavior
      5.18 ± 47%      -3.8        1.36 ±223%  perf-profile.children.cycles-pp.zap_page_range_single
     70.39 ± 44%     -57.1       13.32 ±223%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      5.16 ± 47%      -3.9        1.30 ±223%  perf-profile.self.cycles-pp.smp_call_function_many_cond


***************************************************************************************************
lkp-spr-2sp4: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480CTDX (Sapphire Rapids) with 512G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/nr_threads/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-9.4/100%/debian-12-x86_64-20240206.cgz/lkp-spr-2sp4/vfork/stress-ng/60s

commit: 
  74198dc206 ("pid: sprinkle tasklist_lock asserts")
  7903f907a2 ("pid: perform free_pid() calls outside of tasklist_lock")

74198dc2067b2aa1 7903f907a226058ed99f86e9924 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   6562366 ±  8%     +37.0%    8993652 ± 10%  cpuidle..usage
      0.29            +0.1        0.39        mpstat.cpu.all.soft%
    486692           +31.8%     641303        vmstat.system.cs
    506323            +4.8%     530409        vmstat.system.in
   4004574 ±  3%      +8.7%    4353640 ±  3%  meminfo.Active
   4004574 ±  3%      +8.7%    4353640 ±  3%  meminfo.Active(anon)
   2657761 ±  6%     +15.5%    3069404 ±  5%  meminfo.Shmem
   3257759 ± 11%     +14.3%    3724594 ±  7%  numa-meminfo.node1.Active
   3257759 ± 11%     +14.3%    3724594 ±  7%  numa-meminfo.node1.Active(anon)
   2492828 ±  9%     +21.0%    3017306 ±  6%  numa-meminfo.node1.Shmem
   9063611 ±  2%     +36.5%   12368884 ±  9%  numa-numastat.node0.local_node
   9220375 ±  2%     +35.7%   12513653 ±  9%  numa-numastat.node0.numa_hit
  10168176           +28.3%   13044773        numa-numastat.node1.local_node
  10243149           +28.2%   13131946        numa-numastat.node1.numa_hit
      5700 ±  8%     +47.9%       8432 ± 11%  perf-c2c.DRAM.remote
     14297 ±  7%     +42.5%      20373 ± 12%  perf-c2c.HITM.local
      3624 ±  8%     +54.4%       5597 ± 11%  perf-c2c.HITM.remote
     17922 ±  7%     +44.9%      25970 ± 12%  perf-c2c.HITM.total
     51838 ± 45%     -56.5%      22543 ±105%  numa-vmstat.node0.nr_mapped
   9221619 ±  2%     +35.2%   12469913 ±  9%  numa-vmstat.node0.numa_hit
   9064856 ±  2%     +36.0%   12325144 ± 10%  numa-vmstat.node0.numa_local
    623443 ±  9%     +20.6%     752138 ±  6%  numa-vmstat.node1.nr_shmem
  10243633           +27.8%   13088671        numa-vmstat.node1.numa_hit
  10168660           +27.9%   13001498        numa-vmstat.node1.numa_local
   1378378           +18.3%    1630343        stress-ng.time.involuntary_context_switches
     10647            -3.1%      10321        stress-ng.time.system_time
      1838           +13.8%       2092        stress-ng.time.user_time
  16431508           +30.8%   21498222        stress-ng.time.voluntary_context_switches
   8890752           +28.7%   11442483        stress-ng.vfork.ops
    148177           +28.7%     190706        stress-ng.vfork.ops_per_sec
   1000826 ±  3%      +8.9%    1090125 ±  3%  proc-vmstat.nr_active_anon
   1545626 ±  2%      +6.8%    1650840 ±  2%  proc-vmstat.nr_file_pages
    120475            +2.9%     124024        proc-vmstat.nr_mapped
    663632 ±  6%     +15.9%     768846 ±  5%  proc-vmstat.nr_shmem
   1000826 ±  3%      +8.9%    1090125 ±  3%  proc-vmstat.nr_zone_active_anon
  19510114           +31.5%   25647538 ±  4%  proc-vmstat.numa_hit
  19278378           +31.8%   25415597 ±  4%  proc-vmstat.numa_local
  22280233           +32.9%   29608930 ±  4%  proc-vmstat.pgalloc_normal
  20644303           +35.1%   27885848 ±  4%  proc-vmstat.pgfree
      1.03           +18.9%       1.22 ±  2%  perf-stat.i.MPKI
 1.703e+10            +6.2%  1.809e+10        perf-stat.i.branch-instructions
      0.53 ±  2%      +0.1        0.59 ±  4%  perf-stat.i.branch-miss-rate%
  88001361 ±  3%     +17.3%  1.032e+08 ±  5%  perf-stat.i.branch-misses
  74412375           +27.9%   95182974        perf-stat.i.cache-misses
 7.674e+08 ±  3%     +26.4%  9.698e+08 ±  4%  perf-stat.i.cache-references
    503132           +32.0%     664329        perf-stat.i.context-switches
      8.49            -7.5%       7.85        perf-stat.i.cpi
    112807 ±  2%     +23.7%     139583 ±  5%  perf-stat.i.cpu-migrations
      8617           -23.1%       6627        perf-stat.i.cycles-between-cache-misses
 7.368e+10            +7.4%  7.917e+10        perf-stat.i.instructions
      0.12            +8.3%       0.13        perf-stat.i.ipc
      2.25           +31.7%       2.97        perf-stat.i.metric.K/sec
      1.02           +18.9%       1.21        perf-stat.overall.MPKI
      0.50 ±  2%      +0.1        0.56 ±  3%  perf-stat.overall.branch-miss-rate%
      8.55            -7.5%       7.91        perf-stat.overall.cpi
      8374           -22.2%       6517        perf-stat.overall.cycles-between-cache-misses
      0.12            +8.1%       0.13        perf-stat.overall.ipc
 1.655e+10            +6.2%  1.758e+10        perf-stat.ps.branch-instructions
  82996740 ±  3%     +17.8%   97762479 ±  5%  perf-stat.ps.branch-misses
  73065238           +27.7%   93297913        perf-stat.ps.cache-misses
 7.509e+08 ±  3%     +26.3%  9.487e+08 ±  4%  perf-stat.ps.cache-references
    491567           +32.0%     649035        perf-stat.ps.context-switches
    110242 ±  2%     +23.6%     136250 ±  4%  perf-stat.ps.cpu-migrations
 7.159e+10            +7.4%   7.69e+10        perf-stat.ps.instructions
     11850 ±  2%      +6.0%      12559 ±  3%  perf-stat.ps.minor-faults
     11850 ±  2%      +6.0%      12559 ±  3%  perf-stat.ps.page-faults
 4.334e+12            +8.1%  4.684e+12        perf-stat.total.instructions
      0.55 ± 10%     -29.3%       0.39 ± 13%  perf-sched.sch_delay.avg.ms.__cond_resched.__kmalloc_cache_node_noprof.__get_vm_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
      0.80 ±  3%     -31.4%       0.55 ±  6%  perf-sched.sch_delay.avg.ms.__cond_resched.__kmalloc_node_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
      0.94 ±  3%     -31.1%       0.65 ±  2%  perf-sched.sch_delay.avg.ms.__cond_resched.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node.dup_task_struct
      0.30 ±  2%     -14.5%       0.26 ±  4%  perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
      0.37           -28.9%       0.27        perf-sched.sch_delay.avg.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
      0.81 ± 12%     -28.8%       0.58 ± 10%  perf-sched.sch_delay.avg.ms.__cond_resched.alloc_pages_bulk_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
      0.76 ±  4%     -43.4%       0.43 ±  3%  perf-sched.sch_delay.avg.ms.__cond_resched.cgroup_css_set_fork.cgroup_can_fork.copy_process.kernel_clone
      0.42 ± 16%     -45.4%       0.23 ± 15%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.dup_task_struct.copy_process.kernel_clone
      0.81           -38.6%       0.50 ±  5%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.alloc_pid.copy_process.kernel_clone
      0.92           -31.7%       0.63 ±  8%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_fs_struct.copy_process.kernel_clone
      0.87 ±  3%     -33.4%       0.58 ±  8%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_sighand.copy_process.kernel_clone
      0.86 ±  8%     -32.5%       0.58 ±  7%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_signal.copy_process.kernel_clone
      0.96 ±  5%     -36.0%       0.61 ±  4%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.dup_fd.copy_process.kernel_clone
      0.85           -38.0%       0.53 ±  3%  perf-sched.sch_delay.avg.ms.__cond_resched.kmem_cache_alloc_noprof.prepare_creds.copy_creds.copy_process
      0.34 ± 33%     -57.1%       0.15 ± 82%  perf-sched.sch_delay.avg.ms.__cond_resched.kvfree_rcu_drain_ready.kfree_rcu_monitor.process_one_work.worker_thread
      0.04 ±  3%     -20.9%       0.04 ±  6%  perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.17 ±  9%     -31.5%       0.11 ± 16%  perf-sched.sch_delay.avg.ms.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown].[unknown]
      0.23           -18.1%       0.19        perf-sched.sch_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      0.30           -20.7%       0.24 ±  2%  perf-sched.sch_delay.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
      0.10 ±  6%     -18.2%       0.08 ±  5%  perf-sched.sch_delay.avg.ms.syscall_exit_to_user_mode.ret_from_fork_asm.[unknown].[unknown]
      0.13           -18.4%       0.11 ±  2%  perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      1.64 ± 33%     -34.6%       1.07 ± 20%  perf-sched.sch_delay.max.ms.__cond_resched.kmem_cache_alloc_node_noprof.dup_task_struct.copy_process.kernel_clone
      0.43 ± 28%     -41.7%       0.25 ± 31%  perf-sched.sch_delay.max.ms.__cond_resched.mmput.exit_mm.do_exit.__x64_sys_exit
      0.78 ± 19%     -42.2%       0.45 ± 25%  perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part
      0.13           -20.3%       0.10        perf-sched.total_sch_delay.average.ms
     59.45 ± 12%     -21.3%      46.77 ±  9%  perf-sched.total_sch_delay.max.ms
      2.32           -18.5%       1.89        perf-sched.total_wait_and_delay.average.ms
   1656374           +26.0%    2087010        perf-sched.total_wait_and_delay.count.ms
      2.20           -18.4%       1.79        perf-sched.total_wait_time.average.ms
      0.90           -26.7%       0.66        perf-sched.wait_and_delay.avg.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
     20.62 ±  6%     -43.0%      11.74 ±  2%  perf-sched.wait_and_delay.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.17 ±  2%     -18.4%       0.14 ±  5%  perf-sched.wait_and_delay.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
     60.43 ± 19%     +76.4%     106.62 ± 33%  perf-sched.wait_and_delay.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
      0.65           -18.1%       0.53        perf-sched.wait_and_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
     56.03 ±  3%     -45.1%      30.75 ±  2%  perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.89 ±  3%     -17.5%       0.73 ±  7%  perf-sched.wait_and_delay.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
     10.82           -15.3%       9.17        perf-sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
     33654            -9.5%      30471        perf-sched.wait_and_delay.count.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
      1689 ±  8%    +168.2%       4529 ±  8%  perf-sched.wait_and_delay.count.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     59.50 ±  6%     +39.5%      83.00 ± 11%  perf-sched.wait_and_delay.count.__cond_resched.vunmap_p4d_range.__vunmap_range_noflush.remove_vm_area.vfree
    675414           +24.7%     842197        perf-sched.wait_and_delay.count.do_task_dead.do_exit.__x64_sys_exit.x64_sys_call.do_syscall_64
     69934 ±  4%     +46.4%     102383 ±  6%  perf-sched.wait_and_delay.count.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1118 ± 19%     -36.7%     708.00 ± 28%  perf-sched.wait_and_delay.count.pipe_read.vfs_read.ksys_read.do_syscall_64
    652564           +25.8%     821118        perf-sched.wait_and_delay.count.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
     36347 ±  3%     +89.4%      68847 ±  2%  perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     62439           +16.9%      72971        perf-sched.wait_and_delay.count.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
    104431           +18.2%     123395        perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      3.18 ±183%     -87.1%       0.41 ± 14%  perf-sched.wait_time.avg.ms.__cond_resched.__kmalloc_cache_node_noprof.__get_vm_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
      0.83 ±  3%     -30.1%       0.58 ±  6%  perf-sched.wait_time.avg.ms.__cond_resched.__kmalloc_node_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
      1.28 ± 57%     -47.5%       0.67 ±  2%  perf-sched.wait_time.avg.ms.__cond_resched.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node.dup_task_struct
      0.52           -25.1%       0.39        perf-sched.wait_time.avg.ms.__cond_resched.__wait_for_common.wait_for_completion_state.kernel_clone.__x64_sys_vfork
      0.85 ± 12%     -34.9%       0.55 ± 17%  perf-sched.wait_time.avg.ms.__cond_resched.alloc_pages_bulk_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
      0.80 ±  5%     -37.6%       0.50        perf-sched.wait_time.avg.ms.__cond_resched.cgroup_css_set_fork.cgroup_can_fork.copy_process.kernel_clone
      0.79 ± 26%     -37.0%       0.50 ± 19%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.alloc_vmap_area.__get_vm_area_node.__vmalloc_node_range_noprof
      0.51 ±  9%     -42.1%       0.30 ± 12%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_node_noprof.dup_task_struct.copy_process.kernel_clone
      0.94           -31.8%       0.64 ±  2%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_fs_struct.copy_process.kernel_clone
      0.90 ±  2%     -32.1%       0.61 ±  6%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_sighand.copy_process.kernel_clone
      0.89 ±  8%     -31.4%       0.61 ±  6%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.copy_signal.copy_process.kernel_clone
      0.96 ±  2%     -33.2%       0.64 ±  4%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.dup_fd.copy_process.kernel_clone
      0.88           -34.6%       0.57 ±  2%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_noprof.prepare_creds.copy_creds.copy_process
     20.58 ±  6%     -43.1%      11.71 ±  2%  perf-sched.wait_time.avg.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.13 ±  3%     -17.6%       0.11 ±  4%  perf-sched.wait_time.avg.ms.do_wait.kernel_wait4.do_syscall_64.entry_SYSCALL_64_after_hwframe
     60.37 ± 19%     +76.5%     106.54 ± 33%  perf-sched.wait_time.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
      0.41           -17.9%       0.34        perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
     55.91 ±  3%     -45.2%      30.65 ±  2%  perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.58 ±  6%     -15.8%       0.49 ± 11%  perf-sched.wait_time.avg.ms.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown]
     10.69           -15.3%       9.06        perf-sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      1.25           -23.3%       0.96 ± 13%  perf-sched.wait_time.max.ms.__cond_resched.alloc_pages_bulk_noprof.__vmalloc_area_node.__vmalloc_node_range_noprof.alloc_thread_stack_node
      1.65 ± 34%     -34.4%       1.08 ± 19%  perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_node_noprof.dup_task_struct.copy_process.kernel_clone
     44.32 ± 19%     -26.5%      32.59 ± 11%  perf-sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

