Message-ID: <20160622212415.GA24150@cmpxchg.org>
Date:	Wed, 22 Jun 2016 17:24:15 -0400
From:	Johannes Weiner <hannes@...xchg.org>
To:	Ye Xiaolong <xiaolong.ye@...el.com>
Cc:	Rik van Riel <riel@...hat.com>, lkp@...org,
	LKML <linux-kernel@...r.kernel.org>,
	Mel Gorman <mgorman@...e.de>,
	David Rientjes <rientjes@...gle.com>,
	Joonsoo Kim <iamjoonsoo.kim@....com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [LKP] [lkp] [mm] 795ae7a0de: pixz.throughput -9.1% regression

Hi,

On Wed, Jun 08, 2016 at 01:37:26PM +0800, Ye Xiaolong wrote:
> On Tue, Jun 07, 2016 at 05:56:27PM -0400, Johannes Weiner wrote:
> >But just to make sure I'm looking at the right code, can you first try
> >the following patch on top of Linus's current tree and see if that
> >gets performance back to normal? It's a partial revert of the
> >watermarks that singles out the fair zone allocator:
> 
> Seems that this patch doesn't help to get performance back.
> I've attached the comparison result among 3ed3a4f, 795ae7a0, v4.7-rc2 and
> 1fe49ba5 ("mm: revert fairness batching to before the watermarks were")
> with perf profile information.  You can find it via searching 'perf-profile'.

Sorry for the delay, and thank you for running these. I still can't
reproduce this.
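For anyone reading the comparison below: each commit column shows the run-to-run spread (%stddev, i.e. standard deviation as a percentage of the mean) and the relative change of the mean against the base commit 3ed3a4f0 (%change). A minimal sketch of that arithmetic, assuming this is how the lkp report derives its columns (the run values here are made up for illustration):

```python
# Sketch of the %change / %stddev columns in an lkp comparison report.
# Assumption: %change is the relative difference of the per-commit mean
# against the base commit's mean; %stddev is the spread across runs.
from statistics import mean, stdev

def pct_change(base_runs, new_runs):
    """Percent change of the mean vs. the base commit's mean."""
    b, n = mean(base_runs), mean(new_runs)
    return (n - b) / b * 100.0

def pct_stddev(runs):
    """Standard deviation as a percentage of the mean."""
    return stdev(runs) / mean(runs) * 100.0

# Hypothetical pixz.throughput samples for 4 runs on two commits:
base = [78_500_000, 78_510_000, 78_505_000, 78_506_000]
new = [71_300_000, 71_295_000, 71_298_000, 71_299_000]
print(f"{pct_change(base, new):+.1f}%")  # prints "-9.2%"
```

A tight %stddev (the "± 0%" entries) is what makes a -9% throughput delta trustworthy; rows with ±100% or more spread are noise.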

> 3ed3a4f0ddffece9 795ae7a0de6b834a0cc202aa55                   v4.7-rc2 1fe49ba5002a50aefd5b6c4913
> ---------------- -------------------------- -------------------------- --------------------------
>        fail:runs  %reproduction    fail:runs  %reproduction    fail:runs  %reproduction    fail:runs
>            |             |             |             |             |             |             |
>            :4            0%            :7            0%            :4           50%           2:4     kmsg.DHCP/BOOTP:Reply_not_for_us,op[#]xid[#]
>            :4           50%           2:7            0%            :4            0%            :4     kmsg.Spurious_LAPIC_timer_interrupt_on_cpu
>            :4            0%            :7           14%           1:4           25%           1:4     kmsg.igb#:#:#:exceed_max#second
>          %stddev     %change         %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \          |                \
>   78505362 ±  0%      -9.2%   71298182 ±  0%     -11.8%   69280014 ±  0%      -9.1%   71350485 ±  0%  pixz.throughput
>    5586220 ±  2%      -1.6%    5498492 ±  2%      +6.5%    5950210 ±  1%      +8.4%    6052963 ±  1%  pixz.time.involuntary_context_switches
>    4582198 ±  2%      -3.6%    4416275 ±  2%      -8.6%    4189304 ±  4%      -8.0%    4214839 ±  0%  pixz.time.minor_page_faults
>       4530 ±  0%      +1.0%       4575 ±  0%      -1.6%       4458 ±  0%      -1.3%       4469 ±  0%  pixz.time.percent_of_cpu_this_job_got
>      92.03 ±  0%      +5.6%      97.23 ± 11%     +31.3%     120.83 ±  1%     +30.4%     119.98 ±  0%  pixz.time.system_time
>      14911 ±  0%      +2.1%      15218 ±  0%      -1.0%      14759 ±  1%      -1.0%      14764 ±  0%  pixz.time.user_time
>    6586930 ±  0%      -8.4%    6033444 ±  1%      -4.4%    6295529 ±  1%      -2.6%    6416460 ±  1%  pixz.time.voluntary_context_switches
>    2179703 ±  4%      +4.8%    2285049 ±  2%     -15.3%    1846752 ± 16%      -8.2%    2000913 ±  4%  softirqs.RCU
>      92.03 ±  0%      +5.6%      97.23 ± 11%     +31.3%     120.83 ±  1%     +30.4%     119.98 ±  0%  time.system_time
>       2237 ±  2%      -2.9%       2172 ±  7%     +16.3%       2601 ±  7%      +8.0%       2416 ±  6%  uptime.idle
>      49869 ±  1%     -12.6%      43583 ±  8%     -18.0%      40917 ±  0%     -16.3%      41728 ±  1%  vmstat.system.cs
>      97890 ±  1%      -0.0%      97848 ±  3%      +7.4%     105143 ±  2%      +6.8%     104518 ±  2%  vmstat.system.in
>     105682 ±  1%      +0.6%     106297 ±  1%     -85.2%      15631 ±  4%     -85.1%      15768 ±  1%  meminfo.Active(file)
>     390126 ±  0%      -0.2%     389529 ±  0%     +23.9%     483296 ±  0%     +23.9%     483194 ±  0%  meminfo.Inactive
>     380750 ±  0%      -0.2%     380141 ±  0%     +24.5%     473891 ±  0%     +24.4%     473760 ±  0%  meminfo.Inactive(file)
>       2401 ±107%     +76.9%       4247 ± 79%     -99.8%       5.75 ± 18%     -99.7%       6.75 ± 39%  numa-numastat.node0.other_node
>    2074670 ±  2%     -11.3%    1840052 ± 11%     -21.1%    1637071 ± 12%     -22.5%    1607724 ±  7%  numa-numastat.node1.local_node
>    2081648 ±  2%     -11.4%    1844923 ± 11%     -21.4%    1637081 ± 12%     -22.8%    1607730 ±  7%  numa-numastat.node1.numa_hit
>       6977 ± 36%     -30.2%       4871 ± 66%     -99.8%      10.50 ± 17%     -99.9%       5.50 ± 20%  numa-numastat.node1.other_node
>   13061458 ± 19%      -3.3%   12634644 ± 24%     +33.5%   17435714 ± 47%     +58.3%   20674526 ± 14%  cpuidle.C1-IVT.time
>     193807 ± 15%     +26.8%     245657 ± 76%    +101.8%     391021 ±  8%    +115.5%     417669 ± 20%  cpuidle.C1-IVT.usage
>  8.866e+08 ±  2%     -15.6%  7.479e+08 ±  6%     +25.0%  1.108e+09 ±  5%     +21.0%  1.073e+09 ±  4%  cpuidle.C6-IVT.time
>      93283 ±  0%     -13.2%      80988 ±  3%    +300.6%     373726 ±121%     +20.8%     112719 ±  1%  cpuidle.C6-IVT.usage
>    8559466 ± 20%     -39.3%    5195127 ± 40%     -98.1%     159481 ±173%    -100.0%      97.50 ± 40%  cpuidle.POLL.time
>     771388 ±  9%     -53.4%     359081 ± 52%     -99.9%     959.00 ±167%    -100.0%      40.50 ± 39%  cpuidle.POLL.usage
>      94.35 ±  0%      +1.0%      95.28 ±  0%      -1.6%      92.81 ±  0%      -1.4%      93.00 ±  0%  turbostat.%Busy
>       2824 ±  0%      +1.0%       2851 ±  0%      -1.6%       2777 ±  0%      -1.4%       2784 ±  0%  turbostat.Avg_MHz
>       3.57 ±  3%     -20.9%       2.83 ±  6%     +18.6%       4.24 ±  6%      +9.4%       3.91 ±  4%  turbostat.CPU%c1
>       2.07 ±  3%      -8.8%       1.89 ± 10%     +42.0%       2.95 ± 13%     +48.7%       3.08 ±  4%  turbostat.CPU%c6
>     157.67 ±  0%      -0.7%     156.51 ±  0%      -1.4%     155.47 ±  0%      -1.4%     155.39 ±  0%  turbostat.CorWatt
>       0.17 ± 17%      -2.9%       0.17 ± 23%    +151.4%       0.44 ± 23%     +88.6%       0.33 ± 11%  turbostat.Pkg%pc2
>     192.71 ±  0%      -0.8%     191.15 ±  0%      -1.4%     190.10 ±  0%      -1.3%     190.12 ±  0%  turbostat.PkgWatt
>      22.36 ±  0%      -8.4%      20.49 ±  0%     -10.3%      20.05 ±  0%      -8.1%      20.55 ±  0%  turbostat.RAMWatt
>      53301 ±  2%      +0.3%      53439 ±  5%     -85.3%       7826 ±  4%     -85.2%       7898 ±  1%  numa-meminfo.node0.Active(file)
>     194536 ±  2%      +0.8%     196145 ±  2%     +24.4%     241970 ±  1%     +25.2%     243537 ±  1%  numa-meminfo.node0.Inactive
>     189951 ±  0%      -0.1%     189801 ±  1%     +24.7%     236921 ±  0%     +24.7%     236864 ±  0%  numa-meminfo.node0.Inactive(file)
>      10240 ±  2%      -1.0%      10138 ±  3%     -16.2%       8580 ±  3%     -17.6%       8442 ±  1%  numa-meminfo.node0.KernelStack
>      26406 ±  4%      -8.4%      24183 ±  7%     -10.2%      23723 ±  2%      -4.7%      25152 ±  5%  numa-meminfo.node0.SReclaimable
>      52381 ±  1%      +0.9%      52856 ±  3%     -85.1%       7804 ±  4%     -85.0%       7867 ±  2%  numa-meminfo.node1.Active(file)
>     195602 ±  2%      -1.1%     193393 ±  2%     +23.4%     241343 ±  1%     +22.5%     239683 ±  1%  numa-meminfo.node1.Inactive
>     190797 ±  0%      -0.2%     190340 ±  1%     +24.2%     236969 ±  0%     +24.2%     236897 ±  0%  numa-meminfo.node1.Inactive(file)
>       4188 ±  6%      +2.4%       4289 ±  5%     +42.2%       5955 ±  4%     +45.0%       6073 ±  2%  numa-meminfo.node1.KernelStack
>      22906 ±  4%     +10.5%      25314 ±  6%     +13.4%      25980 ±  2%      +8.5%      24850 ±  5%  numa-meminfo.node1.SReclaimable
>      13324 ±  2%      +0.3%      13359 ±  5%     -85.3%       1956 ±  4%     -85.2%       1974 ±  1%  numa-vmstat.node0.nr_active_file
>     454.25 ±  2%    +773.4%       3967 ±  2%    +794.3%       4062 ±  3%    +194.4%       1337 ±  2%  numa-vmstat.node0.nr_alloc_batch
>      47488 ±  0%      -0.1%      47449 ±  1%     +24.7%      59229 ±  0%     +24.7%      59215 ±  0%  numa-vmstat.node0.nr_inactive_file
>     639.25 ±  2%      -1.0%     633.00 ±  3%     -16.2%     536.00 ±  3%     -17.5%     527.25 ±  1%  numa-vmstat.node0.nr_kernel_stack
>       6600 ±  4%      -8.4%       6045 ±  7%     -10.2%       5930 ±  2%      -4.7%       6287 ±  5%  numa-vmstat.node0.nr_slab_reclaimable
>      69675 ±  3%      +3.0%      71759 ±  4%    -100.0%       2.50 ± 66%    -100.0%       3.75 ± 51%  numa-vmstat.node0.numa_other
>      13094 ±  1%      +0.9%      13213 ±  3%     -85.1%       1950 ±  4%     -85.0%       1966 ±  2%  numa-vmstat.node1.nr_active_file
>     563.00 ±  2%    +642.6%       4181 ±  3%    +631.5%       4118 ±  4%    +162.8%       1479 ±  2%  numa-vmstat.node1.nr_alloc_batch
>      47699 ±  0%      -0.2%      47584 ±  1%     +24.2%      59241 ±  0%     +24.2%      59223 ±  0%  numa-vmstat.node1.nr_inactive_file
>     261.25 ±  6%      +2.4%     267.57 ±  5%     +42.3%     371.75 ±  4%     +45.3%     379.50 ±  2%  numa-vmstat.node1.nr_kernel_stack
>       5726 ±  4%     +10.5%       6328 ±  6%     +13.4%       6495 ±  2%      +8.5%       6212 ±  5%  numa-vmstat.node1.nr_slab_reclaimable
>    1254802 ±  3%      -9.6%    1134298 ± 10%     -19.6%    1008654 ±  9%     -21.0%     990900 ±  5%  numa-vmstat.node1.numa_hit
>    1232554 ±  3%      -9.6%    1113884 ± 10%     -18.2%    1008648 ±  9%     -19.6%     990898 ±  5%  numa-vmstat.node1.numa_local
>      22247 ± 11%      -8.2%      20414 ± 16%    -100.0%       5.75 ± 18%    -100.0%       1.75 ± 24%  numa-vmstat.node1.numa_other
>      26419 ±  1%      +0.6%      26573 ±  1%     -85.2%       3907 ±  4%     -85.1%       3941 ±  1%  proc-vmstat.nr_active_file
>     946.75 ±  3%    +764.9%       8188 ±  2%    +745.5%       8004 ±  1%    +196.1%       2803 ±  1%  proc-vmstat.nr_alloc_batch
>      95188 ±  0%      -0.2%      95035 ±  0%     +24.5%     118472 ±  0%     +24.4%     118440 ±  0%  proc-vmstat.nr_inactive_file
>    3005733 ±  3%      -4.4%    2872963 ±  2%     -14.6%    2566600 ±  6%     -13.9%    2587727 ±  1%  proc-vmstat.numa_hint_faults_local
>    3652636 ±  1%      -4.4%    3492233 ±  2%     -16.5%    3049926 ±  2%     -14.0%    3139498 ±  0%  proc-vmstat.numa_hit
>    3643257 ±  1%      -4.4%    3483323 ±  2%     -16.3%    3049910 ±  2%     -13.8%    3139486 ±  0%  proc-vmstat.numa_local
>       9379 ±  0%      -5.0%       8909 ± 12%     -99.8%      16.25 ±  7%     -99.9%      12.25 ± 27%  proc-vmstat.numa_other
>    4924994 ±  3%      +0.9%    4966927 ±  9%     +38.2%    6804572 ±  5%     +38.7%    6831202 ±  4%  proc-vmstat.numa_pages_migrated
>       8510 ±  0%      +1.5%       8638 ±  1%     -27.1%       6204 ± 31%     -11.2%       7554 ±  1%  proc-vmstat.pgactivate
>    2403080 ±  2%     -58.7%     993450 ±  2%     -57.0%    1033978 ±  4%     -39.3%    1457730 ±  3%  proc-vmstat.pgalloc_dma32
>   15038432 ±  0%      +8.1%   16250009 ±  3%     +16.9%   17583879 ±  2%     +14.9%   17277548 ±  1%  proc-vmstat.pgalloc_normal
>      32128 ± 22%     +41.4%      45421 ± 21%    +391.6%     157952 ±  9%    +333.9%     139392 ± 11%  proc-vmstat.pgmigrate_fail
>    4924994 ±  3%      +0.9%    4966927 ±  9%     +38.2%    6804572 ±  5%     +38.7%    6831202 ±  4%  proc-vmstat.pgmigrate_success
>      25886 ±  2%      -1.2%      25585 ±  4%     +12.0%      28981 ±  2%     +12.5%      29132 ±  2%  proc-vmstat.thp_deferred_split_page
>     632.75 ±  3%      -1.6%     622.43 ±  3%     -18.1%     518.50 ±  2%     -19.4%     510.00 ±  0%  slabinfo.RAW.active_objs
>     632.75 ±  3%      -1.6%     622.43 ±  3%     -18.1%     518.50 ±  2%     -19.4%     510.00 ±  0%  slabinfo.RAW.num_objs
>       1512 ±  1%      -0.6%       1502 ±  1%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  slabinfo.UNIX.active_objs
>       1512 ±  1%      -0.6%       1502 ±  1%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  slabinfo.UNIX.num_objs
>     766.50 ± 10%      +7.5%     823.86 ± 10%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  slabinfo.avc_xperms_node.active_objs
>     766.50 ± 10%      +7.5%     823.86 ± 10%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  slabinfo.avc_xperms_node.num_objs
>     507.00 ±  9%     +16.1%     588.57 ± 10%     +21.1%     614.00 ±  4%      +3.1%     522.75 ± 10%  slabinfo.file_lock_cache.active_objs
>     507.00 ±  9%     +16.1%     588.57 ± 10%     +21.1%     614.00 ±  4%      +3.1%     522.75 ± 10%  slabinfo.file_lock_cache.num_objs
>      13334 ±  4%      +6.9%      14255 ±  4%     +12.1%      14952 ±  4%     +11.8%      14907 ± 11%  slabinfo.kmalloc-512.num_objs
>     357.00 ±  2%      +0.7%     359.43 ±  1%     +35.2%     482.75 ±  0%     +33.9%     478.00 ±  0%  slabinfo.kmalloc-8192.num_objs
>       8080 ±  3%      +1.9%       8233 ±  4%     +16.8%       9441 ±  5%     +17.2%       9470 ±  1%  slabinfo.kmalloc-96.active_objs
>       8125 ±  3%      +1.9%       8281 ±  4%     +16.8%       9488 ±  5%     +17.1%       9511 ±  1%  slabinfo.kmalloc-96.num_objs
>       1112 ±  4%      -2.5%       1084 ±  5%      +6.6%       1186 ± 11%     +11.9%       1244 ±  2%  slabinfo.task_group.active_objs
>       1112 ±  4%      -2.5%       1084 ±  5%      +6.6%       1186 ± 11%     +11.9%       1244 ±  2%  slabinfo.task_group.num_objs
>      18.81 ±  7%     +12.4%      21.13 ± 28%  +4.5e+06%     837325 ±  3%  +4.5e+06%     846688 ±  0%  sched_debug.cfs_rq:/.load.avg
>      90.42 ± 75%     +88.7%     170.62 ±137%  +1.1e+06%    1028138 ±  0%  +1.3e+06%    1157227 ± 19%  sched_debug.cfs_rq:/.load.max
>      10.83 ± 25%     +13.6%      12.31 ± 12%  +4.6e+06%     500135 ± 62%  +5.8e+06%     625582 ± 11%  sched_debug.cfs_rq:/.load.min
>      12.00 ± 81%     +96.8%      23.63 ±144%  +9.6e+05%     115762 ± 29%    +8e+05%      96269 ± 29%  sched_debug.cfs_rq:/.load.stddev
>      26.71 ± 11%      -8.9%      24.33 ± 10%   +2902.0%     801.76 ±  2%   +2935.3%     810.66 ±  0%  sched_debug.cfs_rq:/.load_avg.avg
>     241.42 ± 34%     -32.4%     163.29 ± 54%    +294.1%     951.38 ±  2%    +299.9%     965.33 ±  3%  sched_debug.cfs_rq:/.load_avg.max
>      14.13 ±  5%      +4.8%      14.81 ±  4%   +3872.3%     561.08 ± 18%   +4326.5%     625.25 ±  5%  sched_debug.cfs_rq:/.load_avg.min
>      37.52 ± 37%     -35.7%      24.15 ± 48%    +103.7%      76.43 ± 19%     +49.9%      56.27 ±  7%  sched_debug.cfs_rq:/.load_avg.stddev
>    6864771 ±  0%      +1.6%    6971358 ±  0%     -97.9%     146805 ±  0%     -97.9%     147296 ±  0%  sched_debug.cfs_rq:/.min_vruntime.avg
>    6984488 ±  0%      +1.2%    7071775 ±  0%     -97.7%     158812 ±  0%     -97.7%     160483 ±  1%  sched_debug.cfs_rq:/.min_vruntime.max
>    6522931 ±  1%      +1.2%    6598038 ±  1%     -97.8%     141019 ±  1%     -97.8%     141943 ±  0%  sched_debug.cfs_rq:/.min_vruntime.min
>      80297 ±  7%      -5.5%      75882 ± 12%     -95.3%       3775 ±  7%     -95.4%       3703 ±  9%  sched_debug.cfs_rq:/.min_vruntime.stddev
>      16.76 ±  1%      +1.4%      16.98 ±  0%   +4570.6%     782.68 ±  2%   +4662.4%     798.07 ±  0%  sched_debug.cfs_rq:/.runnable_load_avg.avg
>      28.88 ±  7%     +10.6%      31.93 ±  7%   +3065.5%     914.04 ±  3%   +3138.1%     935.00 ±  1%  sched_debug.cfs_rq:/.runnable_load_avg.max
>       9.54 ± 28%     +18.5%      11.31 ± 11%   +4105.7%     401.29 ± 62%   +5300.0%     515.25 ± 13%  sched_debug.cfs_rq:/.runnable_load_avg.min
>       2.92 ± 11%      +2.0%       2.98 ± 10%   +3057.4%      92.11 ± 45%   +2153.7%      65.75 ± 13%  sched_debug.cfs_rq:/.runnable_load_avg.stddev
>      83900 ± 25%     -46.8%      44629 ± 65%     -98.9%     894.85 ±155%     -99.8%     201.22 ±111%  sched_debug.cfs_rq:/.spread0.max
>    -377675 ±-21%     +13.7%    -429229 ±-20%     -95.5%     -16912 ± -5%     -95.1%     -18353 ±-11%  sched_debug.cfs_rq:/.spread0.min
>      80284 ±  7%      -5.5%      75895 ± 12%     -95.3%       3778 ±  7%     -95.4%       3707 ±  9%  sched_debug.cfs_rq:/.spread0.stddev
>      81.92 ± 23%     -30.5%      56.96 ± 12%      -6.6%      76.55 ± 29%     -28.3%      58.74 ±  9%  sched_debug.cfs_rq:/.util_avg.stddev
>     249892 ± 16%      -2.1%     244699 ± 33%     +94.1%     485129 ± 13%    +114.3%     535496 ± 22%  sched_debug.cpu.avg_idle.min
>     149745 ±  9%     +13.0%     169186 ±  7%     -28.0%     107794 ± 16%      -8.4%     137183 ± 70%  sched_debug.cpu.avg_idle.stddev
>       2.94 ± 10%     +21.0%       3.56 ± 33%    +107.3%       6.10 ±  7%     +84.9%       5.44 ± 11%  sched_debug.cpu.clock.stddev
>       2.94 ± 10%     +21.0%       3.56 ± 33%    +107.3%       6.10 ±  7%     +84.9%       5.44 ± 11%  sched_debug.cpu.clock_task.stddev
>      17.64 ±  9%      -0.9%      17.48 ±  7%   +4333.8%     781.92 ±  2%   +4425.8%     798.14 ±  0%  sched_debug.cpu.cpu_load[0].avg
>      69.46 ±103%     -20.3%      55.38 ±102%   +1216.0%     914.04 ±  3%   +1246.3%     935.08 ±  1%  sched_debug.cpu.cpu_load[0].max
>      11.08 ± 24%     +25.0%      13.86 ± 12%   +3294.7%     376.25 ± 67%   +4554.1%     515.83 ± 12%  sched_debug.cpu.cpu_load[0].min
>       8.49 ±115%     -28.6%       6.06 ±132%   +1028.8%      95.82 ± 43%    +674.4%      65.73 ± 13%  sched_debug.cpu.cpu_load[0].stddev
>      17.31 ±  5%      -0.1%      17.29 ±  3%   +4472.2%     791.32 ±  2%   +4547.7%     804.39 ±  0%  sched_debug.cpu.cpu_load[1].avg
>      48.17 ± 72%      -8.0%      44.33 ± 60%   +1832.6%     930.88 ±  2%   +1837.6%     933.29 ±  1%  sched_debug.cpu.cpu_load[1].max
>      12.04 ± 16%     +15.5%      13.90 ± 11%   +4315.6%     531.71 ± 19%   +5030.8%     617.83 ±  4%  sched_debug.cpu.cpu_load[1].min
>       5.37 ± 86%     -16.0%       4.51 ± 82%   +1297.0%      75.04 ± 37%    +890.6%      53.21 ± 14%  sched_debug.cpu.cpu_load[1].stddev
>      17.22 ±  3%      -0.2%      17.19 ±  1%   +4482.9%     788.99 ±  2%   +4559.4%     802.18 ±  0%  sched_debug.cpu.cpu_load[2].avg
>      40.29 ± 36%      -4.6%      38.43 ± 32%   +2179.5%     918.46 ±  1%   +2210.0%     930.75 ±  1%  sched_debug.cpu.cpu_load[2].max
>      12.25 ± 16%     +13.1%      13.86 ± 10%   +4163.6%     522.29 ± 21%   +4879.3%     609.96 ±  5%  sched_debug.cpu.cpu_load[2].min
>       4.29 ± 45%     -13.7%       3.70 ± 44%   +1627.3%      74.02 ± 36%   +1125.0%      52.50 ± 13%  sched_debug.cpu.cpu_load[2].stddev
>      17.16 ±  2%      -0.2%      17.13 ±  1%   +4483.1%     786.38 ±  2%   +4563.1%     800.09 ±  0%  sched_debug.cpu.cpu_load[3].avg
>      36.12 ± 14%      -3.7%      34.79 ± 15%   +2413.4%     907.96 ±  2%   +2461.4%     925.29 ±  1%  sched_debug.cpu.cpu_load[3].max
>      12.38 ± 15%     +12.7%      13.95 ±  9%   +3985.2%     505.54 ± 25%   +4706.1%     594.75 ±  4%  sched_debug.cpu.cpu_load[3].min
>       3.72 ± 21%     -14.2%       3.19 ± 22%   +1887.6%      73.94 ± 37%   +1343.2%      53.69 ± 10%  sched_debug.cpu.cpu_load[3].stddev
>      17.12 ±  1%      -0.1%      17.10 ±  1%   +4478.8%     783.84 ±  2%   +4561.4%     797.98 ±  0%  sched_debug.cpu.cpu_load[4].avg
>      33.42 ±  2%      -2.6%      32.55 ±  7%   +2573.3%     893.33 ±  3%   +2633.2%     913.33 ±  1%  sched_debug.cpu.cpu_load[4].max
>      12.88 ± 12%     +10.4%      14.21 ±  7%   +3701.3%     489.42 ± 26%   +4365.7%     574.96 ±  6%  sched_debug.cpu.cpu_load[4].min
>       3.30 ±  5%     -12.2%       2.89 ± 14%   +2147.0%      74.10 ± 38%   +1571.6%      55.12 ±  9%  sched_debug.cpu.cpu_load[4].stddev
>       1722 ± 32%     +14.8%       1977 ± 39%     +50.3%       2588 ± 58%     +49.2%       2570 ± 17%  sched_debug.cpu.curr->pid.min
>      20.57 ±  7%      +5.1%      21.62 ± 27%  +4.1e+06%     835527 ±  2%  +4.1e+06%     847625 ±  0%  sched_debug.cpu.load.avg
>     169.12 ± 41%     +15.5%     195.38 ±117%  +6.1e+05%    1027133 ±  0%  +7.1e+05%    1200175 ± 14%  sched_debug.cpu.load.max
>      10.88 ± 25%     +13.2%      12.31 ± 12%  +4.6e+06%     500134 ± 62%  +5.8e+06%     625582 ± 11%  sched_debug.cpu.load.min
>      23.05 ± 44%     +17.7%      27.14 ±122%  +4.9e+05%     113672 ± 31%  +4.4e+05%     102223 ± 24%  sched_debug.cpu.load.stddev
>       0.00 ±  2%      +2.4%       0.00 ±  1%     +31.6%       0.00 ± 14%     +20.6%       0.00 ± 17%  sched_debug.cpu.next_balance.stddev
>       1623 ±  9%      +4.1%       1689 ±  8%     +74.5%       2831 ±  3%     +73.9%       2823 ±  7%  sched_debug.cpu.nr_load_updates.stddev
>     159639 ±  1%     -11.5%     141259 ±  8%     -17.0%     132534 ±  1%     -15.7%     134652 ±  1%  sched_debug.cpu.nr_switches.avg
>      11.79 ± 15%      +9.0%      12.86 ± 25%    +268.6%      43.46 ± 18%    +273.9%      44.08 ± 11%  sched_debug.cpu.nr_uninterruptible.max
>     -16.00 ±-13%      -5.2%     -15.17 ±-22%    +337.5%     -70.00 ±-10%    +336.7%     -69.88 ±-26%  sched_debug.cpu.nr_uninterruptible.min
>       5.10 ±  9%      -1.0%       5.05 ±  8%    +414.2%      26.25 ± 16%    +399.5%      25.50 ± 10%  sched_debug.cpu.nr_uninterruptible.stddev
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       1.01 ±139%      +Inf%       6.56 ± 22%  perf-profile.cycles-pp.__account_scheduler_latency.enqueue_entity.enqueue_task_fair.activate_task.ttwu_do_activate
>       0.00 ± -1%      +Inf%       5.76 ±172%      +Inf%       5.79 ±122%      +Inf%      16.45 ± 16%  perf-profile.cycles-pp.__do_page_fault.do_page_fault.page_fault
>       0.00 ± -1%      +Inf%       1.58 ±162%      +Inf%       4.57 ±139%      +Inf%       3.30 ± 19%  perf-profile.cycles-pp.__hrtimer_run_queues.hrtimer_interrupt.local_apic_timer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
>       0.00 ± -1%      +Inf%       0.93 ±158%      +Inf%       1.05 ±102%      +Inf%       3.13 ± 16%  perf-profile.cycles-pp.__kernel_text_address.print_context_stack.dump_trace.save_stack_trace_tsk.__account_scheduler_latency
>       0.00 ± -1%      +Inf%       0.40 ±159%      +Inf%       0.57 ±104%      +Inf%       1.18 ±  6%  perf-profile.cycles-pp.__schedule.schedule.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
>       0.00 ± -1%      +Inf%       0.59 ±159%      +Inf%       0.83 ±100%      +Inf%       2.00 ± 23%  perf-profile.cycles-pp.__schedule.schedule.pipe_wait.pipe_write.__vfs_write
>       0.00 ± -1%      +Inf%       6.44 ±159%      +Inf%       8.27 ±108%      +Inf%      19.49 ± 11%  perf-profile.cycles-pp.__vfs_read.vfs_read.sys_read.entry_SYSCALL_64_fastpath
>       0.00 ± -1%      +Inf%       3.43 ±158%      +Inf%       4.53 ±100%      +Inf%      11.07 ± 16%  perf-profile.cycles-pp.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
>       0.00 ± -1%      +Inf%       3.18 ±158%      +Inf%       3.39 ±102%      +Inf%       9.81 ± 22%  perf-profile.cycles-pp.__wake_up_common.__wake_up_sync_key.pipe_read.__vfs_read.vfs_read
>       0.00 ± -1%      +Inf%       3.24 ±158%      +Inf%       3.44 ±102%      +Inf%      10.05 ± 22%  perf-profile.cycles-pp.__wake_up_sync_key.pipe_read.__vfs_read.vfs_read.sys_read
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       1.23 ±141%      +Inf%       8.04 ± 21%  perf-profile.cycles-pp.activate_task.ttwu_do_activate.try_to_wake_up.default_wake_function.autoremove_wake_function
>       0.00 ± -1%      +Inf%       0.23 ±166%      +Inf%       0.34 ±104%      +Inf%       0.81 ± 27%  perf-profile.cycles-pp.anon_pipe_buf_release.__vfs_read.vfs_read.sys_read.entry_SYSCALL_64_fastpath
>       0.02 ±  0%  +10478.6%       2.12 ±162%  +31737.5%       6.37 ±138%  +22075.0%       4.43 ± 18%  perf-profile.cycles-pp.apic_timer_interrupt
>       0.00 ± -1%      +Inf%       3.14 ±158%      +Inf%       3.38 ±102%      +Inf%       9.79 ± 22%  perf-profile.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_sync_key.pipe_read.__vfs_read
>       0.00 ± -1%      +Inf%       0.24 ±159%      +Inf%       0.51 ±100%      +Inf%       1.00 ± 24%  perf-profile.cycles-pp.bit_cursor.fb_flashcursor.process_one_work.worker_thread.kthread
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       2.84 ±153%      +Inf%       0.85 ± 20%  perf-profile.cycles-pp.call_console_drivers.constprop.23.console_unlock.vprintk_emit.vprintk_default.printk
>      27.10 ± 20%     -53.0%      12.73 ±105%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.call_cpuidle
>       0.00 ± -1%      +Inf%       4.85 ±181%      +Inf%      15.07 ±118%      +Inf%      29.07 ±  9%  perf-profile.cycles-pp.call_cpuidle.cpu_startup_entry.start_secondary
>       0.02 ± 19%   +5855.6%       1.34 ±175%   +5722.2%       1.31 ±107%  +14988.9%       3.40 ±  9%  perf-profile.cycles-pp.call_function_interrupt
>       0.00 ± -1%      +Inf%       0.08 ±244%      +Inf%       2.84 ±153%      +Inf%       0.85 ± 20%  perf-profile.cycles-pp.console_unlock.vprintk_emit.vprintk_default.printk.perf_duration_warn
>       0.00 ± -1%      +Inf%       2.23 ±182%      +Inf%       2.46 ±131%      +Inf%       6.55 ± 31%  perf-profile.cycles-pp.copy_page.migrate_misplaced_transhuge_page.do_huge_pmd_numa_page.handle_mm_fault.__do_page_fault
>       0.00 ± -1%      +Inf%       2.42 ±159%      +Inf%       3.04 ±100%      +Inf%       7.95 ± 16%  perf-profile.cycles-pp.copy_page_from_iter.pipe_write.__vfs_write.vfs_write.sys_write
>       0.00 ± -1%      +Inf%       0.35 ±160%      +Inf%       0.49 ±104%      +Inf%       0.98 ± 19%  perf-profile.cycles-pp.copy_page_from_iter_iovec.copy_page_from_iter.pipe_write.__vfs_write.vfs_write
>       0.00 ± -1%      +Inf%       2.25 ±163%      +Inf%       3.46 ±114%      +Inf%       6.78 ± 17%  perf-profile.cycles-pp.copy_page_to_iter.pipe_read.__vfs_read.vfs_read.sys_read
>       0.00 ± -1%      +Inf%       2.04 ±159%      +Inf%       2.55 ±101%      +Inf%       6.82 ± 16%  perf-profile.cycles-pp.copy_user_enhanced_fast_string.copy_page_from_iter.pipe_write.__vfs_write.vfs_write
>       0.00 ± -1%      +Inf%       2.02 ±160%      +Inf%       3.12 ±110%      +Inf%       6.31 ± 15%  perf-profile.cycles-pp.copy_user_enhanced_fast_string.copy_page_to_iter.pipe_read.__vfs_read.vfs_read
>      28.68 ± 21%     -55.2%      12.84 ±105%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.cpu_startup_entry
>       0.00 ± -1%      +Inf%       4.91 ±181%      +Inf%      15.14 ±118%      +Inf%      29.36 ±  8%  perf-profile.cycles-pp.cpu_startup_entry.start_secondary
>      27.10 ± 20%     -53.0%      12.73 ±105%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.cpuidle_enter
>       0.00 ± -1%      +Inf%       4.85 ±181%      +Inf%      15.07 ±118%      +Inf%      29.07 ±  9%  perf-profile.cycles-pp.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
>      26.95 ± 20%     -53.2%      12.62 ±105%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.cpuidle_enter_state
>       0.00 ± -1%      +Inf%       4.79 ±181%      +Inf%      15.03 ±118%      +Inf%      28.79 ±  9%  perf-profile.cycles-pp.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry.start_secondary
>       0.00 ± -1%      +Inf%       0.34 ±158%      +Inf%       0.54 ±103%      +Inf%       1.14 ± 28%  perf-profile.cycles-pp.deactivate_task.__schedule.schedule.pipe_wait.pipe_write
>       0.00 ± -1%      +Inf%       3.12 ±158%      +Inf%       3.37 ±102%      +Inf%       9.75 ± 22%  perf-profile.cycles-pp.default_wake_function.autoremove_wake_function.__wake_up_common.__wake_up_sync_key.pipe_read
>       0.00 ± -1%      +Inf%       0.29 ±159%      +Inf%       0.41 ±100%      +Inf%       0.97 ± 30%  perf-profile.cycles-pp.dequeue_task_fair.deactivate_task.__schedule.schedule.pipe_wait
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       0.16 ±173%      +Inf%       0.88 ± 39%  perf-profile.cycles-pp.do_execveat_common.isra.34.sys_execve.do_syscall_64.return_from_SYSCALL_64.execve
>       0.00 ± -1%      +Inf%       2.92 ±179%      +Inf%       3.08 ±131%      +Inf%       8.56 ± 30%  perf-profile.cycles-pp.do_huge_pmd_numa_page.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
>       0.00 ± -1%      +Inf%       5.78 ±172%      +Inf%       5.81 ±122%      +Inf%      16.48 ± 16%  perf-profile.cycles-pp.do_page_fault.page_fault
>       0.00 ± -1%      +Inf%       0.32 ±165%      +Inf%       0.17 ±173%      +Inf%       0.89 ± 39%  perf-profile.cycles-pp.do_syscall_64.return_from_SYSCALL_64.execve
>       0.00 ± -1%      +Inf%       1.78 ±158%      +Inf%       0.96 ±141%      +Inf%       6.24 ± 22%  perf-profile.cycles-pp.dump_trace.save_stack_trace_tsk.__account_scheduler_latency.enqueue_entity.enqueue_task_fair
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       1.14 ±140%      +Inf%       7.53 ± 22%  perf-profile.cycles-pp.enqueue_entity.enqueue_task_fair.activate_task.ttwu_do_activate.try_to_wake_up
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       1.21 ±141%      +Inf%       7.86 ± 21%  perf-profile.cycles-pp.enqueue_task_fair.activate_task.ttwu_do_activate.try_to_wake_up.default_wake_function
>       0.02 ±  0%  +55407.1%      11.10 ±157%  +71087.5%      14.24 ±103%  +1.7e+05%      34.11 ± 11%  perf-profile.cycles-pp.entry_SYSCALL_64_fastpath
>       0.03 ± 47%   +1120.8%       0.34 ±155%    +509.1%       0.17 ±173%   +3127.3%       0.89 ± 39%  perf-profile.cycles-pp.execve
>       0.00 ± -1%      +Inf%       0.47 ±160%      +Inf%       0.64 ±104%      +Inf%       1.35 ±  4%  perf-profile.cycles-pp.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
>       0.00 ± -1%      +Inf%       0.24 ±159%      +Inf%       0.56 ±101%      +Inf%       1.20 ± 28%  perf-profile.cycles-pp.fb_flashcursor.process_one_work.worker_thread.kthread.ret_from_fork
>       0.00 ± -1%      +Inf%       0.94 ±179%      +Inf%       1.05 ±105%      +Inf%       2.70 ± 11%  perf-profile.cycles-pp.flush_smp_call_function_queue.generic_smp_call_function_single_interrupt.smp_call_function_interrupt.call_function_interrupt
>       0.00 ± -1%      +Inf%       0.39 ±180%      +Inf%       0.39 ±107%      +Inf%       1.25 ± 13%  perf-profile.cycles-pp.flush_tlb_func.flush_smp_call_function_queue.generic_smp_call_function_single_interrupt.smp_call_function_interrupt.call_function_interrupt
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       0.95 ±115%      +Inf%       2.86 ± 21%  perf-profile.cycles-pp.flush_tlb_page.ptep_clear_flush.try_to_unmap_one.rmap_walk_anon.rmap_walk
>       0.00 ± -1%      +Inf%       1.12 ±177%      +Inf%       1.06 ±104%      +Inf%       2.79 ± 11%  perf-profile.cycles-pp.generic_smp_call_function_single_interrupt.smp_call_function_interrupt.call_function_interrupt
>       0.00 ± -1%      +Inf%       5.49 ±172%      +Inf%       5.56 ±123%      +Inf%      15.77 ± 17%  perf-profile.cycles-pp.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       2.08 ±112%      +Inf%       6.05 ± 16%  perf-profile.cycles-pp.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
>       0.00 ± -1%      +Inf%       1.79 ±163%      +Inf%       5.26 ±138%      +Inf%       3.69 ± 19%  perf-profile.cycles-pp.hrtimer_interrupt.local_apic_timer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
>       0.00 ± -1%      +Inf%       0.32 ±159%      +Inf%       0.19 ±173%      +Inf%       0.91 ± 33%  perf-profile.cycles-pp.idle_cpu.select_idle_sibling.select_task_rq_fair.try_to_wake_up.default_wake_function
>      24.41 ± 20%     -50.5%      12.09 ±104%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.intel_idle
>       0.00 ± -1%      +Inf%       3.67 ±234%      +Inf%       4.27 ±165%      +Inf%      28.93 ±  9%  perf-profile.cycles-pp.intel_idle.cpuidle_enter_state.cpuidle_enter.call_cpuidle.cpu_startup_entry
>       0.11 ±100%     -25.2%       0.08 ±244%   +2352.4%       2.57 ±150%    +709.5%       0.85 ± 20%  perf-profile.cycles-pp.irq_work_interrupt
>       0.00 ± -1%      +Inf%       0.08 ±244%      +Inf%       2.57 ±150%      +Inf%       0.85 ± 20%  perf-profile.cycles-pp.irq_work_run.smp_irq_work_interrupt.irq_work_interrupt
>       0.00 ± -1%      +Inf%       0.08 ±244%      +Inf%       2.57 ±150%      +Inf%       0.85 ± 20%  perf-profile.cycles-pp.irq_work_run_list.irq_work_run.smp_irq_work_interrupt.irq_work_interrupt
>       0.00 ± -1%      +Inf%       0.37 ±160%      +Inf%       0.37 ±103%      +Inf%       1.33 ± 13%  perf-profile.cycles-pp.is_module_text_address.__kernel_text_address.print_context_stack.dump_trace.save_stack_trace_tsk
>       0.00 ± -1%      +Inf%       0.44 ±160%      +Inf%       0.85 ±100%      +Inf%       1.86 ± 14%  perf-profile.cycles-pp.kthread.ret_from_fork
>       0.00 ± -1%      +Inf%       1.83 ±163%      +Inf%       5.35 ±138%      +Inf%       3.82 ± 19%  perf-profile.cycles-pp.local_apic_timer_interrupt.smp_apic_timer_interrupt.apic_timer_interrupt
>       0.00 ± -1%      +Inf%       0.24 ±159%      +Inf%       0.47 ±100%      +Inf%       0.99 ± 25%  perf-profile.cycles-pp.memcpy_erms.mga_imageblit.soft_cursor.bit_cursor.fb_flashcursor
>       0.00 ± -1%      +Inf%       0.24 ±159%      +Inf%       0.51 ±100%      +Inf%       1.00 ± 24%  perf-profile.cycles-pp.mga_imageblit.soft_cursor.bit_cursor.fb_flashcursor.process_one_work
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       1.51 ±115%      +Inf%       4.33 ± 21%  perf-profile.cycles-pp.migrate_misplaced_page.handle_pte_fault.handle_mm_fault.__do_page_fault.do_page_fault
>       0.00 ± -1%      +Inf%       2.79 ±183%      +Inf%       3.05 ±132%      +Inf%       8.22 ± 30%  perf-profile.cycles-pp.migrate_misplaced_transhuge_page.do_huge_pmd_numa_page.handle_mm_fault.__do_page_fault.do_page_fault
>       0.00 ± -1%      +Inf%       0.28 ±244%      +Inf%       0.33 ±173%      +Inf%       1.07 ± 26%  perf-profile.cycles-pp.migrate_page_copy.migrate_misplaced_transhuge_page.do_huge_pmd_numa_page.handle_mm_fault.__do_page_fault
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       1.23 ±116%      +Inf%       3.82 ± 20%  perf-profile.cycles-pp.migrate_pages.migrate_misplaced_page.handle_pte_fault.handle_mm_fault.__do_page_fault
>       0.40 ±162%     -77.9%       0.09 ±154%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.mutex_spin_on_owner.isra.4
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       0.94 ±116%      +Inf%       2.82 ± 20%  perf-profile.cycles-pp.native_flush_tlb_others.flush_tlb_page.ptep_clear_flush.try_to_unmap_one.rmap_walk_anon
>       0.01 ±100%   +4914.3%       0.50 ±162%   +7425.0%       0.75 ±126%  +12500.0%       1.26 ± 17%  perf-profile.cycles-pp.native_irq_return_iret
>       0.00 ± -1%      +Inf%       0.44 ±188%      +Inf%       0.37 ±173%      +Inf%       1.20 ± 22%  perf-profile.cycles-pp.native_send_call_func_ipi.smp_call_function_many.native_flush_tlb_others.flush_tlb_page.ptep_clear_flush
>       0.02 ± 19%  +25709.5%       5.81 ±171%  +25733.3%       5.81 ±122%  +73266.7%      16.51 ± 16%  perf-profile.cycles-pp.page_fault
>       0.00 ± -1%      +Inf%       0.08 ±244%      +Inf%       2.57 ±150%      +Inf%       0.85 ± 20%  perf-profile.cycles-pp.perf_duration_warn.irq_work_run_list.irq_work_run.smp_irq_work_interrupt.irq_work_interrupt
>       0.50 ±104%      -6.2%       0.47 ±157%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.pipe_read
>       0.00 ± -1%      +Inf%       6.10 ±159%      +Inf%       7.73 ±109%      +Inf%      18.29 ± 11%  perf-profile.cycles-pp.pipe_read.__vfs_read.vfs_read.sys_read.entry_SYSCALL_64_fastpath
>       0.00 ± -1%      +Inf%       0.80 ±158%      +Inf%       1.03 ±100%      +Inf%       2.40 ± 24%  perf-profile.cycles-pp.pipe_wait.pipe_write.__vfs_write.vfs_write.sys_write
>       0.00 ± -1%      +Inf%       3.41 ±158%      +Inf%       2.56 ±147%      +Inf%      11.80 ± 18%  perf-profile.cycles-pp.pipe_write.__vfs_write.vfs_write.sys_write.entry_SYSCALL_64_fastpath
>       2.49 ± 40%     -79.0%       0.52 ±151%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.poll_idle
>       0.00 ± -1%      +Inf%       1.69 ±158%      +Inf%       0.87 ±139%      +Inf%       5.77 ± 22%  perf-profile.cycles-pp.print_context_stack.dump_trace.save_stack_trace_tsk.__account_scheduler_latency.enqueue_entity
>       0.00 ± -1%      +Inf%       0.08 ±244%      +Inf%       2.57 ±150%      +Inf%       0.85 ± 20%  perf-profile.cycles-pp.printk.perf_duration_warn.irq_work_run_list.irq_work_run.smp_irq_work_interrupt
>       0.00 ± -1%      +Inf%       0.25 ±160%      +Inf%       0.60 ±100%      +Inf%       1.21 ± 27%  perf-profile.cycles-pp.process_one_work.worker_thread.kthread.ret_from_fork
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       0.95 ±115%      +Inf%       2.87 ± 21%  perf-profile.cycles-pp.ptep_clear_flush.try_to_unmap_one.rmap_walk_anon.rmap_walk.try_to_unmap
>       0.02 ±  0%    +807.1%       0.18 ±209%   +1700.0%       0.36 ±102%   +4900.0%       1.00 ± 29%  perf-profile.cycles-pp.read
>       1.11 ± 61%     -97.6%       0.03 ±216%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.rest_init
>       0.02 ± 24%   +2479.6%       0.45 ±153%   +4742.9%       0.85 ±100%  +10542.9%       1.86 ± 14%  perf-profile.cycles-pp.ret_from_fork
>       0.00 ± -1%      +Inf%       0.32 ±165%      +Inf%       0.17 ±173%      +Inf%       0.89 ± 39%  perf-profile.cycles-pp.return_from_SYSCALL_64.execve
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       0.98 ±116%      +Inf%       2.97 ± 23%  perf-profile.cycles-pp.rmap_walk.try_to_unmap.migrate_pages.migrate_misplaced_page.handle_pte_fault
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       0.97 ±115%      +Inf%       2.97 ± 23%  perf-profile.cycles-pp.rmap_walk_anon.rmap_walk.try_to_unmap.migrate_pages.migrate_misplaced_page
>       0.00 ± -1%      +Inf%       1.79 ±158%      +Inf%       0.97 ±139%      +Inf%       6.27 ± 22%  perf-profile.cycles-pp.save_stack_trace_tsk.__account_scheduler_latency.enqueue_entity.enqueue_task_fair.activate_task
>       0.00 ± -1%      +Inf%       0.43 ±159%      +Inf%       0.61 ±103%      +Inf%       1.27 ±  3%  perf-profile.cycles-pp.schedule.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
>       0.00 ± -1%      +Inf%       0.60 ±159%      +Inf%       0.88 ±100%      +Inf%       2.06 ± 22%  perf-profile.cycles-pp.schedule.pipe_wait.pipe_write.__vfs_write.vfs_write
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       2.38 ±136%      +Inf%       1.74 ± 20%  perf-profile.cycles-pp.scheduler_tick.update_process_times.tick_sched_handle.isra.17.tick_sched_timer.__hrtimer_run_queues
>       0.00 ± -1%      +Inf%       0.42 ±159%      +Inf%       0.42 ±113%      +Inf%       1.28 ± 28%  perf-profile.cycles-pp.select_idle_sibling.select_task_rq_fair.try_to_wake_up.default_wake_function.autoremove_wake_function
>       0.00 ± -1%      +Inf%       0.51 ±158%      +Inf%       0.53 ±109%      +Inf%       1.54 ± 27%  perf-profile.cycles-pp.select_task_rq_fair.try_to_wake_up.default_wake_function.autoremove_wake_function.__wake_up_common
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       2.69 ±152%      +Inf%       0.84 ± 19%  perf-profile.cycles-pp.serial8250_console_putchar.uart_console_write.serial8250_console_write.univ8250_console_write.call_console_drivers.constprop.23
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       2.69 ±152%      +Inf%       0.84 ± 19%  perf-profile.cycles-pp.serial8250_console_write.univ8250_console_write.call_console_drivers.constprop.23.console_unlock.vprintk_emit
>       0.00 ± -1%      +Inf%       2.04 ±163%      +Inf%       6.27 ±138%      +Inf%       4.34 ± 18%  perf-profile.cycles-pp.smp_apic_timer_interrupt.apic_timer_interrupt
>       0.00 ± -1%      +Inf%       1.28 ±178%      +Inf%       1.21 ±105%      +Inf%       3.17 ±  8%  perf-profile.cycles-pp.smp_call_function_interrupt.call_function_interrupt
>       0.00 ± -1%      +Inf%       0.93 ±177%      +Inf%       0.94 ±116%      +Inf%       2.80 ± 19%  perf-profile.cycles-pp.smp_call_function_many.native_flush_tlb_others.flush_tlb_page.ptep_clear_flush.try_to_unmap_one
>       0.00 ± -1%      +Inf%       0.08 ±244%      +Inf%       2.57 ±150%      +Inf%       0.85 ± 20%  perf-profile.cycles-pp.smp_irq_work_interrupt.irq_work_interrupt
>       0.00 ± -1%      +Inf%       0.24 ±159%      +Inf%       0.51 ±100%      +Inf%       1.00 ± 24%  perf-profile.cycles-pp.soft_cursor.bit_cursor.fb_flashcursor.process_one_work.worker_thread
>       1.11 ± 61%     -97.6%       0.03 ±216%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.start_kernel
>       0.00 ± -1%      +Inf%       0.32 ±165%      +Inf%       0.17 ±173%      +Inf%       0.89 ± 39%  perf-profile.cycles-pp.sys_execve.do_syscall_64.return_from_SYSCALL_64.execve
>       0.00 ± -1%      +Inf%       6.82 ±159%      +Inf%       8.52 ±108%      +Inf%      20.43 ± 11%  perf-profile.cycles-pp.sys_read.entry_SYSCALL_64_fastpath
>       0.00 ± -1%      +Inf%       3.45 ±158%      +Inf%       4.57 ±100%      +Inf%      11.18 ± 16%  perf-profile.cycles-pp.sys_write.entry_SYSCALL_64_fastpath
>       0.00 ± -1%      +Inf%       0.49 ±159%      +Inf%       0.64 ±103%      +Inf%       1.44 ±  5%  perf-profile.cycles-pp.syscall_return_slowpath.entry_SYSCALL_64_fastpath
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       1.70 ±141%      +Inf%       1.16 ± 25%  perf-profile.cycles-pp.task_tick_fair.scheduler_tick.update_process_times.tick_sched_handle.isra.17.tick_sched_timer
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       3.39 ±134%      +Inf%       2.78 ± 22%  perf-profile.cycles-pp.tick_sched_handle.isra.17.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.local_apic_timer_interrupt
>       0.00 ± -1%      +Inf%       1.35 ±162%      +Inf%       3.59 ±135%      +Inf%       2.89 ± 22%  perf-profile.cycles-pp.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt.local_apic_timer_interrupt.smp_apic_timer_interrupt
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       0.98 ±116%      +Inf%       3.00 ± 23%  perf-profile.cycles-pp.try_to_unmap.migrate_pages.migrate_misplaced_page.handle_pte_fault.handle_mm_fault
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       0.95 ±115%      +Inf%       2.94 ± 23%  perf-profile.cycles-pp.try_to_unmap_one.rmap_walk_anon.rmap_walk.try_to_unmap.migrate_pages
>       0.00 ± -1%      +Inf%       3.11 ±158%      +Inf%       1.66 ±143%      +Inf%      10.32 ± 21%  perf-profile.cycles-pp.try_to_wake_up.default_wake_function.autoremove_wake_function.__wake_up_common.__wake_up_sync_key
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       1.33 ±143%      +Inf%       8.29 ± 21%  perf-profile.cycles-pp.ttwu_do_activate.try_to_wake_up.default_wake_function.autoremove_wake_function.__wake_up_common
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       2.69 ±152%      +Inf%       0.84 ± 19%  perf-profile.cycles-pp.uart_console_write.serial8250_console_write.univ8250_console_write.call_console_drivers.constprop.23.console_unlock
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       2.69 ±152%      +Inf%       0.84 ± 19%  perf-profile.cycles-pp.univ8250_console_write.call_console_drivers.constprop.23.console_unlock.vprintk_emit.vprintk_default
>       0.00 ± -1%      +NaN%       0.00 ± -1%      +Inf%       3.30 ±133%      +Inf%       2.70 ± 22%  perf-profile.cycles-pp.update_process_times.tick_sched_handle.isra.17.tick_sched_timer.__hrtimer_run_queues.hrtimer_interrupt
>       0.00 ± -1%      +Inf%       6.74 ±159%      +Inf%       8.48 ±108%      +Inf%      20.19 ± 11%  perf-profile.cycles-pp.vfs_read.sys_read.entry_SYSCALL_64_fastpath
>       0.00 ± -1%      +Inf%       3.45 ±158%      +Inf%       4.55 ±100%      +Inf%      11.15 ± 16%  perf-profile.cycles-pp.vfs_write.sys_write.entry_SYSCALL_64_fastpath
>       0.00 ± -1%      +Inf%       0.08 ±244%      +Inf%       2.84 ±153%      +Inf%       0.85 ± 20%  perf-profile.cycles-pp.vprintk_default.printk.perf_duration_warn.irq_work_run_list.irq_work_run
>       0.00 ± -1%      +Inf%       0.08 ±244%      +Inf%       2.84 ±153%      +Inf%       0.85 ± 20%  perf-profile.cycles-pp.vprintk_emit.vprintk_default.printk.perf_duration_warn.irq_work_run_list
>       0.00 ± -1%      +Inf%       0.07 ±244%      +Inf%       2.59 ±152%      +Inf%       0.81 ± 19%  perf-profile.cycles-pp.wait_for_xmitr.serial8250_console_putchar.uart_console_write.serial8250_console_write.univ8250_console_write
>       0.00 ± -1%      +Inf%       0.25 ±160%      +Inf%       0.61 ±100%      +Inf%       1.23 ± 26%  perf-profile.cycles-pp.worker_thread.kthread.ret_from_fork
>       1.11 ± 61%     -85.6%       0.16 ±199%     -87.8%       0.14 ±173%     -85.8%       0.16 ±173%  perf-profile.cycles-pp.x86_64_start_kernel
>       1.11 ± 61%     -97.6%       0.03 ±216%    -100.0%       0.00 ± -1%    -100.0%       0.00 ± -1%  perf-profile.cycles-pp.x86_64_start_reservations

The main increases that stick out to me are in the read() from the
pipe (both in the copy itself and in the wakeups to the writer), and
in NUMA balancing activity (page faults and migrations).

If NUMA balancing doesn't settle, the NUMA page faults can reduce the
throughput of the single ruby script that writes input data to the
pixz pipe. The writer then no longer saturates the 48 compression
threads, which would explain the increased incidence of hitting an
empty pipe and issuing a wakeup.

But why would NUMA balancing be impacted by this patch? The only
place where NUMA balancing uses the watermarks directly is to decide
whether it can migrate toward a specific node (and indirectly during
allocation of a huge page). But your NUMA nodes shouldn't be anywhere
near full: when I run pixz with 48 threads, it consumes ~600MB of
memory, and your nodes have 33G each. Surely it should always find
plenty of free memory, even after the patch raised the watermarks?
The stats don't indicate a difference in THP success rate, either.
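
For scale, here is a back-of-envelope sketch of that headroom. The 1%
reserve below is my own assumption of a generously sized high
watermark, not a figure taken from the kernel:

```shell
# Rough figures quoted above: 33G per node, ~600MB pixz working set.
# Assume (generously) the raised high watermark reserves ~1% of the node.
node_kb=$((33 * 1024 * 1024))        # 33G node, in KB
workload_kb=$((600 * 1024))          # ~600MB pixz working set
reserve_kb=$((node_kb / 100))        # hypothetical high watermark, 1% of node
free_kb=$((node_kb - workload_kb))
echo "free=${free_kb}kB reserve=${reserve_kb}kB headroom=$((free_kb - reserve_kb))kB"
```

Even with a reserve that large, tens of gigabytes of headroom remain,
so the migration watermark check should never be the limiting factor
on a machine this size.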

To be sure, this is a minimal test system with nothing else running,
right?

Could you please collect periodic snapshots of /proc/zoneinfo while
the pixz test is running? Something like this would be great:

while sleep 1; do cat /proc/zoneinfo >> zoneinfo.log; done

(both on last good and first bad)
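
Once collected, a quick pass over the snapshots can show how close the
free pages come to the watermarks. A minimal sketch; the sample data
below mimics the /proc/zoneinfo field layout (an assumption based on
v4.7-era kernels), so the same awk program should work on the real log:

```shell
#!/bin/sh
# Summarize free pages vs. the high watermark per zone from a zoneinfo
# snapshot.  The sample file stands in for one snapshot out of zoneinfo.log.
cat > zoneinfo.sample <<'EOF'
Node 0, zone   Normal
  pages free     8238437
        min      11185
        low      13981
        high     16777
Node 1, zone   Normal
  pages free     8300000
        min      11185
        low      13981
        high     16777
EOF
awk '
/^Node/      { gsub(",", "", $2); node = $2; zone = $4 }
/pages free/ { free = $3 }
$1 == "high" { printf "node %s zone %s: free=%s high=%s margin=%d\n",
                      node, zone, free, $2, free - $2 }
' zoneinfo.sample
```

A margin that shrinks toward zero over the run would point at the
watermarks; a large, stable margin would rule them out.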

Thanks
