lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZdLucnSQfjU/dFRi@xsang-OptiPlex-9020>
Date: Mon, 19 Feb 2024 14:00:18 +0800
From: Oliver Sang <oliver.sang@...el.com>
To: Chuck Lever <chuck.lever@...cle.com>
CC: Jan Kara <jack@...e.cz>, Chuck Lever <cel@...nel.org>,
	<viro@...iv.linux.org.uk>, <brauner@...nel.org>, <hughd@...gle.com>,
	<akpm@...ux-foundation.org>, <Liam.Howlett@...cle.com>,
	<feng.tang@...el.com>, <linux-kernel@...r.kernel.org>,
	<linux-fsdevel@...r.kernel.org>, <maple-tree@...ts.infradead.org>,
	<linux-mm@...ck.org>, <lkp@...el.com>, <oliver.sang@...el.com>
Subject: Re: [PATCH RFC 6/7] libfs: Convert simple directory offsets to use a
 Maple Tree

hi, Chuck Lever,

On Sun, Feb 18, 2024 at 10:57:07AM -0500, Chuck Lever wrote:
> On Sun, Feb 18, 2024 at 10:02:37AM +0800, Oliver Sang wrote:
> > hi, Chuck Lever,
> > 
> > On Thu, Feb 15, 2024 at 08:45:33AM -0500, Chuck Lever wrote:
> > > On Thu, Feb 15, 2024 at 02:06:01PM +0100, Jan Kara wrote:
> > > > On Tue 13-02-24 16:38:01, Chuck Lever wrote:
> > > > > From: Chuck Lever <chuck.lever@...cle.com>
> > > > > 
> > > > > Test robot reports:
> > > > > > kernel test robot noticed a -19.0% regression of aim9.disk_src.ops_per_sec on:
> > > > > >
> > > > > > commit: a2e459555c5f9da3e619b7e47a63f98574dc75f1 ("shmem: stable directory offsets")
> > > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > > > 
> > > > > Feng Tang further clarifies that:
> > > > > > ... the new simple_offset_add()
> > > > > > called by shmem_mknod() brings extra cost related with slab,
> > > > > > specifically the 'radix_tree_node', which cause the regression.
> > > > > 
> > > > > Willy's analysis is that, over time, the test workload causes
> > > > > xa_alloc_cyclic() to fragment the underlying SLAB cache.
> > > > > 
> > > > > This patch replaces the offset_ctx's xarray with a Maple Tree in the
> > > > > hope that Maple Tree's dense node mode will handle this scenario
> > > > > more scalably.
> > > > > 
> > > > > In addition, we can widen the directory offset to an unsigned long
> > > > > everywhere.
> > > > > 
> > > > > Suggested-by: Matthew Wilcox <willy@...radead.org>
> > > > > Reported-by: kernel test robot <oliver.sang@...el.com>
> > > > > Closes: https://lore.kernel.org/oe-lkp/202309081306.3ecb3734-oliver.sang@intel.com
> > > > > Signed-off-by: Chuck Lever <chuck.lever@...cle.com>
> > > > 
> > > > OK, but this will need the performance numbers.
> > > 
> > > Yes, I totally concur. The point of this posting was to get some
> > > early review and start the ball rolling.
> > > 
> > > Actually we expect roughly the same performance numbers now. "Dense
> > > node" support in Maple Tree is supposed to be the real win, but
> > > I'm not sure it's ready yet.
> > > 
> > > 
> > > > Otherwise we have no idea
> > > > whether this is worth it or not. Maybe you can ask Oliver Sang? Usually
> > > > 0-day guys are quite helpful.
> > > 
> > > Oliver and Feng were copied on this series.
> > 
> > we are in holidays last week, now we are back.
> > 
> > I noticed there is v2 for this patch set
> > https://lore.kernel.org/all/170820145616.6328.12620992971699079156.stgit@91.116.238.104.host.secureserver.net/
> > 
> > and you also put it in a branch:
> > https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git
> > "simple-offset-maple" branch.
> > 
> > we will test aim9 performance based on this branch. Thanks
> 
> Very much appreciated!

always our pleasure!

we've already sent out a report [1] to you for the commit a616bc6667 in new
branch.
we saw 11.8% improvement of aim9.disk_src.ops_per_sec on it comparing to its
parent.

so the regression we saw on a2e459555c is 'half' recovered.

since I noticed the performance for a2e459555c, v6.8-rc4 and f3f24869a1 (parent
of a616bc6667) are very similar, so ignored results from v6.8-rc4 and f3f24869a1
in below tables for brief. if you want a full table, please let me know. Thanks!


summary for aim9.disk_src.ops_per_sec:

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-ivb-2ep1/disk_src/aim9/300s

commit:
  23a31d8764 ("shmem: Refactor shmem_symlink()")
  a2e459555c ("shmem: stable directory offsets")
  a616bc6667 ("libfs: Convert simple directory offsets to use a Maple Tree")


23a31d87645c6527 a2e459555c5f9da3e619b7e47a6 a616bc666748063733c62e15ea4
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
    202424           -19.0%     163868            -9.3%     183678        aim9.disk_src.ops_per_sec


full data:

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-ivb-2ep1/disk_src/aim9/300s

commit:
  23a31d8764 ("shmem: Refactor shmem_symlink()")
  a2e459555c ("shmem: stable directory offsets")
  a616bc6667 ("libfs: Convert simple directory offsets to use a Maple Tree")


23a31d87645c6527 a2e459555c5f9da3e619b7e47a6 a616bc666748063733c62e15ea4
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
      1404            +0.6%       1412            +3.3%       1450        boot-time.idle
      0.26 ±  9%      +0.1        0.36 ±  2%      -0.1        0.20 ±  4%  mpstat.cpu.all.soft%
      0.61            -0.1        0.52            -0.0        0.58        mpstat.cpu.all.usr%
      1.00            +0.0%       1.00           +19.5%       1.19 ±  2%  vmstat.procs.r
      1525 ±  6%      -5.6%       1440            +9.4%       1668 ±  4%  vmstat.system.cs
     52670            +5.0%      55323 ±  3%     +12.7%      59381 ±  3%  turbostat.C1
      0.02            +0.0        0.02            +0.0        0.03 ± 10%  turbostat.C1%
     12949 ±  9%      -1.2%      12795 ±  6%     -19.6%      10412 ±  6%  turbostat.POLL
    115468 ± 24%     +11.9%     129258 ±  7%     +16.5%     134545 ±  5%  numa-meminfo.node0.AnonPages.max
      2015 ± 12%      -7.5%       1864 ±  5%     +30.3%       2624 ± 10%  numa-meminfo.node0.PageTables
      4795 ± 30%      +1.1%       4846 ± 37%    +117.0%      10405 ± 21%  numa-meminfo.node0.Shmem
      6442 ±  5%      -0.6%       6401 ±  4%     -19.6%       5180 ±  7%  numa-meminfo.node1.KernelStack
     13731 ± 92%     -88.7%       1546 ±  6%    +161.6%      35915 ± 23%  time.involuntary_context_switches
     94.83            -4.2%      90.83            +1.2%      96.00        time.percent_of_cpu_this_job_got
    211.64            +0.5%     212.70            +4.0%     220.06        time.system_time
     73.62           -17.6%      60.69            -6.4%      68.94        time.user_time
    202424           -19.0%     163868            -9.3%     183678        aim9.disk_src.ops_per_sec
     13731 ± 92%     -88.7%       1546 ±  6%    +161.6%      35915 ± 23%  aim9.time.involuntary_context_switches
     94.83            -4.2%      90.83            +1.2%      96.00        aim9.time.percent_of_cpu_this_job_got
    211.64            +0.5%     212.70            +4.0%     220.06        aim9.time.system_time
     73.62           -17.6%      60.69            -6.4%      68.94        aim9.time.user_time
    174558 ±  4%      -1.0%     172852 ±  7%     -19.7%     140084 ±  7%  meminfo.DirectMap4k
     94166            +6.6%     100388           -14.4%      80579        meminfo.KReclaimable
     12941            +0.4%      12989           -14.4%      11078        meminfo.KernelStack
      3769            +0.2%       3775           +31.5%       4955        meminfo.PageTables
     79298            +0.0%      79298           -70.6%      23298        meminfo.Percpu
     94166            +6.6%     100388           -14.4%      80579        meminfo.SReclaimable
    204209            +2.9%     210111            -9.6%     184661        meminfo.Slab
    503.33 ± 12%      -7.5%     465.67 ±  5%     +30.4%     656.39 ± 10%  numa-vmstat.node0.nr_page_table_pages
      1198 ± 30%      +1.1%       1211 ± 37%    +117.0%       2601 ± 21%  numa-vmstat.node0.nr_shmem
    220.33            +0.2%     220.83 ±  2%     -97.3%       6.00 ±100%  numa-vmstat.node1.nr_active_file
    167.00            +0.2%     167.33 ±  2%     -87.7%      20.48 ±100%  numa-vmstat.node1.nr_inactive_file
      6443 ±  5%      -0.6%       6405 ±  4%     -19.5%       5184 ±  7%  numa-vmstat.node1.nr_kernel_stack
    220.33            +0.2%     220.83 ±  2%     -97.3%       6.00 ±100%  numa-vmstat.node1.nr_zone_active_file
    167.00            +0.2%     167.33 ±  2%     -87.7%      20.48 ±100%  numa-vmstat.node1.nr_zone_inactive_file
      0.04 ± 25%      +4.1%       0.05 ± 33%     -40.1%       0.03 ± 34%  perf-sched.sch_delay.avg.ms.syslog_print.do_syslog.kmsg_read.vfs_read
      0.01 ±  4%      +1.6%       0.01 ±  8%    +136.9%       0.02 ± 29%  perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.16 ± 10%     -18.9%       0.13 ± 12%     -16.4%       0.13 ± 15%  perf-sched.sch_delay.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
      0.14 ± 15%      -6.0%       0.14 ± 20%     -30.4%       0.10 ± 26%  perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      0.04 ±  7%   +1802.4%       0.78 ±115%   +7258.6%       3.00 ± 56%  perf-sched.sch_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.05 ± 40%     -14.5%       0.04 ± 31%   +6155.4%       3.09        perf-sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.01 ±  8%     +25.5%       0.01 ± 21%    +117.0%       0.02 ± 30%  perf-sched.total_sch_delay.average.ms
      0.16 ± 11%   +1204.9%       2.12 ± 90%   +2225.8%       3.77 ± 10%  perf-sched.total_sch_delay.max.ms
     58.50 ± 28%      -4.8%      55.67 ± 14%     -23.9%      44.50 ±  4%  perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.usleep_range_state.ipmi_thread.kthread
    277.83 ±  3%     +10.3%     306.50 ±  5%     +17.9%     327.62 ±  4%  perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.10 ± 75%     -47.1%       0.05 ± 86%    -100.0%       0.00        perf-sched.wait_time.avg.ms.__cond_resched.dput.path_put.vfs_statx.vfs_fstatat
      0.10 ± 72%     -10.8%       0.09 ± 67%    -100.0%       0.00        perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
      2.00 ± 27%      -4.8%       1.90 ± 14%     -23.7%       1.53 ±  4%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.do_select.core_sys_select.kern_select
      0.16 ± 81%     +20.1%       0.19 ±104%    -100.0%       0.00        perf-sched.wait_time.max.ms.__cond_resched.dput.path_put.vfs_statx.vfs_fstatat
      0.22 ± 77%     +14.2%       0.25 ± 54%    -100.0%       0.00        perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
      7.32 ± 43%     -10.3%       6.57 ± 20%     -33.2%       4.89 ±  6%  perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.do_select.core_sys_select.kern_select
    220.33            +0.2%     220.83 ±  2%     -94.6%      12.00        proc-vmstat.nr_active_file
    676766            -0.0%     676706            +4.1%     704377        proc-vmstat.nr_file_pages
    167.00            +0.2%     167.33 ±  2%     -75.5%      40.98        proc-vmstat.nr_inactive_file
     12941            +0.4%      12988           -14.4%      11083        proc-vmstat.nr_kernel_stack
    942.50            +0.1%     943.50           +31.4%       1238        proc-vmstat.nr_page_table_pages
      7915 ±  3%      -0.8%       7855            +8.6%       8599        proc-vmstat.nr_shmem
     23541            +6.5%      25074           -14.4%      20144        proc-vmstat.nr_slab_reclaimable
     27499            -0.3%      27422            -5.4%      26016        proc-vmstat.nr_slab_unreclaimable
    668461            +0.0%     668461            +4.2%     696493        proc-vmstat.nr_unevictable
    220.33            +0.2%     220.83 ±  2%     -94.6%      12.00        proc-vmstat.nr_zone_active_file
    167.00            +0.2%     167.33 ±  2%     -75.5%      40.98        proc-vmstat.nr_zone_inactive_file
    668461            +0.0%     668461            +4.2%     696493        proc-vmstat.nr_zone_unevictable
    722.17 ± 71%     -34.7%     471.67 ±136%     -98.3%      12.62 ±129%  proc-vmstat.numa_hint_faults_local
   1437319 ± 24%    +377.6%    6864201           -47.8%     750673 ±  7%  proc-vmstat.numa_hit
   1387016 ± 25%    +391.4%    6815486           -49.5%     700947 ±  7%  proc-vmstat.numa_local
     50329            -0.0%      50324            -1.2%      49704        proc-vmstat.numa_other
   4864362 ± 34%    +453.6%   26931180           -66.1%    1648373 ± 17%  proc-vmstat.pgalloc_normal
   4835960 ± 34%    +455.4%   26856610           -66.3%    1628178 ± 18%  proc-vmstat.pgfree
     11.21           +23.7%      13.87           -81.8%       2.04        perf-stat.i.MPKI
 7.223e+08            -4.4%  6.907e+08            -3.9%   6.94e+08        perf-stat.i.branch-instructions
      2.67            +0.2        2.88            +0.0        2.70        perf-stat.i.branch-miss-rate%
  19988363            +2.8%   20539702            -3.1%   19363031        perf-stat.i.branch-misses
     17.36            -2.8       14.59            +0.4       17.77        perf-stat.i.cache-miss-rate%
  40733859           +19.5%   48659982            -1.9%   39962840        perf-stat.i.cache-references
      1482 ±  7%      -5.7%       1398            +9.6%       1623 ±  5%  perf-stat.i.context-switches
      1.76            +3.5%       1.82            +5.1%       1.85        perf-stat.i.cpi
     55.21            +5.4%      58.21 ±  2%      +0.4%      55.45        perf-stat.i.cpu-migrations
  16524721 ±  8%      -4.8%   15726404 ±  4%     -13.1%   14367627 ±  4%  perf-stat.i.dTLB-load-misses
  1.01e+09            -3.8%  9.719e+08            -4.4%  9.659e+08        perf-stat.i.dTLB-loads
      0.26 ±  4%      -0.0        0.23 ±  3%      -0.0        0.25 ±  3%  perf-stat.i.dTLB-store-miss-rate%
   2166022 ±  4%      -6.9%    2015917 ±  3%      -7.0%    2014037 ±  3%  perf-stat.i.dTLB-store-misses
 8.503e+08            +5.5%  8.968e+08            -3.5%  8.205e+08        perf-stat.i.dTLB-stores
     69.22 ±  4%      +6.4       75.60           +14.4       83.60 ±  3%  perf-stat.i.iTLB-load-miss-rate%
    709457 ±  5%      -5.4%     670950 ±  2%    +133.6%    1657233        perf-stat.i.iTLB-load-misses
    316455 ± 12%     -31.6%     216531 ±  3%      +3.5%     327592 ± 19%  perf-stat.i.iTLB-loads
 3.722e+09            -3.1%  3.608e+09            -4.6%  3.553e+09        perf-stat.i.instructions
      5243 ±  5%      +2.2%       5357 ±  2%     -59.1%       2142        perf-stat.i.instructions-per-iTLB-miss
      0.57            -3.3%       0.55            -4.8%       0.54        perf-stat.i.ipc
    865.04           -10.4%     775.02 ±  3%      -2.5%     843.12        perf-stat.i.metric.K/sec
     53.84            -0.4%      53.61            -4.0%      51.71        perf-stat.i.metric.M/sec
     47.51            -2.1       45.37            +0.7       48.17        perf-stat.i.node-load-miss-rate%
     88195 ±  3%      +5.2%      92745 ±  4%     +16.4%     102647 ±  6%  perf-stat.i.node-load-misses
    106705 ±  3%     +14.8%     122490 ±  5%     +13.2%     120774 ±  5%  perf-stat.i.node-loads
    107169 ±  4%     +29.0%     138208 ±  7%      +7.5%     115217 ±  5%  perf-stat.i.node-stores
     10.94           +23.3%      13.49           -81.8%       1.99        perf-stat.overall.MPKI
      2.77            +0.2        2.97            +0.0        2.79        perf-stat.overall.branch-miss-rate%
     17.28            -2.7       14.56            +0.4       17.67        perf-stat.overall.cache-miss-rate%
      1.73            +3.4%       1.79            +5.0%       1.82        perf-stat.overall.cpi
      0.25 ±  4%      -0.0        0.22 ±  3%      -0.0        0.24 ±  3%  perf-stat.overall.dTLB-store-miss-rate%
     69.20 ±  4%      +6.4       75.60           +14.4       83.58 ±  3%  perf-stat.overall.iTLB-load-miss-rate%
      5260 ±  5%      +2.3%       5380 ±  2%     -59.2%       2144        perf-stat.overall.instructions-per-iTLB-miss
      0.58            -3.2%       0.56            -4.7%       0.55        perf-stat.overall.ipc
     45.25            -2.2       43.10            +0.7       45.93        perf-stat.overall.node-load-miss-rate%
 7.199e+08            -4.4%  6.883e+08            -3.9%  6.917e+08        perf-stat.ps.branch-instructions
  19919808            +2.8%   20469001            -3.1%   19299968        perf-stat.ps.branch-misses
  40597326           +19.5%   48497201            -1.9%   39829580        perf-stat.ps.cache-references
      1477 ±  7%      -5.7%       1393            +9.6%       1618 ±  5%  perf-stat.ps.context-switches
     55.06            +5.4%      58.03 ±  2%      +0.5%      55.32        perf-stat.ps.cpu-migrations
  16469488 ±  8%      -4.8%   15673772 ±  4%     -13.1%   14319828 ±  4%  perf-stat.ps.dTLB-load-misses
 1.007e+09            -3.8%  9.686e+08            -4.4%  9.627e+08        perf-stat.ps.dTLB-loads
   2158768 ±  4%      -6.9%    2009174 ±  3%      -7.0%    2007326 ±  3%  perf-stat.ps.dTLB-store-misses
 8.475e+08            +5.5%  8.937e+08            -3.5%  8.178e+08        perf-stat.ps.dTLB-stores
    707081 ±  5%      -5.4%     668703 ±  2%    +133.6%    1651678        perf-stat.ps.iTLB-load-misses
    315394 ± 12%     -31.6%     215816 ±  3%      +3.5%     326463 ± 19%  perf-stat.ps.iTLB-loads
  3.71e+09            -3.1%  3.595e+09            -4.5%  3.541e+09        perf-stat.ps.instructions
     87895 ±  3%      +5.2%      92424 ±  4%     +16.4%     102341 ±  6%  perf-stat.ps.node-load-misses
    106351 ±  3%     +14.8%     122083 ±  5%     +13.2%     120439 ±  5%  perf-stat.ps.node-loads
    106728 ±  4%     +29.1%     137740 ±  7%      +7.6%     114824 ±  5%  perf-stat.ps.node-stores
 1.117e+12            -3.0%  1.084e+12            -4.5%  1.067e+12        perf-stat.total.instructions
     10.55 ± 95%    -100.0%       0.00          -100.0%       0.00        sched_debug.cfs_rq:/.MIN_vruntime.avg
    506.30 ± 95%    -100.0%       0.00          -100.0%       0.00        sched_debug.cfs_rq:/.MIN_vruntime.max
      0.00            +0.0%       0.00          -100.0%       0.00        sched_debug.cfs_rq:/.MIN_vruntime.min
      1.08 ±  7%      -7.7%       1.00           -46.2%       0.58 ± 27%  sched_debug.cfs_rq:/.h_nr_running.max
    538959 ± 24%     -23.2%     414090           -58.3%     224671 ± 31%  sched_debug.cfs_rq:/.load.max
    130191 ± 14%     -13.3%     112846 ±  6%     -56.6%      56505 ± 39%  sched_debug.cfs_rq:/.load.stddev
     10.56 ± 95%    -100.0%       0.00          -100.0%       0.00        sched_debug.cfs_rq:/.max_vruntime.avg
    506.64 ± 95%    -100.0%       0.00          -100.0%       0.00        sched_debug.cfs_rq:/.max_vruntime.max
      0.00            +0.0%       0.00          -100.0%       0.00        sched_debug.cfs_rq:/.max_vruntime.min
    116849 ± 27%     -51.2%      56995 ± 20%     -66.7%      38966 ± 39%  sched_debug.cfs_rq:/.min_vruntime.max
      1248 ± 14%      +9.8%       1370 ± 15%     -29.8%     876.86 ± 16%  sched_debug.cfs_rq:/.min_vruntime.min
     20484 ± 22%     -37.0%      12910 ± 15%     -60.7%       8059 ± 32%  sched_debug.cfs_rq:/.min_vruntime.stddev
      1.08 ±  7%      -7.7%       1.00           -46.2%       0.58 ± 27%  sched_debug.cfs_rq:/.nr_running.max
      1223 ±191%    -897.4%      -9754          -100.0%       0.00        sched_debug.cfs_rq:/.spread0.avg
    107969 ± 29%     -65.3%      37448 ± 39%    -100.0%       0.00        sched_debug.cfs_rq:/.spread0.max
     -7628          +138.2%     -18173          -100.0%       0.00        sched_debug.cfs_rq:/.spread0.min
     20484 ± 22%     -37.0%      12910 ± 15%    -100.0%       0.00        sched_debug.cfs_rq:/.spread0.stddev
     29.84 ± 19%      -1.1%      29.52 ± 14%    -100.0%       0.00        sched_debug.cfs_rq:/.util_est_enqueued.avg
    569.91 ± 11%      +4.5%     595.58          -100.0%       0.00        sched_debug.cfs_rq:/.util_est_enqueued.max
    109.06 ± 13%      +3.0%     112.32 ±  5%    -100.0%       0.00        sched_debug.cfs_rq:/.util_est_enqueued.stddev
    910320            +0.0%     910388           -44.2%     507736 ± 28%  sched_debug.cpu.avg_idle.avg
   1052715 ±  5%      -3.8%    1012910           -45.8%     570219 ± 28%  sched_debug.cpu.avg_idle.max
    175426 ±  8%      +2.8%     180422 ±  6%     -42.5%     100839 ± 20%  sched_debug.cpu.clock.avg
    175428 ±  8%      +2.8%     180424 ±  6%     -42.5%     100840 ± 20%  sched_debug.cpu.clock.max
    175424 ±  8%      +2.8%     180421 ±  6%     -42.5%     100838 ± 20%  sched_debug.cpu.clock.min
    170979 ±  8%      +2.8%     175682 ±  6%     -42.5%      98373 ± 20%  sched_debug.cpu.clock_task.avg
    173398 ±  8%      +2.6%     177937 ±  6%     -42.6%      99568 ± 20%  sched_debug.cpu.clock_task.max
    165789 ±  8%      +3.2%     171014 ±  6%     -42.4%      95513 ± 20%  sched_debug.cpu.clock_task.min
      2178 ±  8%     -13.1%       1893 ± 18%     -49.7%       1094 ± 33%  sched_debug.cpu.clock_task.stddev
      7443 ±  4%      +1.7%       7566 ±  3%     -43.0%       4246 ± 24%  sched_debug.cpu.curr->pid.max
      1409 ±  2%      +1.1%       1426 ±  2%     -42.0%     818.46 ± 25%  sched_debug.cpu.curr->pid.stddev
    502082            +0.0%     502206           -43.9%     281834 ± 29%  sched_debug.cpu.max_idle_balance_cost.avg
    567767 ±  5%      -3.3%     548996 ±  2%     -46.6%     303439 ± 27%  sched_debug.cpu.max_idle_balance_cost.max
      4294            +0.0%       4294           -43.7%       2415 ± 29%  sched_debug.cpu.next_balance.avg
      4294            +0.0%       4294           -43.7%       2415 ± 29%  sched_debug.cpu.next_balance.max
      4294            +0.0%       4294           -43.7%       2415 ± 29%  sched_debug.cpu.next_balance.min
      0.29 ±  3%      -1.2%       0.28           -43.0%       0.16 ± 29%  sched_debug.cpu.nr_running.stddev
      8212 ±  7%      -2.2%       8032 ±  4%     -40.7%       4867 ± 20%  sched_debug.cpu.nr_switches.avg
     55209 ± 14%     -21.8%      43154 ± 14%     -57.0%      23755 ± 20%  sched_debug.cpu.nr_switches.max
      1272 ± 23%     +10.2%       1402 ±  8%     -39.1%     775.51 ± 29%  sched_debug.cpu.nr_switches.min
      9805 ± 13%     -13.7%       8459 ±  8%     -50.8%       4825 ± 20%  sched_debug.cpu.nr_switches.stddev
    -15.73           -14.1%     -13.52           -64.8%      -5.53        sched_debug.cpu.nr_uninterruptible.min
      6.08 ± 27%      -1.0%       6.02 ± 13%     -53.7%       2.82 ± 24%  sched_debug.cpu.nr_uninterruptible.stddev
    175425 ±  8%      +2.8%     180421 ±  6%     -42.5%     100838 ± 20%  sched_debug.cpu_clk
 4.295e+09            +0.0%  4.295e+09           -43.7%  2.416e+09 ± 29%  sched_debug.jiffies
    174815 ±  8%      +2.9%     179811 ±  6%     -42.5%     100505 ± 20%  sched_debug.ktime
    175972 ±  8%      +2.8%     180955 ±  6%     -42.5%     101150 ± 20%  sched_debug.sched_clk
  58611259            +0.0%   58611259           -94.0%    3508734 ± 29%  sched_debug.sysctl_sched.sysctl_sched_features
      0.75            +0.0%       0.75          -100.0%       0.00        sched_debug.sysctl_sched.sysctl_sched_idle_min_granularity
     24.00            +0.0%      24.00          -100.0%       0.00        sched_debug.sysctl_sched.sysctl_sched_latency
      3.00            +0.0%       3.00          -100.0%       0.00        sched_debug.sysctl_sched.sysctl_sched_min_granularity
      4.00            +0.0%       4.00          -100.0%       0.00        sched_debug.sysctl_sched.sysctl_sched_wakeup_granularity
      0.26 ±100%      -0.3        0.00            +1.2        1.44 ±  9%  perf-profile.calltrace.cycles-pp.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
      2.08 ± 26%      -0.2        1.87 ± 12%      -1.4        0.63 ± 11%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
      0.00            +0.0        0.00            +0.7        0.67 ± 11%  perf-profile.calltrace.cycles-pp.rcu_sched_clock_irq.update_process_times.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues
      0.00            +0.0        0.00            +0.7        0.71 ± 18%  perf-profile.calltrace.cycles-pp.rebalance_domains.__do_softirq.irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
      0.00            +0.0        0.00            +0.8        0.80 ± 14%  perf-profile.calltrace.cycles-pp.getname_flags.__do_sys_newstat.do_syscall_64.entry_SYSCALL_64_after_hwframe.__xstat64
      0.00            +0.0        0.00            +0.8        0.84 ± 12%  perf-profile.calltrace.cycles-pp.__fput.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
      0.00            +0.0        0.00            +1.0        0.98 ± 13%  perf-profile.calltrace.cycles-pp.mas_alloc_cyclic.mtree_alloc_cyclic.simple_offset_add.shmem_mknod.lookup_open
      0.00            +0.0        0.00            +1.1        1.10 ± 14%  perf-profile.calltrace.cycles-pp.mtree_alloc_cyclic.simple_offset_add.shmem_mknod.lookup_open.open_last_lookups
      0.00            +0.0        0.00            +1.2        1.20 ± 13%  perf-profile.calltrace.cycles-pp.mas_erase.mtree_erase.simple_offset_remove.shmem_unlink.vfs_unlink
      0.00            +0.0        0.00            +1.3        1.34 ± 15%  perf-profile.calltrace.cycles-pp.link_path_walk.path_lookupat.filename_lookup.vfs_statx.__do_sys_newstat
      0.00            +0.0        0.00            +1.4        1.35 ± 12%  perf-profile.calltrace.cycles-pp.mtree_erase.simple_offset_remove.shmem_unlink.vfs_unlink.do_unlinkat
      0.00            +0.0        0.00            +1.6        1.56 ± 18%  perf-profile.calltrace.cycles-pp.__do_softirq.irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
      0.00            +0.0        0.00            +1.7        1.73 ±  6%  perf-profile.calltrace.cycles-pp.scheduler_tick.update_process_times.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues
      0.00            +0.0        0.00            +1.8        1.80 ± 16%  perf-profile.calltrace.cycles-pp.irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
      0.00            +0.0        0.00            +2.0        2.03 ± 15%  perf-profile.calltrace.cycles-pp.path_lookupat.filename_lookup.vfs_statx.__do_sys_newstat.do_syscall_64
      0.00            +0.0        0.00            +2.2        2.16 ± 15%  perf-profile.calltrace.cycles-pp.filename_lookup.vfs_statx.__do_sys_newstat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +0.0        0.00            +2.9        2.94 ± 14%  perf-profile.calltrace.cycles-pp.vfs_statx.__do_sys_newstat.do_syscall_64.entry_SYSCALL_64_after_hwframe.__xstat64
      0.00            +0.0        0.00            +3.2        3.19 ±  8%  perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues.hrtimer_interrupt
      0.00            +0.0        0.00            +3.4        3.35 ±  8%  perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt
      0.00            +0.0        0.00            +3.9        3.88 ±  8%  perf-profile.calltrace.cycles-pp.tick_nohz_highres_handler.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt
      0.00            +0.8        0.75 ± 12%      +0.0        0.00        perf-profile.calltrace.cycles-pp.__call_rcu_common.xas_store.__xa_erase.xa_erase.simple_offset_remove
      0.00            +0.8        0.78 ± 34%      +0.0        0.00        perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_lru.xas_alloc.xas_create.xas_store
      0.00            +0.8        0.83 ± 29%      +0.0        0.00        perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_lru.xas_alloc.xas_expand
      0.00            +0.9        0.92 ± 26%      +0.0        0.00        perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_lru.xas_alloc.xas_expand.xas_create
      0.00            +1.0        0.99 ± 27%      +0.0        0.00        perf-profile.calltrace.cycles-pp.shuffle_freelist.allocate_slab.___slab_alloc.kmem_cache_alloc_lru.xas_alloc
      0.00            +1.0        1.04 ± 28%      +0.0        0.00        perf-profile.calltrace.cycles-pp.kmem_cache_alloc_lru.xas_alloc.xas_create.xas_store.__xa_alloc
      0.00            +1.1        1.11 ± 26%      +0.0        0.00        perf-profile.calltrace.cycles-pp.xas_alloc.xas_create.xas_store.__xa_alloc.__xa_alloc_cyclic
      1.51 ± 24%      +1.2        2.73 ± 10%      +1.2        2.75 ± 10%  perf-profile.calltrace.cycles-pp.vfs_unlink.do_unlinkat.__x64_sys_unlink.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +1.2        1.24 ± 20%      +0.0        0.00        perf-profile.calltrace.cycles-pp.kmem_cache_alloc_lru.xas_alloc.xas_expand.xas_create.xas_store
      0.00            +1.3        1.27 ± 10%      +0.0        0.00        perf-profile.calltrace.cycles-pp.xas_store.__xa_erase.xa_erase.simple_offset_remove.shmem_unlink
      0.00            +1.3        1.30 ± 10%      +0.0        0.00        perf-profile.calltrace.cycles-pp.__xa_erase.xa_erase.simple_offset_remove.shmem_unlink.vfs_unlink
      0.00            +1.3        1.33 ± 19%      +0.0        0.00        perf-profile.calltrace.cycles-pp.xas_alloc.xas_expand.xas_create.xas_store.__xa_alloc
      0.00            +1.4        1.36 ± 10%      +0.0        0.00        perf-profile.calltrace.cycles-pp.xa_erase.simple_offset_remove.shmem_unlink.vfs_unlink.do_unlinkat
      0.00            +1.4        1.37 ± 10%      +1.4        1.37 ± 12%  perf-profile.calltrace.cycles-pp.simple_offset_remove.shmem_unlink.vfs_unlink.do_unlinkat.__x64_sys_unlink
      0.00            +1.5        1.51 ± 17%      +0.0        0.00        perf-profile.calltrace.cycles-pp.xas_expand.xas_create.xas_store.__xa_alloc.__xa_alloc_cyclic
      0.00            +1.6        1.62 ± 12%      +1.6        1.65 ± 12%  perf-profile.calltrace.cycles-pp.shmem_unlink.vfs_unlink.do_unlinkat.__x64_sys_unlink.do_syscall_64
      0.00            +2.8        2.80 ± 13%      +0.0        0.00        perf-profile.calltrace.cycles-pp.xas_create.xas_store.__xa_alloc.__xa_alloc_cyclic.simple_offset_add
      0.00            +2.9        2.94 ± 13%      +0.0        0.00        perf-profile.calltrace.cycles-pp.xas_store.__xa_alloc.__xa_alloc_cyclic.simple_offset_add.shmem_mknod
      5.38 ± 24%      +3.1        8.51 ± 11%      +0.6        5.95 ± 11%  perf-profile.calltrace.cycles-pp.lookup_open.open_last_lookups.path_openat.do_filp_open.do_sys_openat2
      6.08 ± 24%      +3.2        9.24 ± 12%      +0.5        6.59 ± 10%  perf-profile.calltrace.cycles-pp.open_last_lookups.path_openat.do_filp_open.do_sys_openat2.__x64_sys_creat
      0.00            +3.2        3.20 ± 13%      +0.0        0.00        perf-profile.calltrace.cycles-pp.__xa_alloc.__xa_alloc_cyclic.simple_offset_add.shmem_mknod.lookup_open
      0.00            +3.2        3.24 ± 13%      +0.0        0.00        perf-profile.calltrace.cycles-pp.__xa_alloc_cyclic.simple_offset_add.shmem_mknod.lookup_open.open_last_lookups
      0.00            +3.4        3.36 ± 14%      +1.2        1.16 ± 13%  perf-profile.calltrace.cycles-pp.simple_offset_add.shmem_mknod.lookup_open.open_last_lookups.path_openat
      2.78 ± 25%      +3.4        6.17 ± 12%      +0.9        3.69 ± 12%  perf-profile.calltrace.cycles-pp.shmem_mknod.lookup_open.open_last_lookups.path_openat.do_filp_open
      0.16 ± 30%      -0.1        0.08 ± 20%      -0.0        0.13 ± 42%  perf-profile.children.cycles-pp.map_id_up
      0.22 ± 18%      -0.0        0.16 ± 17%      -0.1        0.12 ±  8%  perf-profile.children.cycles-pp._raw_spin_lock_irq
      0.47 ± 17%      -0.0        0.43 ± 16%      +1.0        1.49 ± 10%  perf-profile.children.cycles-pp.__x64_sys_close
      0.02 ±142%      -0.0        0.00            +0.1        0.08 ± 22%  perf-profile.children.cycles-pp._find_next_zero_bit
      0.01 ±223%      -0.0        0.00            +2.0        1.97 ± 15%  perf-profile.children.cycles-pp.irq_exit_rcu
      0.00            +0.0        0.00            +0.1        0.08 ± 30%  perf-profile.children.cycles-pp.__wake_up
      0.00            +0.0        0.00            +0.1        0.08 ± 21%  perf-profile.children.cycles-pp.should_we_balance
      0.00            +0.0        0.00            +0.1        0.09 ± 34%  perf-profile.children.cycles-pp.amd_clear_divider
      0.00            +0.0        0.00            +0.1        0.10 ± 36%  perf-profile.children.cycles-pp.apparmor_current_getsecid_subj
      0.00            +0.0        0.00            +0.1        0.10 ± 32%  perf-profile.children.cycles-pp.filp_flush
      0.00            +0.0        0.00            +0.1        0.10 ± 27%  perf-profile.children.cycles-pp.mas_wr_end_piv
      0.00            +0.0        0.00            +0.1        0.12 ± 27%  perf-profile.children.cycles-pp.mnt_get_write_access
      0.00            +0.0        0.00            +0.1        0.12 ± 23%  perf-profile.children.cycles-pp.file_close_fd
      0.00            +0.0        0.00            +0.1        0.13 ± 30%  perf-profile.children.cycles-pp.security_current_getsecid_subj
      0.00            +0.0        0.00            +0.1        0.13 ± 18%  perf-profile.children.cycles-pp.native_apic_mem_eoi
      0.00            +0.0        0.00            +0.2        0.17 ± 22%  perf-profile.children.cycles-pp.mas_leaf_max_gap
      0.00            +0.0        0.00            +0.2        0.18 ± 24%  perf-profile.children.cycles-pp.mtree_range_walk
      0.00            +0.0        0.00            +0.2        0.20 ± 19%  perf-profile.children.cycles-pp.inode_set_ctime_current
      0.00            +0.0        0.00            +0.2        0.24 ± 14%  perf-profile.children.cycles-pp.ima_file_check
      0.00            +0.0        0.00            +0.2        0.24 ± 22%  perf-profile.children.cycles-pp.mas_anode_descend
      0.00            +0.0        0.00            +0.3        0.26 ± 18%  perf-profile.children.cycles-pp.lockref_put_return
      0.00            +0.0        0.00            +0.3        0.29 ± 16%  perf-profile.children.cycles-pp.mas_wr_walk
      0.00            +0.0        0.00            +0.3        0.31 ± 23%  perf-profile.children.cycles-pp.mas_update_gap
      0.00            +0.0        0.00            +0.3        0.32 ± 17%  perf-profile.children.cycles-pp.mas_wr_append
      0.00            +0.0        0.00            +0.4        0.37 ± 15%  perf-profile.children.cycles-pp.mas_empty_area
      0.00            +0.0        0.00            +0.5        0.47 ± 18%  perf-profile.children.cycles-pp.mas_wr_node_store
      0.00            +0.0        0.00            +0.5        0.53 ± 18%  perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook
      0.00            +0.0        0.00            +0.7        0.65 ± 12%  perf-profile.children.cycles-pp.__memcg_slab_pre_alloc_hook
      0.00            +0.0        0.00            +0.7        0.65 ± 28%  perf-profile.children.cycles-pp.__memcg_slab_free_hook
      0.00            +0.0        0.00            +1.0        0.99 ± 13%  perf-profile.children.cycles-pp.mas_alloc_cyclic
      0.00            +0.0        0.00            +1.1        1.11 ± 14%  perf-profile.children.cycles-pp.mtree_alloc_cyclic
      0.00            +0.0        0.00            +1.2        1.21 ± 14%  perf-profile.children.cycles-pp.mas_erase
      0.00            +0.0        0.00            +1.4        1.35 ± 12%  perf-profile.children.cycles-pp.mtree_erase
      0.00            +0.0        0.00            +1.8        1.78 ±  9%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      0.00            +0.0        0.00            +4.1        4.06 ±  8%  perf-profile.children.cycles-pp.tick_nohz_highres_handler
      0.02 ±146%      +0.1        0.08 ± 13%      +0.0        0.06 ± 81%  perf-profile.children.cycles-pp.shmem_is_huge
      0.02 ±141%      +0.1        0.09 ± 16%      -0.0        0.00        perf-profile.children.cycles-pp.__list_del_entry_valid
      0.00            +0.1        0.08 ± 11%      +0.0        0.00        perf-profile.children.cycles-pp.free_unref_page
      0.00            +0.1        0.08 ± 13%      +0.1        0.08 ± 45%  perf-profile.children.cycles-pp.shmem_destroy_inode
      0.04 ±101%      +0.1        0.14 ± 25%      +0.0        0.05 ± 65%  perf-profile.children.cycles-pp.rcu_nocb_try_bypass
      0.00            +0.1        0.12 ± 27%      +0.0        0.00        perf-profile.children.cycles-pp.xas_find_marked
      0.02 ±144%      +0.1        0.16 ± 14%      -0.0        0.00        perf-profile.children.cycles-pp.__unfreeze_partials
      0.03 ±106%      +0.2        0.19 ± 26%      -0.0        0.03 ±136%  perf-profile.children.cycles-pp.xas_descend
      0.01 ±223%      +0.2        0.17 ± 15%      -0.0        0.00        perf-profile.children.cycles-pp.get_page_from_freelist
      0.11 ± 22%      +0.2        0.29 ± 16%      -0.0        0.08 ± 30%  perf-profile.children.cycles-pp.rcu_segcblist_enqueue
      0.02 ±146%      +0.2        0.24 ± 13%      -0.0        0.01 ±174%  perf-profile.children.cycles-pp.__alloc_pages
      0.36 ± 79%      +0.6        0.98 ± 15%      -0.0        0.31 ± 43%  perf-profile.children.cycles-pp.__slab_free
      0.50 ± 26%      +0.7        1.23 ± 14%      -0.2        0.31 ± 19%  perf-profile.children.cycles-pp.__call_rcu_common
      0.00            +0.8        0.82 ± 13%      +0.0        0.00        perf-profile.children.cycles-pp.radix_tree_node_rcu_free
      0.00            +1.1        1.14 ± 17%      +0.0        0.00        perf-profile.children.cycles-pp.radix_tree_node_ctor
      0.16 ± 86%      +1.2        1.38 ± 16%      -0.1        0.02 ±174%  perf-profile.children.cycles-pp.setup_object
      1.52 ± 25%      +1.2        2.75 ± 10%      +1.2        2.76 ± 11%  perf-profile.children.cycles-pp.vfs_unlink
      0.36 ± 22%      +1.3        1.63 ± 12%      +1.3        1.65 ± 12%  perf-profile.children.cycles-pp.shmem_unlink
      0.00            +1.3        1.30 ± 10%      +0.0        0.00        perf-profile.children.cycles-pp.__xa_erase
      0.20 ± 79%      +1.3        1.53 ± 15%      -0.2        0.02 ±173%  perf-profile.children.cycles-pp.shuffle_freelist
      0.00            +1.4        1.36 ± 10%      +0.0        0.00        perf-profile.children.cycles-pp.xa_erase
      0.00            +1.4        1.38 ± 10%      +1.4        1.37 ± 12%  perf-profile.children.cycles-pp.simple_offset_remove
      0.00            +1.5        1.51 ± 17%      +0.0        0.00        perf-profile.children.cycles-pp.xas_expand
      0.26 ± 78%      +1.6        1.87 ± 13%      -0.2        0.05 ± 68%  perf-profile.children.cycles-pp.allocate_slab
      0.40 ± 49%      +1.7        2.10 ± 13%      -0.3        0.14 ± 28%  perf-profile.children.cycles-pp.___slab_alloc
      1.30 ± 85%      +2.1        3.42 ± 12%      -0.1        1.15 ± 41%  perf-profile.children.cycles-pp.rcu_do_batch
      1.56 ± 27%      +2.4        3.93 ± 11%      -0.2        1.41 ± 12%  perf-profile.children.cycles-pp.kmem_cache_alloc_lru
      0.00            +2.4        2.44 ± 12%      +0.0        0.00        perf-profile.children.cycles-pp.xas_alloc
      2.66 ± 13%      +2.5        5.14 ±  5%      -2.7        0.00        perf-profile.children.cycles-pp.__irq_exit_rcu
     11.16 ± 10%      +2.7       13.88 ±  8%      +0.6       11.72 ±  8%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
     11.77 ± 10%      +2.7       14.49 ±  8%      +0.6       12.40 ±  8%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.00            +2.8        2.82 ± 13%      +0.0        0.00        perf-profile.children.cycles-pp.xas_create
      5.40 ± 24%      +3.1        8.52 ± 11%      +0.6        5.97 ± 11%  perf-profile.children.cycles-pp.lookup_open
      6.12 ± 24%      +3.1        9.27 ± 12%      +0.5        6.62 ± 10%  perf-profile.children.cycles-pp.open_last_lookups
      0.00            +3.2        3.22 ± 13%      +0.0        0.00        perf-profile.children.cycles-pp.__xa_alloc
      0.00            +3.2        3.24 ± 13%      +0.0        0.00        perf-profile.children.cycles-pp.__xa_alloc_cyclic
      0.00            +3.4        3.36 ± 14%      +1.2        1.16 ± 13%  perf-profile.children.cycles-pp.simple_offset_add
      2.78 ± 25%      +3.4        6.18 ± 12%      +0.9        3.70 ± 12%  perf-profile.children.cycles-pp.shmem_mknod
      0.00            +4.2        4.24 ± 12%      +0.0        0.00        perf-profile.children.cycles-pp.xas_store
      0.14 ± 27%      -0.1        0.08 ± 21%      -0.0        0.12 ± 42%  perf-profile.self.cycles-pp.map_id_up
      0.09 ± 18%      -0.0        0.06 ± 52%      +0.1        0.14 ± 12%  perf-profile.self.cycles-pp.obj_cgroup_charge
      0.18 ± 22%      -0.0        0.15 ± 17%      -0.1        0.10 ±  9%  perf-profile.self.cycles-pp._raw_spin_lock_irq
      0.02 ±141%      -0.0        0.00            +0.1        0.08 ± 22%  perf-profile.self.cycles-pp._find_next_zero_bit
      0.16 ± 24%      -0.0        0.16 ± 24%      -0.1        0.07 ± 64%  perf-profile.self.cycles-pp.__sysvec_apic_timer_interrupt
      0.02 ±146%      +0.0        0.02 ±146%      +0.1        0.08 ± 18%  perf-profile.self.cycles-pp.shmem_mknod
      0.00            +0.0        0.00            +0.1        0.09 ± 36%  perf-profile.self.cycles-pp.irq_exit_rcu
      0.00            +0.0        0.00            +0.1        0.09 ± 28%  perf-profile.self.cycles-pp.tick_nohz_highres_handler
      0.00            +0.0        0.00            +0.1        0.09 ± 36%  perf-profile.self.cycles-pp.apparmor_current_getsecid_subj
      0.00            +0.0        0.00            +0.1        0.09 ± 30%  perf-profile.self.cycles-pp.mtree_erase
      0.00            +0.0        0.00            +0.1        0.10 ± 26%  perf-profile.self.cycles-pp.mtree_alloc_cyclic
      0.00            +0.0        0.00            +0.1        0.10 ± 27%  perf-profile.self.cycles-pp.mas_wr_end_piv
      0.00            +0.0        0.00            +0.1        0.12 ± 28%  perf-profile.self.cycles-pp.mnt_get_write_access
      0.00            +0.0        0.00            +0.1        0.12 ± 29%  perf-profile.self.cycles-pp.inode_set_ctime_current
      0.00            +0.0        0.00            +0.1        0.12 ± 38%  perf-profile.self.cycles-pp.mas_empty_area
      0.00            +0.0        0.00            +0.1        0.13 ± 18%  perf-profile.self.cycles-pp.native_apic_mem_eoi
      0.00            +0.0        0.00            +0.1        0.14 ± 38%  perf-profile.self.cycles-pp.mas_update_gap
      0.00            +0.0        0.00            +0.1        0.14 ± 20%  perf-profile.self.cycles-pp.mas_wr_append
      0.00            +0.0        0.00            +0.2        0.16 ± 23%  perf-profile.self.cycles-pp.mas_leaf_max_gap
      0.00            +0.0        0.00            +0.2        0.18 ± 24%  perf-profile.self.cycles-pp.mtree_range_walk
      0.00            +0.0        0.00            +0.2        0.18 ± 29%  perf-profile.self.cycles-pp.mas_alloc_cyclic
      0.00            +0.0        0.00            +0.2        0.20 ± 14%  perf-profile.self.cycles-pp.__memcg_slab_pre_alloc_hook
      0.00            +0.0        0.00            +0.2        0.22 ± 32%  perf-profile.self.cycles-pp.mas_erase
      0.00            +0.0        0.00            +0.2        0.24 ± 35%  perf-profile.self.cycles-pp.__memcg_slab_free_hook
      0.00            +0.0        0.00            +0.2        0.24 ± 22%  perf-profile.self.cycles-pp.mas_anode_descend
      0.00            +0.0        0.00            +0.3        0.26 ± 17%  perf-profile.self.cycles-pp.lockref_put_return
      0.00            +0.0        0.00            +0.3        0.27 ± 16%  perf-profile.self.cycles-pp.mas_wr_walk
      0.00            +0.0        0.00            +0.3        0.34 ± 20%  perf-profile.self.cycles-pp.mas_wr_node_store
      0.00            +0.0        0.00            +0.4        0.35 ± 20%  perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook
      0.00            +0.0        0.00            +1.6        1.59 ±  8%  perf-profile.self.cycles-pp.entry_SYSCALL_64
      0.05 ± 85%      +0.0        0.06 ± 47%      +0.1        0.15 ± 54%  perf-profile.self.cycles-pp.check_heap_object
      0.05 ± 72%      +0.0        0.06 ± 81%      +0.0        0.10 ± 15%  perf-profile.self.cycles-pp.call_cpuidle
      0.04 ±105%      +0.0        0.04 ± 75%      +0.1        0.10 ± 33%  perf-profile.self.cycles-pp.putname
      0.00            +0.1        0.06 ± 24%      +0.0        0.05 ± 66%  perf-profile.self.cycles-pp.shmem_destroy_inode
      0.00            +0.1        0.07 ±  8%      +0.0        0.00        perf-profile.self.cycles-pp.__xa_alloc
      0.02 ±146%      +0.1        0.11 ± 28%      +0.0        0.02 ±133%  perf-profile.self.cycles-pp.rcu_nocb_try_bypass
      0.01 ±223%      +0.1        0.10 ± 28%      -0.0        0.00        perf-profile.self.cycles-pp.shuffle_freelist
      0.00            +0.1        0.11 ± 40%      +0.0        0.00        perf-profile.self.cycles-pp.xas_create
      0.00            +0.1        0.12 ± 27%      +0.0        0.00        perf-profile.self.cycles-pp.xas_find_marked
      0.00            +0.1        0.14 ± 18%      +0.0        0.00        perf-profile.self.cycles-pp.xas_alloc
      0.03 ±103%      +0.1        0.17 ± 29%      -0.0        0.03 ±136%  perf-profile.self.cycles-pp.xas_descend
      0.00            +0.2        0.16 ± 23%      +0.0        0.00        perf-profile.self.cycles-pp.xas_expand
      0.10 ± 22%      +0.2        0.27 ± 16%      -0.0        0.06 ± 65%  perf-profile.self.cycles-pp.rcu_segcblist_enqueue
      0.92 ± 35%      +0.3        1.22 ± 11%      -0.3        0.59 ± 15%  perf-profile.self.cycles-pp.kmem_cache_free
      0.00            +0.4        0.36 ± 16%      +0.0        0.00        perf-profile.self.cycles-pp.xas_store
      0.32 ± 30%      +0.4        0.71 ± 12%      -0.1        0.18 ± 23%  perf-profile.self.cycles-pp.__call_rcu_common
      0.18 ± 27%      +0.5        0.65 ±  8%      +0.1        0.26 ± 21%  perf-profile.self.cycles-pp.kmem_cache_alloc_lru
      0.36 ± 79%      +0.6        0.96 ± 15%      -0.0        0.31 ± 42%  perf-profile.self.cycles-pp.__slab_free
      0.00            +0.8        0.80 ± 14%      +0.0        0.00        perf-profile.self.cycles-pp.radix_tree_node_rcu_free
      0.00            +1.0        1.01 ± 16%      +0.0        0.00        perf-profile.self.cycles-pp.radix_tree_node_ctor



[1] https://lore.kernel.org/all/202402191308.8e7ee8c7-oliver.sang@intel.com/

> 
> 
> > > > > @@ -330,9 +329,9 @@ int simple_offset_empty(struct dentry *dentry)
> > > > >  	if (!inode || !S_ISDIR(inode->i_mode))
> > > > >  		return ret;
> > > > >  
> > > > > -	index = 2;
> > > > > +	index = DIR_OFFSET_MIN;
> > > > 
> > > > This bit should go into the simple_offset_empty() patch...
> > > > 
> > > > > @@ -434,15 +433,15 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
> > > > >  
> > > > >  	/* In this case, ->private_data is protected by f_pos_lock */
> > > > >  	file->private_data = NULL;
> > > > > -	return vfs_setpos(file, offset, U32_MAX);
> > > > > +	return vfs_setpos(file, offset, MAX_LFS_FILESIZE);
> > > > 					^^^
> > > > Why this? It is ULONG_MAX << PAGE_SHIFT on 32-bit so that doesn't seem
> > > > quite right? Why not use ULONG_MAX here directly?
> > > 
> > > I initially changed U32_MAX to ULONG_MAX, but for some reason, the
> > > length checking in vfs_setpos() fails. There is probably a sign
> > > extension thing happening here that I don't understand.
> > > 
> > > 
> > > > Otherwise the patch looks good to me.
> > > 
> > > As always, thank you for your review.
> > > 
> > > 
> > > -- 
> > > Chuck Lever
> 
> -- 
> Chuck Lever

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ