Message-ID: <202406281308.6137dbb1-oliver.sang@intel.com>
Date: Fri, 28 Jun 2024 13:13:26 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Jan Kara <jack@...e.cz>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Trond Myklebust <trond.myklebust@...merspace.com>,
	<linux-nfs@...r.kernel.org>, <ying.huang@...el.com>, <feng.tang@...el.com>,
	<fengwei.yin@...el.com>, <oliver.sang@...el.com>
Subject: [linus:master] [nfs] a527c3ba41: filebench.sum_operations/s 180.4% improvement



Hello,

kernel test robot noticed a 180.4% improvement of filebench.sum_operations/s on:


commit: a527c3ba41c4c61e2069bfce4091e5515f06a8dd ("nfs: Avoid flushing many pages with NFS_FILE_SYNC")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: filebench
test machine: 96 threads 2 sockets Intel(R) Xeon(R) Platinum 8260L CPU @ 2.40GHz (Cascade Lake) with 128G memory
parameters:

	disk: 1HDD
	fs: btrfs
	fs2: nfsv4
	test: filemicro_rwritefsync.f
	cpufreq_governor: performance






Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240628/202406281308.6137dbb1-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/disk/fs2/fs/kconfig/rootfs/tbox_group/test/testcase:
  gcc-13/performance/1HDD/nfsv4/btrfs/x86_64-rhel-8.3/debian-12-x86_64-20240206.cgz/lkp-csl-2sp3/filemicro_rwritefsync.f/filebench
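
The two lines above pair up field by field: the first lists the parameter
names and the second their values, both separated by "/". A minimal sketch
of that pairing follows. It assumes both lines always carry the same number
of fields, and parse_job_header is a hypothetical helper written for this
illustration, not part of lkp-tests.

```python
def parse_job_header(keys_line: str, values_line: str) -> dict[str, str]:
    """Pair the '/'-separated parameter names with their values."""
    # The names line ends with a colon; the values line does not.
    keys = keys_line.strip().rstrip(":").split("/")
    values = values_line.strip().split("/")
    if len(keys) != len(values):
        raise ValueError(f"{len(keys)} names but {len(values)} values")
    return dict(zip(keys, values))


# Both strings are copied from the report above.
KEYS = "compiler/cpufreq_governor/disk/fs2/fs/kconfig/rootfs/tbox_group/test/testcase:"
VALUES = (
    "  gcc-13/performance/1HDD/nfsv4/btrfs/x86_64-rhel-8.3/"
    "debian-12-x86_64-20240206.cgz/lkp-csl-2sp3/filemicro_rwritefsync.f/filebench"
)

params = parse_job_header(KEYS, VALUES)
for name, value in params.items():
    print(f"{name:>18}: {value}")
```

Note the field order: fs2 comes before fs, so the pairing gives fs2=nfsv4
and fs=btrfs, matching the parameter list earlier in this report.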

commit: 
  134d0b3f24 ("nfs: propagate readlink errors in nfs_symlink_filler")
  a527c3ba41 ("nfs: Avoid flushing many pages with NFS_FILE_SYNC")

134d0b3f2440cddd a527c3ba41c4c61e2069bfce409 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      2.06           -32.0%       1.40 ±  3%  iostat.cpu.iowait
  7.17e+10 ±  3%     -62.2%  2.708e+10 ±  2%  cpuidle..time
   3361646           -40.5%    2001605        cpuidle..usage
    797.57 ±  3%     -58.3%     332.24 ±  2%  uptime.boot
     74461 ±  3%     -58.5%      30930 ±  2%  uptime.idle
    986.05 ± 52%    -100.0%       0.00        numa-meminfo.node0.Mlocked
     41610 ± 32%     -59.2%      16976 ± 79%  numa-meminfo.node0.Shmem
     64815 ±  3%     -17.0%      53823 ±  2%  numa-meminfo.node1.Active(anon)
    989020 ± 10%     -49.7%     497288 ± 49%  numa-numastat.node0.local_node
   1031591 ± 10%     -46.8%     549069 ± 43%  numa-numastat.node0.numa_hit
   1104745 ± 11%     -29.8%     775663 ± 28%  numa-numastat.node1.local_node
   1161905 ±  9%     -29.1%     823337 ± 26%  numa-numastat.node1.numa_hit
      2170 ±  3%     +91.4%       4154 ±  2%  vmstat.io.bo
      1.99           -32.0%       1.35 ±  3%  vmstat.procs.b
      2060           +23.5%       2543 ±  2%  vmstat.system.cs
      4540 ±  2%     +80.5%       8197 ±  2%  vmstat.system.in
      2.07            -0.7        1.41 ±  3%  mpstat.cpu.all.iowait%
      0.06 ±  3%      +0.1        0.15 ±  3%  mpstat.cpu.all.irq%
      0.01 ±  2%      +0.0        0.02 ±  2%  mpstat.cpu.all.soft%
      0.05 ±  6%      +0.0        0.07 ±  5%  mpstat.cpu.all.sys%
      0.05 ±  2%      +0.1        0.12 ±  2%  mpstat.cpu.all.usr%
      0.37 ± 10%      -0.1        0.30 ± 10%  perf-profile.children.cycles-pp.perf_event_task_tick
      0.15 ± 16%      -0.0        0.11 ± 17%  perf-profile.children.cycles-pp.rcu_core
      0.16 ± 13%      +0.1        0.21 ±  5%  perf-profile.children.cycles-pp._raw_spin_lock_irq
      0.24 ± 12%      -0.1        0.19 ± 11%  perf-profile.self.cycles-pp.perf_event_task_tick
      0.14 ± 15%      +0.0        0.19 ± 10%  perf-profile.self.cycles-pp.cpuidle_governor_latency_req
      2.10          +184.9%       5.98 ±  6%  filebench.sum_bytes_mb/s
    273.04          +180.4%     765.61 ±  6%  filebench.sum_operations/s
      0.00 ± 10%  +27680.8%       1.20 ±  7%  filebench.sum_time_ms/op
    273.00          +180.4%     765.50 ±  6%  filebench.sum_writes/s
    746.84 ±  3%     -62.3%     281.62 ±  2%  filebench.time.elapsed_time
    746.84 ±  3%     -62.3%     281.62 ±  2%  filebench.time.elapsed_time.max
    246.64 ± 52%    -100.0%       0.00        numa-vmstat.node0.nr_mlock
     10402 ± 32%     -59.2%       4243 ± 79%  numa-vmstat.node0.nr_shmem
   1031364 ± 10%     -46.8%     548226 ± 43%  numa-vmstat.node0.numa_hit
    988793 ± 10%     -49.8%     496445 ± 49%  numa-vmstat.node0.numa_local
     16202 ±  3%     -17.0%      13454 ±  2%  numa-vmstat.node1.nr_active_anon
     16202 ±  3%     -17.0%      13454 ±  2%  numa-vmstat.node1.nr_zone_active_anon
   1161422 ±  9%     -29.2%     822014 ± 26%  numa-vmstat.node1.numa_hit
   1104280 ± 11%     -29.9%     774340 ± 28%  numa-vmstat.node1.numa_local
    169724           -34.1%     111875        meminfo.Active
     71034           -19.6%      57108        meminfo.Active(anon)
     98690           -44.5%      54766 ±  2%  meminfo.Active(file)
    386512 ± 18%     -46.6%     206514 ± 22%  meminfo.AnonHugePages
    100539 ±  4%    +163.6%     264992 ±  2%  meminfo.Dirty
     67198           -12.6%      58722        meminfo.Mapped
      1426 ±  2%    -100.0%       0.00        meminfo.Mlocked
    113320           -20.9%      89605        meminfo.Shmem
    295425 ±  4%    +125.3%     665456        meminfo.Writeback
     17758           -19.6%      14279        proc-vmstat.nr_active_anon
     24673           -44.5%      13701        proc-vmstat.nr_active_file
    165207            -2.3%     161474        proc-vmstat.nr_anon_pages
    188.72 ± 18%     -46.6%     100.85 ± 22%  proc-vmstat.nr_anon_transparent_hugepages
    641612            -8.4%     587844        proc-vmstat.nr_dirtied
     25122 ±  4%    +163.5%      66189 ±  2%  proc-vmstat.nr_dirty
   1359330            -2.5%    1325284        proc-vmstat.nr_file_pages
    174858            -3.5%     168725        proc-vmstat.nr_inactive_anon
    523188            -3.3%     506043        proc-vmstat.nr_inactive_file
     18536            +3.8%      19247        proc-vmstat.nr_kernel_stack
     17058           -12.4%      14939        proc-vmstat.nr_mapped
    356.48 ±  2%    -100.0%       0.00        proc-vmstat.nr_mlock
     28336           -20.9%      22408        proc-vmstat.nr_shmem
     73898 ±  4%    +125.0%     166281        proc-vmstat.nr_writeback
    640947            -8.4%     587183        proc-vmstat.nr_written
     17758           -19.6%      14279        proc-vmstat.nr_zone_active_anon
     24673           -44.5%      13701        proc-vmstat.nr_zone_active_file
    174858            -3.5%     168725        proc-vmstat.nr_zone_inactive_anon
    523188            -3.3%     506043        proc-vmstat.nr_zone_inactive_file
     41988 ±  3%    +100.4%      84132 ±  2%  proc-vmstat.nr_zone_write_pending
   2195708 ±  5%     -37.4%    1375336 ±  6%  proc-vmstat.numa_hit
   2095965 ±  5%     -39.2%    1274986 ±  7%  proc-vmstat.numa_local
     46641           -13.7%      40252        proc-vmstat.pgactivate
   2637615 ±  4%     -32.5%    1780826 ±  5%  proc-vmstat.pgalloc_normal
   1924711 ±  3%     -56.3%     841690 ±  3%  proc-vmstat.pgfault
   2504198 ±  7%     -32.5%    1691266 ± 13%  proc-vmstat.pgfree
   1624850           -27.2%    1182486        proc-vmstat.pgpgout
     89895 ±  2%     -55.4%      40062 ±  5%  proc-vmstat.pgreuse
      2.43            +7.0%       2.60 ±  3%  perf-stat.i.MPKI
  67435645 ±  2%    +112.8%  1.435e+08 ±  2%  perf-stat.i.branch-instructions
      4.56            -0.1        4.44        perf-stat.i.branch-miss-rate%
   3862446 ±  2%    +125.0%    8689890 ±  3%  perf-stat.i.branch-misses
      4.97            +2.4        7.33 ±  2%  perf-stat.i.cache-miss-rate%
    540701 ±  3%     +86.6%    1009040 ±  2%  perf-stat.i.cache-misses
   7966602           +24.1%    9887537        perf-stat.i.cache-references
      2039           +22.7%       2502 ±  2%  perf-stat.i.context-switches
  4.97e+08 ±  2%     +91.0%  9.495e+08 ±  3%  perf-stat.i.cpu-cycles
    101.96            +4.2%     106.25        perf-stat.i.cpu-migrations
      1037           +10.6%       1147 ±  3%  perf-stat.i.cycles-between-cache-misses
 3.314e+08 ±  2%    +112.2%  7.033e+08 ±  2%  perf-stat.i.instructions
      0.50           +11.5%       0.56        perf-stat.i.ipc
      2.11           -99.2%       0.02 ±  9%  perf-stat.i.metric.K/sec
      2466           +12.6%       2776 ±  2%  perf-stat.i.minor-faults
      2466           +12.6%       2776 ±  2%  perf-stat.i.page-faults
      1.63 ±  3%     -12.0%       1.43 ±  3%  perf-stat.overall.MPKI
      5.73            +0.3        6.05        perf-stat.overall.branch-miss-rate%
      6.79 ±  3%      +3.4       10.21 ±  2%  perf-stat.overall.cache-miss-rate%
      1.50           -10.0%       1.35        perf-stat.overall.cpi
      0.67           +11.1%       0.74        perf-stat.overall.ipc
  67362570 ±  2%    +112.3%   1.43e+08 ±  2%  perf-stat.ps.branch-instructions
   3858126 ±  2%    +124.5%    8659606 ±  3%  perf-stat.ps.branch-misses
    539904 ±  3%     +86.2%    1005142 ±  2%  perf-stat.ps.cache-misses
   7952547           +23.8%    9844369        perf-stat.ps.cache-references
      2036           +22.5%       2494 ±  2%  perf-stat.ps.context-switches
 4.966e+08 ±  2%     +90.7%  9.468e+08 ±  3%  perf-stat.ps.cpu-cycles
    101.81            +4.0%     105.85        perf-stat.ps.cpu-migrations
 3.311e+08 ±  2%    +111.7%   7.01e+08 ±  2%  perf-stat.ps.instructions
      2461           +12.2%       2762 ±  2%  perf-stat.ps.minor-faults
      2461           +12.2%       2762 ±  2%  perf-stat.ps.page-faults
 2.475e+11           -20.0%   1.98e+11        perf-stat.total.instructions
      0.04 ±  4%     +31.6%       0.05 ±  8%  sched_debug.cfs_rq:/.h_nr_running.avg
     20.10 ± 14%     +60.4%      32.25 ± 20%  sched_debug.cfs_rq:/.load_avg.avg
      0.04 ±  3%     +31.7%       0.05 ±  8%  sched_debug.cfs_rq:/.nr_running.avg
      7.67 ± 37%    +146.1%      18.87 ± 29%  sched_debug.cfs_rq:/.removed.load_avg.avg
      3.51 ± 42%    +138.3%       8.37 ± 29%  sched_debug.cfs_rq:/.removed.runnable_avg.avg
      3.51 ± 42%    +138.3%       8.37 ± 29%  sched_debug.cfs_rq:/.removed.util_avg.avg
     38.15 ±  6%    +101.4%      76.85 ±  8%  sched_debug.cfs_rq:/.runnable_avg.avg
     98.16 ±  5%     +39.5%     136.95 ± 11%  sched_debug.cfs_rq:/.runnable_avg.stddev
     37.92 ±  6%    +101.5%      76.42 ±  7%  sched_debug.cfs_rq:/.util_avg.avg
    656.80 ±  4%     +18.8%     780.15 ± 15%  sched_debug.cfs_rq:/.util_avg.max
     97.57 ±  5%     +39.9%     136.52 ± 11%  sched_debug.cfs_rq:/.util_avg.stddev
      3.28 ± 25%     +98.3%       6.50 ± 40%  sched_debug.cfs_rq:/.util_est.avg
    123.73 ± 11%     +45.4%     179.95 ± 12%  sched_debug.cfs_rq:/.util_est.max
     17.32 ± 11%     +68.8%      29.24 ± 22%  sched_debug.cfs_rq:/.util_est.stddev
    389566 ±  7%     -57.8%     164509 ±  7%  sched_debug.cpu.clock.avg
    389580 ±  7%     -57.8%     164520 ±  7%  sched_debug.cpu.clock.max
    389555 ±  7%     -57.8%     164499 ±  7%  sched_debug.cpu.clock.min
      8.38 ± 16%     -27.1%       6.11 ± 12%  sched_debug.cpu.clock.stddev
    388964 ±  7%     -57.8%     164063 ±  7%  sched_debug.cpu.clock_task.avg
    389309 ±  7%     -57.8%     164329 ±  7%  sched_debug.cpu.clock_task.max
    381467 ±  7%     -58.9%     156750 ±  7%  sched_debug.cpu.clock_task.min
     12392 ±  5%     -46.0%       6695 ±  4%  sched_debug.cpu.curr->pid.max
      1368 ±  6%     -36.2%     872.62 ±  4%  sched_debug.cpu.curr->pid.stddev
      0.03 ± 10%     +52.8%       0.04 ± 10%  sched_debug.cpu.nr_running.avg
      0.15 ±  7%     +16.5%       0.17 ±  5%  sched_debug.cpu.nr_running.stddev
      9004 ±  5%     -47.9%       4694 ±  6%  sched_debug.cpu.nr_switches.avg
     77261 ± 22%     -40.5%      46007 ±  9%  sched_debug.cpu.nr_switches.max
      1542 ±  6%     -54.4%     702.61 ±  8%  sched_debug.cpu.nr_switches.min
     10459 ± 10%     -40.6%       6217 ±  6%  sched_debug.cpu.nr_switches.stddev
      0.07 ±  5%     -73.2%       0.02 ± 17%  sched_debug.cpu.nr_uninterruptible.avg
    389570 ±  7%     -57.8%     164510 ±  7%  sched_debug.cpu_clk
    388998 ±  7%     -57.9%     163938 ±  7%  sched_debug.ktime
    390127 ±  7%     -57.7%     165072 ±  7%  sched_debug.sched_clk
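
Each row of the table above holds the baseline mean, an optional "± N%"
relative stddev, the change, the patched mean with its optional stddev, and
the metric name. Judging from the rows, a change with a trailing "%" is
relative to the baseline, while a bare number (as on the mpstat rows) is an
absolute difference in percentage points. The sketch below parses one row
and recomputes the change under that reading. ROW, parse_row and
recomputed_change are hypothetical names chosen for this illustration, and
the row layout is inferred from this report rather than from a format
specification.

```python
import re

# Layout inferred from the rows of this report (an assumption, not a spec).
ROW = re.compile(
    r"""
    ^\s*(?P<base>\S+)                    # baseline mean
    (?:\s+±\s*(?P<base_sd>\d+)%)?        # optional baseline %stddev
    \s+(?P<change>[+-]\d+(?:\.\d+)?%?)   # reported change
    \s+(?P<new>\S+)                      # patched mean
    (?:\s+±\s*(?P<new_sd>\d+)%)?         # optional patched %stddev
    \s+(?P<metric>\S+)\s*$               # metric name
    """,
    re.VERBOSE,
)


def parse_row(line):
    """Return the fields of one comparison row, or None if it is not a row."""
    m = ROW.match(line)
    if m is None:
        return None
    return {
        "base": float(m["base"]),
        "new": float(m["new"]),
        "change": m["change"],
        "metric": m["metric"],
    }


def recomputed_change(row):
    """Recompute the change from the displayed (rounded) means."""
    base, new = row["base"], row["new"]
    if row["change"].endswith("%"):
        if base == 0:
            return None  # displayed baseline rounded to zero
        return (new - base) / base * 100.0
    return new - base  # bare number: absolute difference


row = parse_row(
    "    273.04          +180.4%     765.61 ±  6%  filebench.sum_operations/s"
)
print(row["metric"], round(recomputed_change(row), 1))
```

For the headline metric this gives (765.61 - 273.04) / 273.04 * 100, which
rounds to 180.4, in agreement with the table. The displayed means are
rounded, so recomputed values can differ slightly from the reported ones.
A row such as filebench.sum_time_ms/op, whose baseline is shown as 0.00,
cannot be recomputed this way.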




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

