Date: Tue, 9 Jan 2024 22:03:56 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Sidhartha Kumar <sidhartha.kumar@...cle.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>, "Liam R. Howlett"
	<Liam.Howlett@...cle.com>, Matthew Wilcox <willy@...radead.org>, Peng Zhang
	<zhangpeng.00@...edance.com>, <maple-tree@...ts.infradead.org>,
	<linux-mm@...ck.org>, <ying.huang@...el.com>, <feng.tang@...el.com>,
	<fengwei.yin@...el.com>, <oliver.sang@...el.com>
Subject: [linus:master] [maple_tree]  4249f13c11:  aim9.page_test.ops_per_sec
 3.5% improvement



Hello,

kernel test robot noticed a 3.5% improvement in aim9.page_test.ops_per_sec on:


commit: 4249f13c11be8b8b7bf93204185e150c3bdc968d ("maple_tree: do not preallocate nodes for slot stores")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: aim9
test machine: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory
parameters:

	testtime: 300s
	test: page_test
	cpufreq_governor: performance

Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240109/202401091651.a189376-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-ivb-2ep1/page_test/aim9/300s

commit: 
  e2c27b803b ("mm/filemap: avoid buffered read/write race to read inconsistent data")
  4249f13c11 ("maple_tree: do not preallocate nodes for slot stores")

e2c27b803bb66474 4249f13c11be8b8b7bf93204185 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    336518            +3.5%     348367        aim9.page_test.ops_per_sec
  95019000            +3.5%   98364469        aim9.time.minor_page_faults
     25318            +2.3%      25903        proc-vmstat.nr_active_anon
     26605            +2.2%      27197        proc-vmstat.nr_shmem
     25318            +2.3%      25903        proc-vmstat.nr_zone_active_anon
 1.087e+08            +3.3%  1.122e+08        proc-vmstat.numa_hit
 1.085e+08            +3.4%  1.121e+08        proc-vmstat.numa_local
 1.079e+08            +3.5%  1.117e+08        proc-vmstat.pgalloc_normal
  95763046            +3.5%   99109694        proc-vmstat.pgfault
 1.078e+08            +3.5%  1.116e+08        proc-vmstat.pgfree
  56340620            +1.4%   57128415        perf-stat.i.cache-references
   3744535            -7.4%    3468589        perf-stat.i.iTLB-load-misses
    923.85            +8.2%     999.87        perf-stat.i.instructions-per-iTLB-miss
    318120            +3.5%     329244        perf-stat.i.minor-faults
    318120            +3.5%     329244        perf-stat.i.page-faults
     12.48            -0.2       12.32        perf-stat.overall.cache-miss-rate%
    911.69            +8.5%     988.95        perf-stat.overall.instructions-per-iTLB-miss
  56153225            +1.4%   56938073        perf-stat.ps.cache-references
   3731915            -7.4%    3456934        perf-stat.ps.iTLB-load-misses
    317046            +3.5%     328134        perf-stat.ps.minor-faults
    317046            +3.5%     328134        perf-stat.ps.page-faults
      1.54 ± 15%      -0.9        0.61 ± 35%  perf-profile.calltrace.cycles-pp.mas_preallocate.do_brk_flags.__do_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.56 ± 16%      -0.9        0.67 ± 18%  perf-profile.children.cycles-pp.mas_preallocate
      0.59 ± 18%      -0.5        0.06 ± 66%  perf-profile.children.cycles-pp.mas_destroy
      0.03 ± 84%      +0.1        0.13 ± 26%  perf-profile.children.cycles-pp.anon_vma_interval_tree_insert
      0.18 ± 27%      +0.2        0.42 ± 15%  perf-profile.children.cycles-pp.vma_adjust_trans_huge
      0.28 ± 12%      +0.3        0.57 ± 14%  perf-profile.children.cycles-pp.vma_complete
      0.20 ± 28%      -0.1        0.13 ± 24%  perf-profile.self.cycles-pp.security_mmap_addr
      0.16 ± 23%      -0.1        0.10 ± 17%  perf-profile.self.cycles-pp.__perf_sw_event
      0.17 ± 18%      +0.1        0.27 ± 30%  perf-profile.self.cycles-pp.get_vma_policy
      0.02 ±118%      +0.1        0.13 ± 26%  perf-profile.self.cycles-pp.anon_vma_interval_tree_insert
      0.08 ± 25%      +0.2        0.24 ± 13%  perf-profile.self.cycles-pp.vma_complete
      0.18 ± 28%      +0.2        0.42 ± 15%  perf-profile.self.cycles-pp.vma_adjust_trans_huge
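
As a reading aid for the table above, here is a minimal Python sketch of how the %change column appears to be derived from the two value columns. The helper names `pct_change` and `point_diff` are hypothetical and not part of lkp-tests. It assumes that entries written with a "%" sign are relative changes against the parent commit e2c27b803b, and that entries without one (for example cache-miss-rate% and the perf-profile rows) are absolute differences in percentage points. The numbers are copied from the table.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from the parent commit to the tested commit, in percent."""
    return (new - base) / base * 100.0


def point_diff(base: float, new: float) -> float:
    """Absolute difference, assumed to apply to metrics that are already percentages."""
    return new - base


# (metric, value on e2c27b803b, value on 4249f13c11), copied from the report
relative_rows = [
    ("aim9.page_test.ops_per_sec", 336518, 348367),
    ("perf-stat.i.iTLB-load-misses", 3744535, 3468589),
    ("perf-stat.i.instructions-per-iTLB-miss", 923.85, 999.87),
]
point_rows = [
    ("perf-stat.overall.cache-miss-rate%", 12.48, 12.32),
    ("perf-profile.children.cycles-pp.mas_preallocate", 1.56, 0.67),
]

for name, base, new in relative_rows:
    print(f"{pct_change(base, new):+.1f}%  {name}")
for name, base, new in point_rows:
    print(f"{point_diff(base, new):+.1f}   {name}")
```

Under these assumptions the recomputed values match the report to the printed precision (+3.5%, -7.4%, +8.2%, -0.2, -0.9).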




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

