Message-ID: <202311301346.56b0fcd6-oliver.sang@intel.com>
Date:   Thu, 30 Nov 2023 13:49:21 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Zhang Rui <rui.zhang@...el.com>
CC:     <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
        <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        <linux-acpi@...r.kernel.org>, <ying.huang@...el.com>,
        <feng.tang@...el.com>, <fengwei.yin@...el.com>,
        <oliver.sang@...el.com>
Subject: [linus:master] [x86/acpi]  ec9aedb2aa:  aim9.exec_test.ops_per_sec
 2.4% improvement



Hello,

kernel test robot noticed a 2.4% improvement of aim9.exec_test.ops_per_sec on:


commit: ec9aedb2aa1ab7ac420c00b31f5edc5be15ec167 ("x86/acpi: Ignore invalid x2APIC entries")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: aim9
test machine: 48 threads 2 sockets Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz (Ivy Bridge-EP) with 112G memory
parameters:

	testtime: 300s
	test: exec_test
	cpufreq_governor: performance


Besides the detailed comparison below, we also noticed a difference in dmesg.

for this commit ec9aedb2aa:

[    1.311075][    T0] smpboot: Allowing 48 CPUs, 0 hotplug CPUs

for parent:

[    1.311098][    T0] smpboot: Allowing 168 CPUs, 120 hotplug CPUs
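
The two smpboot lines above can be compared mechanically. The following is a
minimal sketch, not part of the original report: the helper name
parse_smpboot and the regular expression are assumptions made for
illustration. It also places the drop in possible CPUs next to the drop in
meminfo.Percpu from the comparison table, on the assumption that per-CPU
allocations are sized by the number of possible CPUs.

```python
import re

# Matches the smpboot line quoted in this report (hypothetical helper pattern).
SMPBOOT_RE = re.compile(r"smpboot: Allowing (\d+) CPUs, (\d+) hotplug CPUs")

def parse_smpboot(line):
    """Return (possible_cpus, hotplug_cpus) parsed from a smpboot dmesg line."""
    m = SMPBOOT_RE.search(line)
    if m is None:
        raise ValueError("not a smpboot 'Allowing N CPUs' line")
    return int(m.group(1)), int(m.group(2))

commit_line = "[    1.311075][    T0] smpboot: Allowing 48 CPUs, 0 hotplug CPUs"
parent_line = "[    1.311098][    T0] smpboot: Allowing 168 CPUs, 120 hotplug CPUs"

commit_possible, commit_hotplug = parse_smpboot(commit_line)
parent_possible, parent_hotplug = parse_smpboot(parent_line)

# Ratio of possible CPUs after/before the commit.
cpu_ratio = commit_possible / parent_possible
# Ratio of meminfo.Percpu after/before, values copied from the comparison table.
percpu_ratio = 21498 / 78243

print(f"possible CPUs:  {parent_possible} -> {commit_possible} (ratio {cpu_ratio:.3f})")
print(f"meminfo.Percpu: 78243 -> 21498 (ratio {percpu_ratio:.3f})")
```

Both ratios come out a little under 0.3, which is consistent with the smaller
possible-CPU count explaining most of the Percpu reduction, though this
sketch does not prove that link.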


Details are below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231130/202311301346.56b0fcd6-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-ivb-2ep1/exec_test/aim9/300s

commit: 
  31255e072b ("x86/shstk: Delay signal entry SSP write until after user accesses")
  ec9aedb2aa ("x86/acpi: Ignore invalid x2APIC entries")

31255e072b2e91f9 ec9aedb2aa1ab7ac420c00b31f5 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      8587 ±  3%      +5.9%       9091        vmstat.system.cs
      6542 ±  9%     -18.2%       5352 ±  7%  numa-meminfo.node1.KernelStack
     57960 ±  4%     -12.6%      50656 ±  6%  numa-meminfo.node1.SUnreclaim
      6541 ±  9%     -18.0%       5363 ±  6%  numa-vmstat.node1.nr_kernel_stack
     14490 ±  4%     -12.6%      12663 ±  6%  numa-vmstat.node1.nr_slab_unreclaimable
    179678 ±  7%     -22.6%     139060 ± 10%  meminfo.DirectMap4k
     13670           -13.6%      11809        meminfo.KernelStack
     78243           -72.5%      21498        meminfo.Percpu
      1222            +2.4%       1251        aim9.exec_test.ops_per_sec
  27978802            +3.1%   28859909        aim9.time.minor_page_faults
    175.04            -6.2%     164.11        aim9.time.system_time
    115.72            +9.1%     126.24        aim9.time.user_time
    731948            +2.4%     749684        aim9.time.voluntary_context_switches
     13669           -13.8%      11788        proc-vmstat.nr_kernel_stack
     21028            -3.2%      20355        proc-vmstat.nr_slab_reclaimable
     29074            -9.0%      26443        proc-vmstat.nr_slab_unreclaimable
     50357            -1.3%      49699        proc-vmstat.numa_other
  28937047            +3.0%   29790891        proc-vmstat.pgfault
      0.55 ±  5%      +0.1        0.65 ±  7%  perf-profile.calltrace.cycles-pp.next_uptodate_folio.filemap_map_pages.do_read_fault.do_fault.__handle_mm_fault
      1.38 ±  6%      -0.7        0.67 ±  9%  perf-profile.children.cycles-pp.mm_init
      0.87 ±  7%      -0.5        0.38 ± 10%  perf-profile.children.cycles-pp.pcpu_alloc
      0.76 ±  8%      -0.3        0.42 ±  8%  perf-profile.children.cycles-pp.alloc_bprm
      0.50 ±  6%      -0.3        0.17 ±  6%  perf-profile.children.cycles-pp.memset_orig
      0.40 ±  5%      -0.2        0.15 ± 18%  perf-profile.children.cycles-pp.__percpu_counter_init_many
      0.15 ± 20%      -0.1        0.03 ±101%  perf-profile.children.cycles-pp.mm_init_cid
      0.23 ± 14%      -0.1        0.12 ± 19%  perf-profile.children.cycles-pp._find_next_bit
      0.30 ± 10%      -0.1        0.24 ± 16%  perf-profile.children.cycles-pp.mas_preallocate
      0.14 ± 18%      -0.0        0.09 ± 16%  perf-profile.children.cycles-pp.pm_qos_read_value
      0.09 ± 15%      -0.0        0.07 ± 10%  perf-profile.children.cycles-pp.remove_vma
      0.05 ± 47%      +0.1        0.11 ± 26%  perf-profile.children.cycles-pp.malloc
      0.20 ± 22%      +0.1        0.25 ±  7%  perf-profile.children.cycles-pp.do_brk_flags
      0.44 ±  5%      +0.1        0.53 ±  8%  perf-profile.children.cycles-pp.mod_objcg_state
      0.80 ±  4%      +0.2        0.96 ±  6%  perf-profile.children.cycles-pp.next_uptodate_folio
      0.50 ±  7%      -0.3        0.17 ±  6%  perf-profile.self.cycles-pp.memset_orig
      0.26 ± 16%      -0.2        0.04 ±106%  perf-profile.self.cycles-pp.mm_init
      0.14 ± 25%      -0.1        0.03 ±100%  perf-profile.self.cycles-pp.mm_init_cid
      0.18 ± 22%      -0.1        0.08 ± 34%  perf-profile.self.cycles-pp.pcpu_alloc
      0.13 ± 16%      -0.0        0.08 ± 20%  perf-profile.self.cycles-pp.pm_qos_read_value
      0.37 ±  6%      +0.1        0.45 ± 10%  perf-profile.self.cycles-pp.mod_objcg_state
      0.66 ±  5%      +0.1        0.80 ±  6%  perf-profile.self.cycles-pp.next_uptodate_folio
  34087721 ±  2%      +3.6%   35301961        perf-stat.i.branch-misses
      8601 ±  3%      +6.1%       9122        perf-stat.i.context-switches
     72.92 ±  2%      +7.4%      78.30 ±  3%  perf-stat.i.cpu-migrations
      1.55 ±  2%      -0.1        1.42 ±  3%  perf-stat.i.dTLB-load-miss-rate%
      0.51 ±  2%      -0.2        0.32        perf-stat.i.dTLB-store-miss-rate%
   2867856 ±  3%     -36.9%    1810983        perf-stat.i.dTLB-store-misses
 5.561e+08 ±  2%      +3.0%   5.73e+08        perf-stat.i.dTLB-stores
     92019 ±  4%     +10.2%     101371        perf-stat.i.iTLB-loads
    126.43 ± 15%     -33.8%      83.76        perf-stat.i.metric.K/sec
     90050 ±  4%      +6.8%      96193        perf-stat.i.minor-faults
     19.22 ±  4%      -1.5       17.77 ±  3%  perf-stat.i.node-store-miss-rate%
     90050 ±  4%      +6.8%      96194        perf-stat.i.page-faults
      1.48 ±  2%      -0.1        1.38 ±  3%  perf-stat.overall.dTLB-load-miss-rate%
      0.51            -0.2        0.32        perf-stat.overall.dTLB-store-miss-rate%
  33982829 ±  2%      +3.5%   35183134        perf-stat.ps.branch-misses
      8573 ±  3%      +6.0%       9090        perf-stat.ps.context-switches
     72.73 ±  2%      +7.4%      78.13 ±  3%  perf-stat.ps.cpu-migrations
   2858954 ±  3%     -36.9%    1805251        perf-stat.ps.dTLB-store-misses
 5.545e+08 ±  2%      +3.0%  5.712e+08        perf-stat.ps.dTLB-stores
     91889 ±  4%     +10.2%     101265        perf-stat.ps.iTLB-loads
     89770 ±  4%      +6.8%      95880        perf-stat.ps.minor-faults
     89771 ±  4%      +6.8%      95880        perf-stat.ps.page-faults
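
For readers who want to re-derive the %change column, here is a minimal
sketch under two assumptions inferred from the table itself rather than from
the robot's source: count-like metrics show a relative change in percent,
while metrics that are already percentages (perf-profile rows and the
*-rate% rows) show an absolute difference in percentage points. The function
names are hypothetical. Because the inputs are the rounded values printed in
the table, recomputed figures may differ from the report in the last digit.

```python
def pct_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0

def point_delta(base, new):
    """Absolute difference, for metrics that are already percentages."""
    return new - base

# (metric, parent value, commit value, True if the metric is itself a percentage)
rows = [
    ("aim9.exec_test.ops_per_sec", 1222, 1251, False),
    ("meminfo.Percpu", 78243, 21498, False),
    ("perf-stat.i.dTLB-store-misses", 2867856, 1810983, False),
    ("perf-stat.i.dTLB-store-miss-rate%", 0.51, 0.32, True),
    ("perf-profile.children.cycles-pp.mm_init", 1.38, 0.67, True),
]

results = {}
for name, base, new, is_percentage in rows:
    if is_percentage:
        results[name] = round(point_delta(base, new), 1)
        print(f"{results[name]:+.1f}   {name}")
    else:
        results[name] = round(pct_change(base, new), 1)
        print(f"{results[name]:+.1f}%  {name}")
```

The ± %stddev columns are not reproduced here, since the per-run samples
behind them are not included in this report.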



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
