Date:   Thu, 1 Jul 2021 10:08:31 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>
Cc:     0day robot <lkp@...el.com>, LKML <linux-kernel@...r.kernel.org>,
        lkp@...ts.01.org, ying.huang@...el.com, feng.tang@...el.com,
        zhengjun.xing@...ux.intel.com, linux-mm@...ck.org,
        akpm@...ux-foundation.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>
Subject: [mm]  c0af5203b9:  will-it-scale.per_process_ops 7.4% improvement



Greetings,

FYI, we noticed a 7.4% improvement in will-it-scale.per_process_ops due to commit:


commit: c0af5203b9dbf4cd8b424298b1cb809f4535802a ("[PATCH 2/2] mm: Change p4d_page_vaddr to return pud_t *")
url: https://github.com/0day-ci/linux/commits/Aneesh-Kumar-K-V/mm-Change-pud_page_vaddr-to-return-pmd_t/20210617-045835
base: https://git.kernel.org/cgit/linux/kernel/git/powerpc/linux.git next

in testcase: will-it-scale
on test machine: 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory
with the following parameters:

	nr_task: 16
	mode: process
	test: mmap1
	cpufreq_governor: performance
	ucode: 0x5003006

test-description: Will It Scale takes a testcase and runs it from 1 through n parallel copies to see whether the testcase scales. It builds both a process-based and a thread-based version of each test so that differences between the two can be observed.
test-url: https://github.com/antonblanchard/will-it-scale
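
For readers unfamiliar with the workload, the following is a rough Python sketch of what one copy of an mmap1-style test does: it repeatedly maps and unmaps an anonymous region and counts iterations. This is an illustration only, not the will-it-scale source. The 128 MiB region size, the function name run_mmap1 and the timing loop are assumptions made for this sketch; the real testcase is written in C and is driven by the will-it-scale harness, which runs nr_task copies in parallel.

```python
import mmap
import time


def run_mmap1(duration_s=1.0, size=128 * 1024 * 1024):
    """Map and unmap an anonymous private region in a loop.

    Returns the number of completed mmap/munmap pairs. The region size
    and the duration are assumptions for this sketch, not values taken
    from the report.
    """
    deadline = time.monotonic() + duration_s
    iterations = 0
    while True:
        # Anonymous private mapping (fileno -1); the pages are never
        # touched, so the cost is dominated by mmap()/munmap() themselves.
        region = mmap.mmap(-1, size,
                           flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
        region.close()  # unmaps the region
        iterations += 1
        if time.monotonic() >= deadline:
            return iterations


if __name__ == "__main__":
    print("mmap/munmap pairs in 1s:", run_mmap1())
```

A loop of this shape spends most of its time in the munmap and mmap system calls, which matches the perf-profile entries below being dominated by __munmap and its callees.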





Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install                job.yaml  # job file is attached in this email
        bin/lkp split-job --compatible job.yaml  # generate the yaml file for lkp run
        bin/lkp run                    generated-yaml-file

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/process/16/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp9/mmap1/will-it-scale/0x5003006

commit: 
  edb4f1ceb1 ("mm: Change pud_page_vaddr to return pmd_t *")
  c0af5203b9 ("mm: Change p4d_page_vaddr to return pud_t *")

edb4f1ceb161c083 c0af5203b9dbf4cd8b424298b1c 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  10634491            +7.4%   11422635        will-it-scale.16.processes
    664655            +7.4%     713914        will-it-scale.per_process_ops
  10634491            +7.4%   11422635        will-it-scale.workload
      6236 ± 11%     +34.6%       8391 ± 21%  softirqs.CPU59.RCU
 2.473e+10            +7.4%  2.656e+10        perf-stat.i.branch-instructions
      0.46 ±  3%      -6.9%       0.43        perf-stat.i.cpi
 2.535e+10            +7.6%  2.727e+10        perf-stat.i.dTLB-loads
 1.149e+10            +7.3%  1.232e+10        perf-stat.i.dTLB-stores
 1.027e+11            +7.4%  1.103e+11        perf-stat.i.instructions
      2.21            +6.7%       2.36        perf-stat.i.ipc
    699.56            +7.5%     751.76        perf-stat.i.metric.M/sec
      0.45            -6.3%       0.42        perf-stat.overall.cpi
      2.22            +6.7%       2.37        perf-stat.overall.ipc
 2.464e+10            +7.4%  2.647e+10        perf-stat.ps.branch-instructions
 2.526e+10            +7.6%  2.718e+10        perf-stat.ps.dTLB-loads
 1.145e+10            +7.3%  1.228e+10        perf-stat.ps.dTLB-stores
 1.024e+11            +7.4%  1.099e+11        perf-stat.ps.instructions
 3.095e+13            +7.3%  3.321e+13        perf-stat.total.instructions
     47.73            -4.1       43.67 ±  9%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.__munmap
     47.39            -4.1       43.34 ±  9%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
     46.85            -4.1       42.79 ±  9%  perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
     44.68            -4.0       40.64 ±  9%  perf-profile.calltrace.cycles-pp.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
     46.45            -4.0       42.45 ±  9%  perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.__munmap
     37.90            -4.0       33.91 ±  9%  perf-profile.calltrace.cycles-pp.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
     10.29            -2.7        7.55 ±  9%  perf-profile.calltrace.cycles-pp.___might_sleep.unmap_page_range.unmap_vmas.unmap_region.__do_munmap
     27.98            -2.5       25.48 ±  9%  perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
     27.18            -2.5       24.70 ±  9%  perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.__do_munmap.__vm_munmap
      6.98            -1.6        5.40 ±  9%  perf-profile.calltrace.cycles-pp.free_pgd_range.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
     46.86            -4.1       42.80 ±  9%  perf-profile.children.cycles-pp.__x64_sys_munmap
     44.74            -4.0       40.70 ±  9%  perf-profile.children.cycles-pp.__do_munmap
     46.47            -4.0       42.46 ±  9%  perf-profile.children.cycles-pp.__vm_munmap
     37.98            -4.0       33.98 ±  9%  perf-profile.children.cycles-pp.unmap_region
     11.13            -2.7        8.46 ±  9%  perf-profile.children.cycles-pp.___might_sleep
     27.26            -2.5       24.75 ±  9%  perf-profile.children.cycles-pp.unmap_page_range
     28.02            -2.5       25.52 ±  9%  perf-profile.children.cycles-pp.unmap_vmas
      7.02            -1.6        5.43 ±  9%  perf-profile.children.cycles-pp.free_pgd_range
      0.71 ±  3%      -0.3        0.42 ± 18%  perf-profile.children.cycles-pp.security_vm_enough_memory_mm
      0.61 ±  3%      -0.3        0.32 ± 22%  perf-profile.children.cycles-pp.cap_vm_enough_memory
      0.24 ±  6%      -0.1        0.15 ± 17%  perf-profile.children.cycles-pp.cap_capable
      0.26 ±  5%      -0.1        0.20 ± 11%  perf-profile.children.cycles-pp.refill_obj_stock
      0.05 ± 44%      +0.0        0.08 ± 14%  perf-profile.children.cycles-pp.tlb_table_flush
      0.17 ±  4%      +0.1        0.29 ±  8%  perf-profile.children.cycles-pp.tlb_flush_mmu
     10.98            -2.6        8.35 ±  9%  perf-profile.self.cycles-pp.___might_sleep
      6.98            -1.6        5.40 ±  9%  perf-profile.self.cycles-pp.free_pgd_range
      0.36 ±  5%      -0.2        0.17 ± 27%  perf-profile.self.cycles-pp.cap_vm_enough_memory
      0.22 ±  7%      -0.1        0.13 ± 19%  perf-profile.self.cycles-pp.cap_capable
      0.50 ±  5%      -0.1        0.41 ± 10%  perf-profile.self.cycles-pp.security_mmap_file
      0.24 ±  5%      -0.1        0.18 ± 11%  perf-profile.self.cycles-pp.refill_obj_stock
      0.10 ±  6%      +0.1        0.23 ±  8%  perf-profile.self.cycles-pp.tlb_flush_mmu
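
As a sanity check on the headline figures, the short Python sketch below recomputes them from values copied out of the table above. It assumes that per_process_ops is the total workload divided by nr_task (16) and truncated to an integer, which is consistent with the numbers shown but is not stated in the report. It also checks that the reported overall IPC and CPI are approximately reciprocal, allowing for the two-decimal rounding in the table. The helper names are hypothetical.

```python
# Values copied from the comparison table
# (base commit edb4f1ceb1 vs. patched commit c0af5203b9).
NR_TASK = 16
workload = {"base": 10634491, "patched": 11422635}
reported_per_process = {"base": 664655, "patched": 713914}
overall_cpi = {"base": 0.45, "patched": 0.42}
overall_ipc = {"base": 2.22, "patched": 2.37}


def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def per_process_ops(total_ops, nr_task=NR_TASK):
    # Assumption: per-process figure is total ops / task count, truncated.
    return total_ops // nr_task


if __name__ == "__main__":
    for key in ("base", "patched"):
        print(key, "per_process_ops:", per_process_ops(workload[key]),
              "reported:", reported_per_process[key])
    print("workload change: %+.1f%%"
          % pct_change(workload["base"], workload["patched"]))
    for key in ("base", "patched"):
        # IPC is the reciprocal of CPI; the product is close to 1 but not
        # exact because both columns are rounded to two decimals.
        print(key, "ipc * cpi =",
              round(overall_ipc[key] * overall_cpi[key], 3))
```

Under that assumption, the per-process values and the +7.4% change follow directly from the will-it-scale.workload row.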


                                                                                
                            will-it-scale.per_process_ops                       
                                                                                
  720000 +------------------------------------------------------------------+   
         |                                            OO O O OO O OO O O OO |   
  710000 |-+                                                                |   
  700000 |-+                                                                |   
         |                                                                  |   
  690000 |-+                                                                |   
         |                                                                  |   
  680000 |-OO O O  O   OO O O OO   O  O   O  O    O                         |   
         |        O  O           O  O   O  O   O O  O                       |   
  670000 |.+ .+. .++.+.++.+              .++.+.+.++.+. +.+.+.++.            |   
  660000 |-++   +          :          +.+             +         +.++.+.+.+  |   
         |                 :         +                                      |   
  650000 |-+                :  +.   +                                       |   
         |                  +.+  +.+                                        |   
  640000 +------------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang


View attachment "config-5.13.0-rc2-00045-gc0af5203b9db" of type "text/plain" (174118 bytes)

View attachment "job-script" of type "text/plain" (8006 bytes)

View attachment "job.yaml" of type "text/plain" (5453 bytes)

View attachment "reproduce" of type "text/plain" (337 bytes)
