Message-ID: <20211005140519.GD15539@xsang-OptiPlex-9020>
Date:   Tue, 5 Oct 2021 22:05:19 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Minchan Kim <minchan@...nel.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        kernel test robot <oliver.sang@...el.com>,
        Chris Goldsworthy <cgoldswo@...eaurora.org>,
        "Xing, Zhengjun" <zhengjun.xing@...el.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
        zhengjun.xing@...ux.intel.com
Subject: [mm]  243418e392:  will-it-scale.per_process_ops 3.0% improvement



Greetings,

FYI, we noticed a 3.0% improvement in will-it-scale.per_process_ops due to commit:


commit: 243418e3925d5b5b0657ae54c322d43035e97eed ("mm: fs: invalidate bh_lrus for only cold path")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: will-it-scale
on test machine: 192 threads 4 sockets Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with the following parameters:

	nr_task: 100%
	mode: process
	test: brk1
	cpufreq_governor: performance
	ucode: 0x5003006

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based version of the test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
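
The relationship between the parameters above and the headline metrics can be checked with a little arithmetic. The sketch below assumes that nr_task: 100% on this 192-thread machine means 192 processes, and that will-it-scale.per_process_ops is the per-process average of the aggregate figure; both are inferred from the numbers in the comparison table in this report, not stated by the tooling. The function name is illustrative only.

```python
# Values copied from the comparison table in this report.
NR_PROCESSES = 192  # assumption: nr_task 100% of the 192 hardware threads

per_process_ops = {
    "b7cd9fa5cc": 1077243,  # parent commit
    "243418e392": 1109687,  # commit under test
}


def aggregate_ops(per_process, nr_processes=NR_PROCESSES):
    """Total operations across all processes, given the per-process average."""
    return per_process * nr_processes


for commit, ops in per_process_ops.items():
    print(f"{commit}: {aggregate_ops(ops):.3e}")
# b7cd9fa5cc: 2.068e+08
# 243418e392: 2.131e+08
```

Both products match the will-it-scale.192.processes and will-it-scale.workload rows, which is consistent with the three headline rows being the same measurement at different scales.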





Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp dirs to run from a clean state.

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/process/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap2/brk1/will-it-scale/0x5003006

commit: 
  b7cd9fa5cc ("lib/zlib_inflate/inffast: check config in C to avoid unused function warning")
  243418e392 ("mm: fs: invalidate bh_lrus for only cold path")

b7cd9fa5ccc392d9 243418e3925d5b5b0657ae54c32 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
 2.068e+08            +3.0%  2.131e+08        will-it-scale.192.processes
   1077243            +3.0%    1109687        will-it-scale.per_process_ops
 2.068e+08            +3.0%  2.131e+08        will-it-scale.workload
    759.50 ± 67%    +596.6%       5290 ± 55%  interrupts.CPU28.RES:Rescheduling_interrupts
     12703            -8.6%      11607        perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.preempt_schedule_common.__cond_resched.unmap_vmas
     28481 ±  6%     +11.7%      31799 ±  9%  softirqs.CPU28.RCU
     32901 ±  5%     +10.3%      36284 ±  3%  softirqs.CPU54.RCU
      0.18 ± 24%      -0.0        0.15 ±  2%  perf-stat.i.branch-miss-rate%
 1.083e+11 ±  3%      +2.4%  1.108e+11        perf-stat.i.dTLB-loads
      0.16 ±  3%      -0.0        0.15 ±  2%  perf-stat.overall.branch-miss-rate%
      0.00            +0.0        0.00        perf-stat.overall.dTLB-store-miss-rate%
      2813 ±  6%     +11.5%       3137 ±  2%  perf-stat.overall.instructions-per-iTLB-miss
    565816            -3.3%     547233        perf-stat.overall.path-length
 1.079e+11 ±  3%      +2.4%  1.105e+11        perf-stat.ps.dTLB-loads
     17.76            -2.3       15.48        perf-profile.calltrace.cycles-pp.unmap_region.__do_munmap.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.73            -2.0        0.76 ±  5%  perf-profile.calltrace.cycles-pp.lru_add_drain.unmap_region.__do_munmap.__x64_sys_brk.do_syscall_64
     35.83            -1.6       34.23        perf-profile.calltrace.cycles-pp.__do_munmap.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
     78.46            -0.6       77.82        perf-profile.calltrace.cycles-pp.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
     81.63            -0.5       81.10        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
     83.66            -0.5       83.20        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.brk
      7.49            -0.4        7.10        perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.__do_munmap.__x64_sys_brk
      9.06            -0.4        8.68        perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.__do_munmap.__x64_sys_brk.do_syscall_64
      4.02            -0.1        3.90        perf-profile.calltrace.cycles-pp.zap_pte_range.unmap_page_range.unmap_vmas.unmap_region.__do_munmap
      1.68            -0.1        1.61        perf-profile.calltrace.cycles-pp.tlb_gather_mmu.unmap_region.__do_munmap.__x64_sys_brk.do_syscall_64
      1.08            +0.0        1.11        perf-profile.calltrace.cycles-pp.__vma_rb_erase.__do_munmap.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.77            +0.0        0.80        perf-profile.calltrace.cycles-pp.syscall_enter_from_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
      1.13            +0.0        1.16        perf-profile.calltrace.cycles-pp.up_read.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
      1.37            +0.0        1.42        perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
      0.61            +0.1        0.66        perf-profile.calltrace.cycles-pp.sync_mm_rss.zap_pte_range.unmap_page_range.unmap_vmas.unmap_region
      1.27            +0.1        1.34        perf-profile.calltrace.cycles-pp.vmacache_find.find_vma.__do_munmap.__x64_sys_brk.do_syscall_64
      1.84            +0.1        1.93        perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.__do_munmap.__x64_sys_brk.do_syscall_64
      2.87            +0.1        2.98        perf-profile.calltrace.cycles-pp.down_write_killable.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
      5.98            +0.2        6.22        perf-profile.calltrace.cycles-pp.perf_iterate_sb.perf_event_mmap.do_brk_flags.__x64_sys_brk.do_syscall_64
     10.79            +0.3       11.09        perf-profile.calltrace.cycles-pp.perf_event_mmap.do_brk_flags.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
     11.42            +0.3       11.74        perf-profile.calltrace.cycles-pp.__entry_text_start.brk
      3.99            +0.3        4.30        perf-profile.calltrace.cycles-pp.find_vma.__do_munmap.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe
     32.96            +0.6       33.61        perf-profile.calltrace.cycles-pp.do_brk_flags.__x64_sys_brk.do_syscall_64.entry_SYSCALL_64_after_hwframe.brk
     18.01            -2.3       15.74        perf-profile.children.cycles-pp.unmap_region
      2.79            -2.0        0.81 ±  5%  perf-profile.children.cycles-pp.lru_add_drain
     36.15            -1.6       34.55        perf-profile.children.cycles-pp.__do_munmap
     78.72            -0.6       78.09        perf-profile.children.cycles-pp.__x64_sys_brk
     81.86            -0.5       81.34        perf-profile.children.cycles-pp.do_syscall_64
     83.91            -0.5       83.46        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      7.59            -0.4        7.16        perf-profile.children.cycles-pp.unmap_page_range
      9.11            -0.4        8.73        perf-profile.children.cycles-pp.unmap_vmas
      4.12            -0.1        4.01        perf-profile.children.cycles-pp.zap_pte_range
      1.68            -0.1        1.61        perf-profile.children.cycles-pp.tlb_gather_mmu
      1.13            +0.0        1.16        perf-profile.children.cycles-pp.up_read
      0.78            +0.0        0.81        perf-profile.children.cycles-pp.syscall_enter_from_user_mode
      0.46            +0.0        0.51        perf-profile.children.cycles-pp.tlb_flush_mmu
      0.61            +0.1        0.66        perf-profile.children.cycles-pp.sync_mm_rss
      1.63            +0.1        1.69        perf-profile.children.cycles-pp.syscall_exit_to_user_mode
      1.48            +0.1        1.56        perf-profile.children.cycles-pp.vmacache_find
      1.94            +0.1        2.03        perf-profile.children.cycles-pp.tlb_finish_mmu
      3.07            +0.1        3.19        perf-profile.children.cycles-pp.down_write_killable
      7.37            +0.2        7.58        perf-profile.children.cycles-pp.__entry_text_start
      6.18            +0.2        6.42        perf-profile.children.cycles-pp.perf_iterate_sb
     11.14            +0.3       11.43        perf-profile.children.cycles-pp.perf_event_mmap
      5.32            +0.4        5.68        perf-profile.children.cycles-pp.find_vma
     33.18            +0.7       33.86        perf-profile.children.cycles-pp.do_brk_flags
      2.85            -0.3        2.50        perf-profile.self.cycles-pp.unmap_page_range
      2.26            -0.1        2.13        perf-profile.self.cycles-pp.zap_pte_range
      1.62            -0.1        1.56        perf-profile.self.cycles-pp.tlb_gather_mmu
      0.64            +0.0        0.67        perf-profile.self.cycles-pp.exit_to_user_mode_prepare
      0.72            +0.0        0.75        perf-profile.self.cycles-pp.syscall_enter_from_user_mode
      1.07 ±  2%      +0.0        1.11        perf-profile.self.cycles-pp.up_read
      1.47            +0.0        1.51        perf-profile.self.cycles-pp.tlb_finish_mmu
      1.43            +0.0        1.47        perf-profile.self.cycles-pp.downgrade_write
      0.56            +0.0        0.61        perf-profile.self.cycles-pp.sync_mm_rss
      0.28 ±  3%      +0.0        0.33 ±  2%  perf-profile.self.cycles-pp.tlb_flush_mmu
      2.09            +0.1        2.14        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      2.25            +0.1        2.31        perf-profile.self.cycles-pp.vm_area_alloc
      2.26            +0.1        2.34        perf-profile.self.cycles-pp.__x64_sys_brk
      3.46            +0.1        3.53        perf-profile.self.cycles-pp.do_brk_flags
      1.32            +0.1        1.40        perf-profile.self.cycles-pp.vmacache_find
      2.53            +0.1        2.62        perf-profile.self.cycles-pp.down_write_killable
      3.32            +0.1        3.42        perf-profile.self.cycles-pp.__entry_text_start
      3.21            +0.1        3.33 ±  2%  perf-profile.self.cycles-pp.kmem_cache_free
      2.87 ±  2%      +0.1        3.00        perf-profile.self.cycles-pp.kmem_cache_alloc
      4.05            +0.1        4.20        perf-profile.self.cycles-pp.__do_munmap
      5.21            +0.2        5.37        perf-profile.self.cycles-pp.brk
      4.16            +0.2        4.35 ±  2%  perf-profile.self.cycles-pp.perf_iterate_sb
      3.41            +0.3        3.67        perf-profile.self.cycles-pp.find_vma
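
The %change column above mixes two kinds of values. Rows with a trailing % appear to be relative changes against the parent commit, while the signed values without % (the rate and perf-profile rows) appear to be absolute differences in percentage points. That reading is an inference from the numbers in this table, not a statement from the lkp documentation. A minimal sketch that recomputes a few rows under that assumption (function names are illustrative):

```python
def percent_change(base, new):
    """Relative change of new against base, in percent."""
    return (new - base) / base * 100.0


def point_delta(base, new):
    """Absolute difference between two percentages, in percentage points."""
    return new - base


# Rows reported with a trailing '%': relative change.
print(f"{percent_change(1077243, 1109687):+.1f}%")  # per_process_ops  -> +3.0%
print(f"{percent_change(565816, 547233):+.1f}%")    # path-length      -> -3.3%

# perf-profile calltrace rows: percentage-point difference.
print(f"{point_delta(2.73, 0.76):+.1f}")    # lru_add_drain -> -2.0
print(f"{point_delta(17.76, 15.48):+.1f}")  # unmap_region  -> -2.3
```

Under this reading, the share of cycles under lru_add_drain in the brk path falls from 2.73% to 0.76%, which accounts for most of the 2.3-point drop under unmap_region.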


                                                                                
                              will-it-scale.192.processes                       
                                                                                
  2.14e+08 +----------------------------------------------------------------+   
           |O OO   O          O   O   O O   O                               |   
  2.13e+08 |-+      O   O OOO  OO  OO      O                                |   
  2.12e+08 |-+    O   OO                  O                                 |   
           |    O                                                           |   
  2.11e+08 |-+                                                              |   
   2.1e+08 |-+                                                              |   
           |           +                                                    |   
  2.09e+08 |-+         :                                                    |   
  2.08e+08 |-+        : :                                                   |   
           |      ++ .+ +                             .+                +   |   
  2.07e+08 |-+   :  +    :    + +                   ++  ++.++    +.+++.+ :.+|   
  2.06e+08 |-+ + :       :   : ::+   .+  .+ +   + .+         +.++        +  |   
           |+.+ +         +  : +  +++  ++  + +.+ +                          |   
  2.05e+08 +----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                              will-it-scale.per_process_ops                     
                                                                                
  1.115e+06 +---------------------------------------------------------------+   
   1.11e+06 |O+OO   O         O   O   OO O   O                              |   
            |        O   OO OO  OO  OO     O                                |   
  1.105e+06 |-+    O  O O                 O                                 |   
    1.1e+06 |-+  O                                                          |   
            |                                                               |   
  1.095e+06 |-+                                                             |   
   1.09e+06 |-+         +                                                   |   
  1.085e+06 |-+        ::                                                   |   
            |          : :                                              +   |   
   1.08e+06 |-+    ++++  +       +                  +.+++.+++.   .+ +   ::  |   
  1.075e+06 |-+   +       :   +. ::   +     .+     +          +++  + +.+ +.+|   
            |+.+++        +.  : + +.  :+.+++  ++.++                         |   
   1.07e+06 |-+             ++      ++                                      |   
  1.065e+06 +---------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                will-it-scale.workload                          
                                                                                
  2.14e+08 +----------------------------------------------------------------+   
           |O OO   O          O   O   O O   O                               |   
  2.13e+08 |-+      O   O OOO  OO  OO      O                                |   
  2.12e+08 |-+    O   OO                  O                                 |   
           |    O                                                           |   
  2.11e+08 |-+                                                              |   
   2.1e+08 |-+                                                              |   
           |           +                                                    |   
  2.09e+08 |-+         :                                                    |   
  2.08e+08 |-+        : :                                                   |   
           |      ++ .+ +                             .+                +   |   
  2.07e+08 |-+   :  +    :    + +                   ++  ++.++    +.+++.+ :.+|   
  2.06e+08 |-+ + :       :   : ::+   .+  .+ +   + .+         +.++        +  |   
           |+.+ +         +  : +  +++  ++  + +.+ +                          |   
  2.05e+08 +----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang


View attachment "config-5.15.0-rc2-00169-g243418e3925d" of type "text/plain" (169046 bytes)

View attachment "job-script" of type "text/plain" (7897 bytes)

View attachment "job.yaml" of type "text/plain" (5286 bytes)

View attachment "reproduce" of type "text/plain" (337 bytes)
