Date:   Thu, 18 Mar 2021 23:02:46 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Peter Xu <peterx@...hat.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Jason Gunthorpe <jgg@...pe.ca>,
        Alexey Dobriyan <adobriyan@...il.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Christoph Hellwig <hch@....de>,
        Daniel Vetter <daniel@...ll.ch>,
        David Airlie <airlied@...ux.ie>,
        David Gibson <david@...son.dropbear.id.au>,
        Gal Pressman <galpress@...zon.com>, Jan Kara <jack@...e.cz>,
        Jann Horn <jannh@...gle.com>,
        Kirill Shutemov <kirill@...temov.name>,
        Kirill Tkhai <ktkhai@...tuozzo.com>,
        Matthew Wilcox <willy@...radead.org>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Mike Rapoport <rppt@...ux.vnet.ibm.com>,
        Roland Scheidegger <sroland@...are.com>,
        VMware Graphics <linux-graphics-maintainer@...are.com>,
        Wei Zhang <wzam@...zon.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
        zhengjun.xing@...el.com
Subject: [hugetlb]  4eae4efa2c:  vm-scalability.throughput 1.1% improvement



Greetings,

FYI, we noticed a 1.1% improvement in vm-scalability.throughput due to commit:


commit: 4eae4efa2c299f85b7ebfbeeda56c19c5eba2768 ("hugetlb: do early cow when page pinned on src mm")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: vm-scalability
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
with following parameters:

	runtime: 300s
	size: 8T
	test: anon-cow-seq-hugetlb
	cpufreq_governor: performance
	ucode: 0x5003006

test-description: The motivation behind this suite is to exercise functions and regions of the mm/ of the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/





Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install                job.yaml  # job file is attached in this email
        bin/lkp split-job --compatible job.yaml
        bin/lkp run                    compatible-job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/debian-10.4-x86_64-20200603.cgz/300s/8T/lkp-csl-2sp6/anon-cow-seq-hugetlb/vm-scalability/0x5003006

commit: 
  ca6eb14d64 ("mm: use is_cow_mapping() across tree where proper")
  4eae4efa2c ("hugetlb: do early cow when page pinned on src mm")

ca6eb14d6453bea8 4eae4efa2c299f85b7ebfbeeda5 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.93 ±  9%      +0.1        1.06 ±  5%  vm-scalability.stddev%
  36639545            +1.1%   37036215        vm-scalability.throughput
     90908 ±  4%      -4.8%      86547        vm-scalability.time.involuntary_context_switches
     12834            -1.2%      12676        vm-scalability.time.system_time
      7026 ± 36%    +119.0%      15385 ± 62%  softirqs.NET_RX
     68.53 ± 18%     -31.0%      47.25 ± 14%  sched_debug.cpu.nr_uninterruptible.max
    -28.19           +23.9%     -34.94        sched_debug.cpu.nr_uninterruptible.min
   3928701            -1.0%    3888371        proc-vmstat.htlb_buddy_alloc_success
 2.013e+09            -1.0%  1.992e+09        proc-vmstat.pgalloc_normal
 2.011e+09            -0.9%  1.992e+09        proc-vmstat.pgfree
      3069 ± 18%     -12.2%       2694 ±  4%  interrupts.CPU32.CAL:Function_call_interrupts
    264.17 ± 41%     -33.6%     175.33 ±  8%  interrupts.CPU74.RES:Rescheduling_interrupts
    264.50 ± 22%     -33.1%     177.00 ± 18%  interrupts.CPU77.RES:Rescheduling_interrupts
    246.00 ± 28%     -33.6%     163.33 ± 16%  interrupts.CPU82.RES:Rescheduling_interrupts
    219.67 ± 20%     -24.4%     166.17 ± 14%  interrupts.CPU90.RES:Rescheduling_interrupts
 1.937e+11            -1.0%  1.917e+11        perf-stat.i.cpu-cycles
 1.045e+08            -2.2%  1.022e+08        perf-stat.i.node-load-misses
      0.92 ± 17%      +0.6        1.54 ± 10%  perf-stat.i.node-store-miss-rate%
    409065 ± 16%     +88.8%     772315 ± 11%  perf-stat.i.node-store-misses
      0.69 ± 18%      +0.6        1.30 ±  9%  perf-stat.overall.node-store-miss-rate%
 1.045e+08            -2.2%  1.023e+08        perf-stat.ps.node-load-misses
    408265 ± 17%     +87.9%     767022 ± 11%  perf-stat.ps.node-store-misses
      0.46 ± 39%     -56.4%       0.20 ± 60%  perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.do_open.isra
      2578 ±  3%     -19.8%       2067 ± 10%  perf-sched.wait_and_delay.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
      2582 ±  3%     -20.0%       2067 ± 10%  perf-sched.wait_and_delay.max.ms.do_syslog.part.0.kmsg_read.vfs_read
      2412 ±  3%     -19.9%       1931 ± 11%  perf-sched.wait_and_delay.max.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
      2568 ±  3%     -19.9%       2057 ± 10%  perf-sched.wait_and_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
      2362 ±  4%     -23.6%       1804 ± 12%  perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      2667 ±  3%     -18.0%       2187 ±  9%  perf-sched.wait_and_delay.max.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_select
      2.80 ±186%     -95.0%       0.14 ± 79%  perf-sched.wait_time.avg.ms.wait_for_partner.fifo_open.do_dentry_open.do_open.isra
      2578 ±  3%     -19.8%       2067 ± 10%  perf-sched.wait_time.max.ms.devkmsg_read.vfs_read.ksys_read.do_syscall_64
      2580 ±  3%     -19.9%       2067 ± 10%  perf-sched.wait_time.max.ms.do_syslog.part.0.kmsg_read.vfs_read
      2412 ±  3%     -19.9%       1931 ± 11%  perf-sched.wait_time.max.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
      2568 ±  3%     -19.9%       2057 ± 10%  perf-sched.wait_time.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe.[unknown]
      2362 ±  4%     -23.6%       1804 ± 12%  perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait.__x64_sys_epoll_wait
      2667 ±  3%     -18.0%       2187 ±  9%  perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop.0.do_select
    176.12 ±209%     -96.6%       5.98 ± 64%  perf-sched.wait_time.max.ms.wait_for_partner.fifo_open.do_dentry_open.do_open.isra
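
As a reading aid for the table above, the sketch below recomputes a few of the %change values from the two columns. It assumes that %change is the relative difference against the base commit (ca6eb14d64), and that rows whose metric name ends in "%" report an absolute difference in percentage points instead; both are inferred from the numbers in the table, not from the LKP source. The helper names pct_change and point_change are made up for this illustration.

```python
def pct_change(base, new):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def point_change(base, new):
    """Absolute difference, as the table appears to use for metrics ending in '%'."""
    return new - base


# (metric, base commit value, patched commit value) copied from the table
rows = [
    ("vm-scalability.throughput", 36639545, 37036215),
    ("vm-scalability.time.system_time", 12834, 12676),
    ("perf-stat.i.node-store-misses", 409065, 772315),
]

for name, base, new in rows:
    print(f"{pct_change(base, new):+6.1f}%  {name}")

# A metric that is itself a percentage: the table shows +0.6, not +67%
print(f"{point_change(0.92, 1.54):+6.1f}   perf-stat.i.node-store-miss-rate%")
```

Under these assumptions the script reproduces the +1.1%, -1.2%, +88.8% and +0.6 entries shown in the table.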


                                                                                
                               vm-scalability.throughput                        
                                                                                
  3.72e+07 +----------------------------------------------------------------+   
           |                                                                |   
  3.71e+07 |-+                   O     O                                    |   
           |   O       O   O             O  O                               |   
   3.7e+07 |-+O  O   O                O      OO                             |   
           |      O O              O                                        |   
  3.69e+07 |-+          O O  OO O   O     O                                 |   
           |                                                                |   
  3.68e+07 |-+                                                    +         |   
           |         +.                                          ::         |   
  3.67e+07 |-+       : +  +                                      : :.+      |   
           |+.     .+   :+ +       +   +   .+       +.       +.++  +  +.++. |   
  3.66e+07 |-+++.++     +   :     : :.+ + +  ++.++.+  ++.++.+              +|   
           |                : +.+ : +    +                                  |   
  3.65e+07 +----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang


View attachment "config-5.12.0-rc2-00348-g4eae4efa2c29" of type "text/plain" (172899 bytes)

View attachment "job-script" of type "text/plain" (8144 bytes)

View attachment "job.yaml" of type "text/plain" (5551 bytes)

View attachment "reproduce" of type "text/plain" (6686 bytes)
