Message-ID: <20200721005609.GF19262@shao2-debian>
Date:   Tue, 21 Jul 2020 08:56:09 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Mike Kravetz <mike.kravetz@...cle.com>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Michal Hocko <mhocko@...nel.org>,
        Hugh Dickins <hughd@...gle.com>,
        Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
        "Aneesh Kumar K . V" <aneesh.kumar@...ux.vnet.ibm.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Davidlohr Bueso <dave@...olabs.net>,
        Prakash Sangappa <prakash.sangappa@...cle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        kernel test robot <rong.a.chen@...el.com>, lkp@...ts.01.org
Subject: [hugetlbfs] 878308e2e0: vm-scalability.throughput 1.2% improvement

Greetings,

FYI, we noticed a 1.2% improvement in vm-scalability.throughput due to commit:


commit: 878308e2e0d003e923a0fad51657441916ca1a86 ("[RFC PATCH 2/3] hugetlbfs: Only take i_mmap_rwsem when sharing is possible")
url: https://github.com/0day-ci/linux/commits/Mike-Kravetz/hugetlbfs-address-fault-time-regression/20200707-043055
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 9ebcfadb0610322ac537dd7aa5d9cbc2b2894c68

in testcase: vm-scalability
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
with the following parameters:

	runtime: 300s
	size: 8T
	test: anon-cow-seq-hugetlb
	cpufreq_governor: performance
	ucode: 0x5002f01

test-description: The motivation behind this suite is to exercise the functions and regions of the Linux kernel's mm/ subsystem that are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/





Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # the job file is attached to this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/size/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/debian-10.4-x86_64-20200603.cgz/300s/8T/lkp-csl-2sp6/anon-cow-seq-hugetlb/vm-scalability/0x5002f01
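
The two lines above pair up position by position: the first is a slash-separated list of parameter names and the second is the matching list of values. A minimal sketch of reading them together, in Python. The variable names are illustrative and not part of lkp-tests, and splitting on "/" assumes no value contains a slash:

```python
# Header line (parameter names) and value line, copied from the report above.
keys = ("compiler/cpufreq_governor/kconfig/rootfs/runtime/size/"
        "tbox_group/test/testcase/ucode")
values = ("gcc-9/performance/x86_64-rhel-8.3/debian-10.4-x86_64-20200603.cgz/"
          "300s/8T/lkp-csl-2sp6/anon-cow-seq-hugetlb/vm-scalability/0x5002f01")

# Pair each name with the value in the same position.
params = dict(zip(keys.split("/"), values.split("/")))

for name, value in params.items():
    print(f"{name:>16}: {value}")
```

Read this way, tbox_group is lkp-csl-2sp6 and test is anon-cow-seq-hugetlb, which matches the parameter list earlier in this report.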

commit: 
  e1f9bcc75b ("Revert: "hugetlbfs: Use i_mmap_rwsem to address page fault/truncate race"")
  878308e2e0 ("hugetlbfs: Only take i_mmap_rwsem when sharing is possible")

e1f9bcc75b135fa7 878308e2e0d003e923a0fad5165 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    368149            +1.2%     372740        vm-scalability.median
      2.08 ±  6%      +1.1        3.21 ±  6%  vm-scalability.median_stddev%
  36859123            +1.2%   37316599        vm-scalability.throughput
      7183            +2.3%       7351        vm-scalability.time.percent_of_cpu_this_job_got
     13255            +1.3%      13426        vm-scalability.time.system_time
      8653            +2.9%       8904        vm-scalability.time.user_time
    353403           -87.2%      45099        vm-scalability.time.voluntary_context_switches
      2215            -2.0%       2170        boot-time.idle
  24601711 ± 19%     +27.9%   31454336 ±  9%  meminfo.DirectMap2M
     58850 ±  2%     -10.7%      52572 ±  2%  cpuidle.C1E.usage
   1542818           -15.3%    1306149        cpuidle.C6.usage
      3422 ± 11%     -13.6%       2957 ±  7%  slabinfo.fsnotify_mark_connector.active_objs
      3422 ± 11%     -13.6%       2957 ±  7%  slabinfo.fsnotify_mark_connector.num_objs
     13429 ± 10%     -19.8%      10769 ±  4%  softirqs.CPU0.SCHED
    545104           -26.3%     401941        softirqs.SCHED
     24.00            -8.3%      22.00        vmstat.cpu.id
     29.00            +3.4%      30.00        vmstat.cpu.us
      4618           -43.5%       2609 ±  2%  vmstat.system.cs
     76078            +2.7%      78129        vmstat.system.in
    580.50 ± 17%     -33.2%     387.75 ±  9%  interrupts.CPU15.CAL:Function_call_interrupts
    519.75 ± 20%     -25.5%     387.25 ± 11%  interrupts.CPU19.CAL:Function_call_interrupts
    163.50 ± 77%     -90.1%      16.25 ±100%  interrupts.CPU24.TLB:TLB_shootdowns
     69.00 ± 10%     +40.9%      97.25 ± 16%  interrupts.CPU25.RES:Rescheduling_interrupts
    727.75 ± 41%     -42.3%     420.00 ± 28%  interrupts.CPU31.CAL:Function_call_interrupts
    159.25 ± 61%     -57.9%      67.00 ± 32%  interrupts.CPU31.RES:Rescheduling_interrupts
      9.25 ±142%    +518.9%      57.25 ± 61%  interrupts.CPU39.TLB:TLB_shootdowns
    746.50 ±  8%     -25.6%     555.25 ± 10%  interrupts.CPU7.CAL:Function_call_interrupts
     99.50 ± 18%     -25.1%      74.50 ± 28%  interrupts.CPU73.RES:Rescheduling_interrupts
    208.25 ± 36%     -68.3%      66.00 ± 44%  interrupts.CPU82.RES:Rescheduling_interrupts
    612.75 ± 34%     -35.4%     395.75 ± 22%  interrupts.CPU88.CAL:Function_call_interrupts
    509479 ± 44%     -52.4%     242707 ± 32%  sched_debug.cfs_rq:/.load.max
   9228443 ± 10%     +23.1%   11358951        sched_debug.cfs_rq:/.min_vruntime.max
    538248 ± 13%     +60.0%     861238 ± 15%  sched_debug.cfs_rq:/.min_vruntime.stddev
     18.97 ±  9%     +33.8%      25.40 ±  6%  sched_debug.cfs_rq:/.nr_spread_over.avg
     88.65 ± 33%     +64.1%     145.50 ± 26%  sched_debug.cfs_rq:/.nr_spread_over.max
     18.49 ± 13%     +45.9%      26.98 ± 19%  sched_debug.cfs_rq:/.nr_spread_over.stddev
    223.44 ± 11%     +19.6%     267.14 ±  4%  sched_debug.cfs_rq:/.runnable_avg.stddev
    701309 ± 21%     +83.4%    1286114 ± 13%  sched_debug.cfs_rq:/.spread0.max
    538445 ± 13%     +59.9%     861184 ± 15%  sched_debug.cfs_rq:/.spread0.stddev
    222.23 ± 11%     +17.7%     261.61 ±  5%  sched_debug.cfs_rq:/.util_avg.stddev
    296.77 ±  5%     +22.6%     363.91 ±  9%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
      2299 ±100%    +404.8%      11607 ± 52%  sched_debug.cpu.max_idle_balance_cost.stddev
      0.27 ±  6%     +27.7%       0.35 ± 23%  sched_debug.cpu.nr_running.stddev
      7774 ±  9%     -30.7%       5387 ±  2%  sched_debug.cpu.nr_switches.avg
      5083 ± 10%     -61.2%       1972 ±  8%  sched_debug.cpu.nr_switches.min
      0.02 ± 53%    +159.3%       0.05 ± 41%  sched_debug.cpu.nr_uninterruptible.avg
      6771 ± 11%     -34.5%       4435 ±  2%  sched_debug.cpu.sched_count.avg
      4287 ± 13%     -72.0%       1201 ± 10%  sched_debug.cpu.sched_count.min
      3032 ± 14%     +23.7%       3752 ±  4%  sched_debug.cpu.sched_count.stddev
      3015 ± 11%     -42.6%       1731 ±  2%  sched_debug.cpu.sched_goidle.avg
      1853 ± 13%     -82.3%     328.67 ± 18%  sched_debug.cpu.sched_goidle.min
      3128 ± 11%     -37.8%       1944 ±  3%  sched_debug.cpu.ttwu_count.avg
      1007 ± 12%     -35.6%     648.58 ± 10%  sched_debug.cpu.ttwu_count.min
    783.48 ± 10%     +19.2%     933.93 ±  6%  sched_debug.cpu.ttwu_local.avg
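
The %change column can be rechecked from the two means on each row. A minimal sketch in Python, assuming the column is the relative difference (new - base) / base shown to one decimal place. The helper name pct_change is illustrative and not taken from lkp-tests. Rows whose metric is itself a percentage (vm-scalability.median_stddev%) appear to show the absolute difference instead, which is why that row reads +1.1 without a percent sign.

```python
def pct_change(base: float, new: float) -> str:
    """Relative change of new versus base, formatted like the %change column."""
    return f"{(new - base) / base * 100:+.1f}%"

# (base mean, new mean) pairs copied from the table above.
rows = {
    "vm-scalability.median": (368149, 372740),
    "vm-scalability.throughput": (36859123, 37316599),
    "vm-scalability.time.voluntary_context_switches": (353403, 45099),
    "vmstat.system.cs": (4618, 2609),
}

for metric, (base, new) in rows.items():
    print(f"{pct_change(base, new):>8}  {metric}")
```

These four rows reproduce the +1.2%, +1.2%, -87.2% and -43.5% entries in the table.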


                                                                                
                               vm-scalability.throughput                        
                                                                                
  3.76e+07 +----------------------------------------------------------------+   
           |             O                                                  |   
  3.74e+07 |-+O    O       O  O  O O  O  O    O     O    O    O  O  O       |   
           |    O     O                     O    O    O     O          O    |   
  3.72e+07 |-+                     +..               .+..                   |   
   3.7e+07 |-+          .+        +     .+..+.  .+..+    +.. .+..           |   
           |          +.  +  .+..+    +.      +.            +    +..+..+.+..|   
  3.68e+07 |-+  +.. ..     +.                                               |   
           |    :  +                                                        |   
  3.66e+07 |-+ :                                                            |   
  3.64e+07 |-+ :                                                            |   
           |   :                                                            |   
  3.62e+07 |..:                                                             |   
           |  +                                                             |   
   3.6e+07 +----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                   vm-scalability.time.voluntary_context_switches               
                                                                                
  400000 +------------------------------------------------------------------+   
         |    .+.     .+..                                                  |   
  350000 |..+.   +..+.    +..+.+..+..+..+..+.+..+..+..+.+..+..+..+..+.+..+..|   
         |                                                                  |   
  300000 |-+                                                                |   
  250000 |-+                                                                |   
         |                                                                  |   
  200000 |-+                                                                |   
         |                                                                  |   
  150000 |-+                                                                |   
  100000 |-+                                                                |   
         |                                                                  |   
   50000 |-+        O     O       O  O  O  O O  O  O  O O  O  O  O  O O     |   
         |  O  O O     O     O O                                            |   
       0 +------------------------------------------------------------------+   
                                                                                
                                                                                
[+] bisect-good sample (parent commit e1f9bcc75b)
[O] bisect-bad  sample (commit 878308e2e0)



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


Attachments:
  config-5.8.0-rc3-00002-g878308e2e0d00 (text/plain, 158415 bytes)
  job-script (text/plain, 7910 bytes)
  job.yaml (text/plain, 5403 bytes)
  reproduce (text/plain, 6784 bytes)
