Open Source and information security mailing list archives
Message-ID: <20201030081456.GY31092@shao2-debian>
Date:   Fri, 30 Oct 2020 16:14:56 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Kent Overstreet <kent.overstreet@...il.com>
Cc:     linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
        Kent Overstreet <kent.overstreet@...il.com>, axboe@...nel.dk,
        willy@...radead.org, linux-fsdevel@...r.kernel.org,
        0day robot <lkp@...el.com>, lkp@...ts.01.org,
        ying.huang@...el.com, feng.tang@...el.com, zhengjun.xing@...el.com
Subject: [fs] 2b2f891180: stress-ng.sendfile.ops_per_sec 32.0% improvement

Greetings,

FYI, we noticed a 32.0% improvement of stress-ng.sendfile.ops_per_sec due to commit:


commit: 2b2f89118025e62137e4d1514866069b24d810a4 ("[PATCH v2 2/2] fs: generic_file_buffered_read() now uses find_get_pages_contig")
url: https://github.com/0day-ci/linux/commits/Kent-Overstreet/generic_file_buffered_read-improvements/20201026-053158
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 986b9eacb25910865b50e5f298aa8e2df7642f1b

in testcase: stress-ng
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 192G memory
with the following parameters:

	nr_threads: 100%
	disk: 1HDD
	testtime: 30s
	class: pipe
	cpufreq_governor: performance
	ucode: 0x5002f01


Details are as follows:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/testcase/testtime/ucode:
  pipe/gcc-9/performance/1HDD/x86_64-rhel-8.3/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp5/stress-ng/30s/0x5002f01

commit: 
  aa5222a7e8 ("fs: Break generic_file_buffered_read up into multiple functions")
  2b2f891180 ("fs: generic_file_buffered_read() now uses find_get_pages_contig")

aa5222a7e8f6ab9e 2b2f89118025e62137e4d151486 
---------------- --------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
           :4           25%           1:4     dmesg.WARNING:missing_R10_value_at__fsnotify_parent/0x
         %stddev     %change         %stddev
             \          |                \  
  1.38e+08 ±  8%     -21.4%  1.084e+08 ± 18%  stress-ng.pipe.ops
   4598964 ±  8%     -21.4%    3614448 ± 18%  stress-ng.pipe.ops_per_sec
  14543902           +32.0%   19194451        stress-ng.sendfile.ops
    484783           +32.0%     639800        stress-ng.sendfile.ops_per_sec
      9893           -20.7%       7844        stress-ng.time.maximum_resident_set_size
    259655 ±  3%     +11.0%     288128 ±  5%  cpuidle.POLL.time
    214144            -3.0%     207721 ±  2%  vmstat.system.in
     16311 ± 22%     -17.2%      13502 ± 23%  numa-meminfo.node0.KernelStack
     37461 ± 42%     -42.0%      21717 ± 60%  numa-meminfo.node0.PageTables
    147305 ± 16%     -17.5%     121457 ± 18%  numa-meminfo.node0.SUnreclaim
      2102 ±  5%      -9.5%       1901 ±  4%  slabinfo.PING.active_objs
      2102 ±  5%      -9.5%       1901 ±  4%  slabinfo.PING.num_objs
      3295 ±  4%      -6.9%       3069 ±  5%  slabinfo.sock_inode_cache.active_objs
      3295 ±  4%      -6.9%       3069 ±  5%  slabinfo.sock_inode_cache.num_objs
 2.303e+10            -4.0%   2.21e+10 ±  2%  perf-stat.i.branch-instructions
    209728            -4.5%     200298 ±  2%  perf-stat.i.cpu-migrations
   2110607            -8.5%    1932169 ±  2%  perf-stat.i.node-loads
      0.48            +0.0        0.52        perf-stat.overall.branch-miss-rate%
 2.261e+10            -3.8%  2.174e+10 ±  3%  perf-stat.ps.branch-instructions
   2109192            -8.2%    1936642 ±  2%  perf-stat.ps.node-loads
     24778 ±  2%      -5.4%      23434 ±  2%  proc-vmstat.nr_active_anon
     76250            +6.2%      81012 ±  9%  proc-vmstat.nr_anon_pages
     80059            +6.0%      84835 ±  8%  proc-vmstat.nr_inactive_anon
     28762 ±  2%      -4.6%      27434 ±  2%  proc-vmstat.nr_shmem
     24778 ±  2%      -5.4%      23434 ±  2%  proc-vmstat.nr_zone_active_anon
     80059            +6.0%      84835 ±  8%  proc-vmstat.nr_zone_inactive_anon
      5269           +38.8%       7313 ± 16%  sched_debug.cfs_rq:/.load.min
    274.69         +1285.1%       3804 ±142%  sched_debug.cfs_rq:/.load_avg.max
     46.74 ±  4%   +1059.0%     541.77 ±144%  sched_debug.cfs_rq:/.load_avg.stddev
      3.97 ± 26%     -45.5%       2.16 ± 57%  sched_debug.cfs_rq:/.removed.load_avg.avg
     31.04 ± 11%     -35.7%      19.95 ± 57%  sched_debug.cfs_rq:/.removed.load_avg.stddev
      1850 ±  3%      -9.8%       1669 ±  2%  sched_debug.cfs_rq:/.runnable_avg.max
    -65895          -277.0%     116623 ±123%  sched_debug.cfs_rq:/.spread0.avg
      1253 ±  2%     -11.9%       1105 ±  2%  sched_debug.cfs_rq:/.util_avg.max
    120.28 ± 11%     -20.8%      95.27 ± 12%  sched_debug.cfs_rq:/.util_avg.stddev
     90.38 ± 52%     +87.0%     169.00 ± 14%  sched_debug.cfs_rq:/.util_est_enqueued.min
     91.18            -0.8       90.33        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
     93.72            -0.5       93.22        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.71 ± 17%      +0.2        0.91 ± 13%  perf-profile.calltrace.cycles-pp.common_file_perm.security_file_permission.vfs_read.ksys_read.do_syscall_64
      1.08 ± 15%      +0.3        1.36 ± 13%  perf-profile.calltrace.cycles-pp.security_file_permission.vfs_read.ksys_read.do_syscall_64.entry_SYSCALL_64_after_hwframe
     91.28            -0.8       90.44        perf-profile.children.cycles-pp.do_syscall_64
     93.81            -0.5       93.32        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.06 ± 20%      +0.0        0.08 ± 10%  perf-profile.children.cycles-pp.__x64_sys_write
      0.06 ± 22%      +0.0        0.08 ± 15%  perf-profile.children.cycles-pp.prepare_to_wait_event
      0.04 ± 63%      +0.0        0.07 ± 20%  perf-profile.children.cycles-pp.clockevents_program_event
      0.02 ±173%      +0.1        0.07 ± 22%  perf-profile.children.cycles-pp.ktime_get
      0.18 ± 14%      +0.1        0.28 ± 17%  perf-profile.children.cycles-pp.finish_task_switch
      0.53 ± 10%      +0.2        0.71 ± 14%  perf-profile.children.cycles-pp.asm_call_sysvec_on_stack
      1.97 ±  9%      +0.2        2.20 ±  4%  perf-profile.children.cycles-pp.mutex_lock
      1.45 ± 18%      +0.3        1.77 ± 13%  perf-profile.children.cycles-pp.common_file_perm
      2.08 ± 16%      +0.5        2.53 ± 13%  perf-profile.children.cycles-pp.security_file_permission
      0.11 ± 22%      +0.0        0.16 ± 17%  perf-profile.self.cycles-pp.copyin
      0.03 ±100%      +0.0        0.07 ± 10%  perf-profile.self.cycles-pp.__x64_sys_write
      0.19 ± 19%      +0.0        0.24 ± 14%  perf-profile.self.cycles-pp.ksys_write
      0.36 ± 11%      +0.1        0.46 ± 15%  perf-profile.self.cycles-pp.security_file_permission
      0.63 ± 17%      +0.2        0.78 ± 12%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      1.16 ± 18%      +0.3        1.44 ± 13%  perf-profile.self.cycles-pp.common_file_perm
      0.00       +1.8e+104%     183.75 ±132%  interrupts.93:PCI-MSI.31981626-edge.i40e-eth0-TxRx-57
      7738 ±  2%     -37.7%       4819 ± 34%  interrupts.CPU21.NMI:Non-maskable_interrupts
      7738 ±  2%     -37.7%       4819 ± 34%  interrupts.CPU21.PMI:Performance_monitoring_interrupts
      7735 ±  2%     -37.5%       4834 ± 36%  interrupts.CPU23.NMI:Non-maskable_interrupts
      7735 ±  2%     -37.5%       4834 ± 36%  interrupts.CPU23.PMI:Performance_monitoring_interrupts
      4378 ± 23%    +335.6%      19072 ± 95%  interrupts.CPU42.CAL:Function_call_interrupts
      6507 ± 21%     +33.6%       8696 ± 22%  interrupts.CPU45.CAL:Function_call_interrupts
     17585 ± 34%     -53.7%       8147 ± 80%  interrupts.CPU51.CAL:Function_call_interrupts
     30252 ± 76%     -75.5%       7413 ± 67%  interrupts.CPU53.CAL:Function_call_interrupts
     22394 ± 68%     -72.6%       6125 ± 67%  interrupts.CPU55.CAL:Function_call_interrupts
     21262 ± 42%     -70.6%       6255 ± 70%  interrupts.CPU56.CAL:Function_call_interrupts
      9201 ± 83%    +111.2%      19434 ± 64%  interrupts.CPU56.RES:Rescheduling_interrupts
     20517 ± 57%     -58.0%       8627 ± 69%  interrupts.CPU58.CAL:Function_call_interrupts
     30292 ± 79%     -69.4%       9282 ±103%  interrupts.CPU59.CAL:Function_call_interrupts
     20349 ± 56%     -60.6%       8013 ± 91%  interrupts.CPU6.CAL:Function_call_interrupts
     21097 ± 55%     -63.7%       7660 ± 96%  interrupts.CPU61.CAL:Function_call_interrupts
     11855 ± 80%    +132.5%      27562 ± 65%  interrupts.CPU63.RES:Rescheduling_interrupts
     12953 ± 23%     -52.3%       6181 ± 75%  interrupts.CPU66.CAL:Function_call_interrupts
     19018 ± 38%     -65.0%       6660 ±102%  interrupts.CPU67.CAL:Function_call_interrupts
     26718 ± 70%     -66.5%       8941 ± 54%  interrupts.CPU7.CAL:Function_call_interrupts
     12561 ± 98%    +114.0%      26876 ± 72%  interrupts.CPU7.RES:Rescheduling_interrupts
     10208 ± 83%    +113.2%      21760 ± 60%  interrupts.CPU8.RES:Rescheduling_interrupts
      5057 ± 16%    +235.1%      16946 ±105%  interrupts.CPU87.CAL:Function_call_interrupts
      9429 ± 66%     +62.9%      15363 ± 43%  interrupts.CPU9.RES:Rescheduling_interrupts


                                                                                
                                stress-ng.sendfile.ops                          
                                                                                
    2e+07 +-----------------------------------------------------------------+   
          |O  OOO          O  O                                             |   
  1.9e+07 |-O    OO OOOOOOOOOO                                              |   
          |    O  OO                                                        |   
  1.8e+07 |-+O                                                              |   
          |          O                                                      |   
  1.7e+07 |-+                                                               |   
          |                                                                 |   
  1.6e+07 |-+                                                               |   
          |                                                    ++           |   
  1.5e+07 |-+     +  ++++ +++++++ +++++++ +++ ++++++++ +++++  +: +        + |   
          |++++++++++    ::   :  ++     + :+ ::  :: + +     + +   ++++++++++|   
  1.4e+07 |-+  +         +    +          +   +   +           +              |   
          |                                                                 |   
  1.3e+07 +-----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                           stress-ng.sendfile.ops_per_sec                       
                                                                                
  660000 +------------------------------------------------------------------+   
  640000 |O+ OOO        OOOOOO                                              |   
         | O    OO OOOOOO  O                                                |   
  620000 |-+  O  OO                                                         |   
  600000 |-+O                                                               |   
         |           O                                                      |   
  580000 |-+                                                                |   
  560000 |-+                                                                |   
  540000 |-+                                                                |   
         |                                                                  |   
  520000 |-+                                                   ++           |   
  500000 |-+          + + +++ ++ +++++++ +  + +++ ++ +++++++ + :++          |   
         |++++++ +++++ +:+   :: + +     +::+:+   :: + +     +::   :+++ ++++ |   
  480000 |-+  + +       ::   +           :+ ::   +           :+   +   +    +|   
  460000 +------------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.9.0-14770-g2b2f89118025" of type "text/plain" (171554 bytes)

View attachment "job-script" of type "text/plain" (8147 bytes)

View attachment "job.yaml" of type "text/plain" (5573 bytes)

View attachment "reproduce" of type "text/plain" (390 bytes)
