Date:   Wed, 30 Dec 2020 11:27:53 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Kent Overstreet <kent.overstreet@...il.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Jens Axboe <axboe@...nel.dk>,
        Matthew Wilcox <willy@...radead.org>,
        kernel test robot <rong.a.chen@...el.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
        zhengjun.xing@...el.com
Subject: [mm/filemap.c]  06c0444290:  stress-ng.sendfile.ops_per_sec 26.7%
 improvement


Greetings,

FYI, we noticed a 26.7% improvement in stress-ng.sendfile.ops_per_sec due to commit:


commit: 06c0444290cecf04c89c62e6d448b8461507d247 ("mm/filemap.c: generic_file_buffered_read() now uses find_get_pages_contig")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: stress-ng
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 192G memory
with the following parameters:

	nr_threads: 100%
	disk: 1HDD
	testtime: 30s
	class: pipe
	cpufreq_governor: performance
	ucode: 0x5003003






Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
class/compiler/cpufreq_governor/disk/kconfig/nr_threads/rootfs/tbox_group/testcase/testtime/ucode:
  pipe/gcc-9/performance/1HDD/x86_64-rhel-8.3/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp5/stress-ng/30s/0x5003003

commit: 
  723ef24b9b ("mm/filemap/c: break generic_file_buffered_read up into multiple functions")
  06c0444290 ("mm/filemap.c: generic_file_buffered_read() now uses find_get_pages_contig")

723ef24b9b379e59 06c0444290cecf04c89c62e6d44 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  14865658           +26.7%   18839172 ±  2%  stress-ng.sendfile.ops
    495501           +26.7%     627957 ±  2%  stress-ng.sendfile.ops_per_sec
     17943 ± 12%     +36.5%      24500 ±  5%  proc-vmstat.numa_hint_faults
      1585 ±170%   +3019.7%      49447 ± 63%  numa-numastat.node0.other_node
     85104 ±  3%     -56.2%      37244 ± 83%  numa-numastat.node1.other_node
    169349 ± 19%     -21.8%     132475        meminfo.AnonHugePages
    301479 ± 10%     -11.8%     265842        meminfo.AnonPages
    333754 ±  9%     -10.6%     298511        meminfo.Inactive
    333754 ±  9%     -10.6%     298511        meminfo.Inactive(anon)
     11540 ± 12%     -15.2%       9785 ±  3%  sched_debug.cfs_rq:/.load.avg
     17531 ± 77%     -77.9%       3878 ±  9%  sched_debug.cfs_rq:/.load.stddev
     28103 ± 22%     +50.3%      42227 ± 30%  sched_debug.cpu.avg_idle.min
      6188 ± 20%     +38.9%       8595 ±  4%  sched_debug.cpu.curr->pid.min
    495.48 ± 24%     -46.2%     266.34 ± 18%  sched_debug.cpu.curr->pid.stddev
 3.336e+10            -5.6%  3.148e+10 ±  6%  perf-stat.i.branch-instructions
      0.03 ±  7%      +0.0        0.04 ± 43%  perf-stat.i.dTLB-load-miss-rate%
      0.01 ± 18%      +0.0        0.01 ± 10%  perf-stat.i.dTLB-store-miss-rate%
      6253 ±  3%     -18.2%       5117 ±  9%  perf-stat.i.instructions-per-iTLB-miss
      0.64            +0.0        0.67        perf-stat.overall.branch-miss-rate%
 3.264e+10            -5.5%  3.084e+10 ±  5%  perf-stat.ps.branch-instructions
      0.01 ±  5%     +13.5%       0.01 ±  2%  perf-sched.sch_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.__x64_sys_nanosleep.do_syscall_64
      0.05 ± 83%   +1464.2%       0.85 ±157%  perf-sched.sch_delay.max.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
    494.00 ±  4%    +734.5%       4122 ±  3%  perf-sched.wait_and_delay.count.preempt_schedule_common._cond_resched.__splice_from_pipe.splice_from_pipe.direct_splice_actor
      5392 ±  3%     -98.4%      87.00 ± 10%  perf-sched.wait_and_delay.count.preempt_schedule_common._cond_resched.generic_file_buffered_read.generic_file_splice_read.splice_direct_to_actor
     18.80 ± 20%     -68.8%       5.86 ± 73%  perf-sched.wait_and_delay.max.ms.preempt_schedule_common._cond_resched.generic_file_buffered_read.generic_file_splice_read.splice_direct_to_actor
      0.59 ± 24%     -45.0%       0.32 ± 22%  perf-sched.wait_time.avg.ms.wait_for_partner.fifo_open.do_dentry_open.path_openat
     18.80 ± 20%     -68.8%       5.86 ± 73%  perf-sched.wait_time.max.ms.preempt_schedule_common._cond_resched.generic_file_buffered_read.generic_file_splice_read.splice_direct_to_actor
      7626           -38.2%       4716 ± 33%  interrupts.CPU30.NMI:Non-maskable_interrupts
      7626           -38.2%       4716 ± 33%  interrupts.CPU30.PMI:Performance_monitoring_interrupts
     18521 ± 44%    +241.7%      63289 ± 80%  interrupts.CPU42.RES:Rescheduling_interrupts
     30879 ± 63%     +91.0%      58971 ± 31%  interrupts.CPU49.RES:Rescheduling_interrupts
     37970 ± 19%    +115.4%      81806 ± 33%  interrupts.CPU5.CAL:Function_call_interrupts
     48131 ± 35%     -55.7%      21307 ± 23%  interrupts.CPU65.RES:Rescheduling_interrupts
     33689 ± 39%    +186.7%      96598 ± 88%  interrupts.CPU7.CAL:Function_call_interrupts
     37234 ± 52%     +76.5%      65709 ± 45%  interrupts.CPU71.CAL:Function_call_interrupts
     22154 ± 18%    +126.8%      50249 ± 70%  interrupts.CPU82.RES:Rescheduling_interrupts
     16632 ± 60%    +310.7%      68311 ± 51%  interrupts.CPU9.CAL:Function_call_interrupts
     17920 ± 45%    +264.1%      65238 ± 48%  interrupts.CPU95.CAL:Function_call_interrupts
     83.19            -3.7       79.44 ±  5%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
     83.28            -3.8       79.53 ±  5%  perf-profile.children.cycles-pp.do_syscall_64
      0.68 ±  3%      +0.0        0.71 ±  2%  perf-profile.children.cycles-pp.sched_clock
      0.32 ± 12%      +0.1        0.39 ±  6%  perf-profile.children.cycles-pp.set_next_buddy
      0.00            +0.1        0.07 ± 12%  perf-profile.children.cycles-pp.perf_mux_hrtimer_handler
      1.31            +0.1        1.39 ±  3%  perf-profile.children.cycles-pp.__update_load_avg_se
      3.41            +0.2        3.56 ±  2%  perf-profile.children.cycles-pp.switch_mm_irqs_off
      0.08            +0.0        0.10 ±  5%  perf-profile.self.cycles-pp.perf_trace_run_bpf_submit
      0.08 ± 10%      +0.0        0.10 ±  4%  perf-profile.self.cycles-pp.__bitmap_and
      0.28 ±  9%      +0.1        0.33 ±  9%  perf-profile.self.cycles-pp.pipe_poll
      0.93            +0.1        0.99        perf-profile.self.cycles-pp.switch_mm_irqs_off
      0.29 ± 13%      +0.1        0.36 ±  7%  perf-profile.self.cycles-pp.set_next_buddy
      1.28            +0.1        1.36 ±  3%  perf-profile.self.cycles-pp.__update_load_avg_se
      9207 ± 21%     -40.3%       5500 ± 26%  softirqs.CPU10.SCHED
      8966 ± 22%     -36.9%       5653 ± 18%  softirqs.CPU11.SCHED
      9107 ± 21%     -41.4%       5341 ± 19%  softirqs.CPU12.SCHED
      8925 ± 23%     -41.6%       5215 ± 14%  softirqs.CPU13.SCHED
      9040 ± 21%     -41.5%       5285 ± 16%  softirqs.CPU14.SCHED
      8972 ± 23%     -40.6%       5327 ± 16%  softirqs.CPU15.SCHED
      9021 ± 22%     -41.5%       5279 ± 13%  softirqs.CPU16.SCHED
      8932 ± 20%     -40.0%       5357 ± 15%  softirqs.CPU17.SCHED
      8870 ± 21%     -39.0%       5409 ± 20%  softirqs.CPU18.SCHED
      8866 ± 23%     -39.9%       5330 ± 19%  softirqs.CPU19.SCHED
      9042 ± 21%     -37.1%       5683 ± 16%  softirqs.CPU2.SCHED
      8898 ± 22%     -40.7%       5274 ± 20%  softirqs.CPU20.SCHED
      8989 ± 22%     -39.8%       5412 ± 18%  softirqs.CPU21.SCHED
      8876 ± 22%     -41.2%       5223 ± 19%  softirqs.CPU22.SCHED
      8892 ± 19%     -38.7%       5455 ± 16%  softirqs.CPU23.SCHED
      6924 ± 31%     +58.8%      10997 ±  7%  softirqs.CPU24.SCHED
      7098 ± 30%     +64.3%      11663 ±  6%  softirqs.CPU25.SCHED
      7040 ± 32%     +63.8%      11534 ±  7%  softirqs.CPU26.SCHED
      6977 ± 32%     +63.6%      11416 ±  7%  softirqs.CPU27.SCHED
      7088 ± 30%     +58.8%      11255 ±  8%  softirqs.CPU28.SCHED
      6857 ± 33%     +65.5%      11352 ±  6%  softirqs.CPU29.SCHED
      9142 ± 25%     -41.4%       5358 ± 19%  softirqs.CPU3.SCHED
      7061 ± 29%     +62.5%      11472 ±  7%  softirqs.CPU30.SCHED
      6878 ± 30%     +65.2%      11362 ±  8%  softirqs.CPU31.SCHED
      7173 ± 30%     +64.9%      11828 ±  5%  softirqs.CPU32.SCHED
      7013 ± 31%     +62.6%      11405 ±  8%  softirqs.CPU33.SCHED
     14139 ± 27%     +30.8%      18492 ± 24%  softirqs.CPU34.RCU
      7033 ± 32%     +58.8%      11166 ±  7%  softirqs.CPU34.SCHED
      6963 ± 29%     +61.5%      11248 ±  7%  softirqs.CPU35.SCHED
      7012 ± 30%     +61.6%      11332 ±  9%  softirqs.CPU36.SCHED
      6923 ± 29%     +63.4%      11310 ±  8%  softirqs.CPU37.SCHED
      7070 ± 32%     +59.8%      11298 ±  8%  softirqs.CPU38.SCHED
      6818 ± 31%     +67.0%      11389 ±  7%  softirqs.CPU39.SCHED
      9088 ± 20%     -42.2%       5250 ± 15%  softirqs.CPU4.SCHED
      7040 ± 29%     +61.5%      11368 ±  7%  softirqs.CPU40.SCHED
      6980 ± 29%     +62.2%      11321 ±  7%  softirqs.CPU41.SCHED
      6926 ± 29%     +64.3%      11379 ±  8%  softirqs.CPU42.SCHED
      7062 ± 30%     +57.4%      11114 ±  7%  softirqs.CPU43.SCHED
      6960 ± 31%     +62.1%      11279 ±  9%  softirqs.CPU44.SCHED
      6854 ± 31%     +65.0%      11310 ±  7%  softirqs.CPU45.SCHED
      7152 ± 29%     +58.9%      11362 ±  9%  softirqs.CPU46.SCHED
      6828 ± 31%     +64.2%      11210 ±  8%  softirqs.CPU47.SCHED
      8903 ± 21%     -41.9%       5170 ± 17%  softirqs.CPU48.SCHED
      9048 ± 21%     -41.0%       5336 ± 19%  softirqs.CPU49.SCHED
      8974 ± 23%     -42.9%       5124 ± 18%  softirqs.CPU5.SCHED
      8742 ± 23%     -40.8%       5177 ± 17%  softirqs.CPU50.SCHED
      8647 ± 23%     -38.3%       5335 ± 19%  softirqs.CPU51.SCHED
      8783 ± 22%     -41.7%       5118 ± 10%  softirqs.CPU52.SCHED
      8659 ± 26%     -40.2%       5175 ± 18%  softirqs.CPU53.SCHED
      8852 ± 23%     -40.2%       5292 ± 15%  softirqs.CPU54.SCHED
      9070 ± 20%     -43.2%       5153 ± 16%  softirqs.CPU55.SCHED
      9191 ± 17%     -42.7%       5266 ± 15%  softirqs.CPU56.SCHED
      8884 ± 24%     -41.8%       5171 ± 16%  softirqs.CPU57.SCHED
      8986 ± 23%     -40.5%       5344 ± 18%  softirqs.CPU58.SCHED
      9501 ± 24%     -44.9%       5233 ± 20%  softirqs.CPU59.SCHED
      8897 ± 21%     -38.5%       5467 ± 18%  softirqs.CPU6.SCHED
      9260 ± 18%     -42.4%       5335 ± 22%  softirqs.CPU60.SCHED
      8966 ± 21%     -42.3%       5170 ± 15%  softirqs.CPU61.SCHED
      8963 ± 22%     -39.2%       5454 ± 16%  softirqs.CPU62.SCHED
      8948 ± 22%     -39.0%       5462 ± 17%  softirqs.CPU63.SCHED
      8980 ± 22%     -41.1%       5289 ± 16%  softirqs.CPU64.SCHED
      8969 ± 20%     -41.4%       5260 ± 15%  softirqs.CPU65.SCHED
      8891 ± 23%     -41.4%       5211 ± 20%  softirqs.CPU66.SCHED
      9193 ± 23%     -40.0%       5520 ± 14%  softirqs.CPU67.SCHED
      8936 ± 24%     -40.7%       5296 ± 16%  softirqs.CPU68.SCHED
      8871 ± 22%     -40.0%       5320 ± 16%  softirqs.CPU69.SCHED
      8962 ± 19%     -41.8%       5216 ± 17%  softirqs.CPU7.SCHED
      8671 ± 24%     -38.7%       5315 ± 19%  softirqs.CPU70.SCHED
      7198 ± 29%     +53.9%      11076 ±  4%  softirqs.CPU72.SCHED
      7133 ± 29%     +61.0%      11488 ±  9%  softirqs.CPU73.SCHED
      6952 ± 30%     +66.9%      11602 ±  8%  softirqs.CPU74.SCHED
      6975 ± 31%     +60.8%      11214 ±  7%  softirqs.CPU75.SCHED
      6985 ± 31%     +58.4%      11065 ± 10%  softirqs.CPU76.SCHED
      6811 ± 31%     +63.6%      11146 ±  6%  softirqs.CPU77.SCHED
      7006 ± 29%     +62.0%      11347 ±  7%  softirqs.CPU78.SCHED
      6827 ± 32%     +65.8%      11316 ±  9%  softirqs.CPU79.SCHED
      8957 ± 19%     -40.8%       5304 ± 18%  softirqs.CPU8.SCHED
      7102 ± 32%     +59.7%      11345 ±  8%  softirqs.CPU80.SCHED
      7023 ± 30%     +60.3%      11258 ±  8%  softirqs.CPU81.SCHED
      7046 ± 31%     +57.8%      11121 ±  6%  softirqs.CPU82.SCHED
      6966 ± 30%     +57.1%      10941 ±  8%  softirqs.CPU83.SCHED
      6953 ± 30%     +62.0%      11261 ± 10%  softirqs.CPU85.SCHED
      6884 ± 31%     +63.0%      11220 ±  9%  softirqs.CPU86.SCHED
      6765 ± 32%     +66.3%      11249 ±  8%  softirqs.CPU87.SCHED
      6963 ± 29%     +63.8%      11403 ±  7%  softirqs.CPU88.SCHED
      6869 ± 31%     +63.6%      11241 ±  8%  softirqs.CPU89.SCHED
      9002 ± 21%     -43.2%       5115 ± 18%  softirqs.CPU9.SCHED
      6759 ± 33%     +69.4%      11450 ±  9%  softirqs.CPU90.SCHED
      7003 ± 30%     +58.9%      11130 ±  9%  softirqs.CPU91.SCHED
      6994 ± 28%     +64.9%      11533 ±  9%  softirqs.CPU92.SCHED
      6839 ± 32%     +66.1%      11358 ±  6%  softirqs.CPU93.SCHED
      7121 ± 29%     +59.2%      11339 ±  7%  softirqs.CPU94.SCHED
      6798 ± 32%     +67.9%      11412 ±  7%  softirqs.CPU95.SCHED
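
As a sanity check on the table above, the %change column can be recomputed from the two per-commit means. The sketch below assumes the column is the relative difference against the parent commit, and that rows whose values are already percentages (perf-profile, miss-rate%) report a plain difference in percentage points instead. The function names are illustrative only and are not part of lkp-tests.

```python
# Sketch: recompute the %change column from the two per-commit means.
# The values are copied from the comparison table; the formulas are an
# assumption about how the report derives the column.


def relative_change(base: float, new: float) -> float:
    """Relative difference of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def point_delta(base: float, new: float) -> float:
    """Plain difference, for metrics that are already percentages."""
    return new - base


# (mean on parent 723ef24b9b, mean on 06c0444290)
rows = {
    "stress-ng.sendfile.ops": (14865658, 18839172),
    "stress-ng.sendfile.ops_per_sec": (495501, 627957),
    "proc-vmstat.numa_hint_faults": (17943, 24500),
}

for name, (base, new) in rows.items():
    print(f"{name}: {relative_change(base, new):+.1f}%")

# perf-profile rows are shares of cycles, so the delta is in points.
print(f"do_syscall_64 cycles-pp: {point_delta(83.19, 79.44):+.2f}")
```

For the three rows listed this gives +26.7%, +26.7% and +36.5%, matching the table. Because the printed means are rounded, other rows recomputed this way may differ from the reported %change in the last digit.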


                                                                                
                                stress-ng.sendfile.ops                          
                                                                                
    2e+07 +-----------------------------------------------------------------+   
          |     O OO OO                                                     |   
  1.9e+07 |-+O OOOOOOO OO OOOO                                              |   
          |OOOO  O  O  OOO O                                                |   
          |   O  O  O        O                                              |   
  1.8e+07 |-+                                                               |   
          |                                                                 |   
  1.7e+07 |-+                                                               |   
          |                                                                 |   
  1.6e+07 |-+                                                               |   
          |                                                                 |   
          |                                                                 |   
  1.5e+07 |+++++ + ++++++++++ +++++++++++ ++++++  +++++++ ++++++++++ + +++++|   
          |+   ++++ + +++  +++++ + ++  +++++ ++++++++  +++++ + +   ++++  +  |   
  1.4e+07 +-----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                           stress-ng.sendfile.ops_per_sec                       
                                                                                
  660000 +------------------------------------------------------------------+   
  640000 |-+  OO OOOOO     O                                                |   
         |OOOOOOOO O  OOOOOOO                                               |   
  620000 |-+OO      O                                                       |   
  600000 |-+ O  O           O                                               |   
         |                                                                  |   
  580000 |-+                                                                |   
  560000 |-+                                                                |   
  540000 |-+                                                                |   
         |                                                                  |   
  520000 |-+                                                                |   
  500000 |-+      +            +           +               +           +    |   
         |++++++++++++++++++++++++++++++++++++++  ++++++++++++++++++++++++++|   
  480000 |-+  ++ +  +      ++ + +      +  + + +++++    ++ +  +     ++ + +   |   
  460000 +------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                                                                
                                                                                
  7000 +--------------------------------------------------------------------+   
       |                                                                    |   
  6000 |-+    +            +           +     +          +           +       |   
       |++++++:++++++++++++:+++++++++++::++++:+ + ++++++:+++++++++++::++++++|   
  5000 |-+  ++++ + ++    + ++ + +    + ++ + ++++ +    + ++ + +    + ++ + ++ |   
       |                                                                    |   
  4000 |-+                                                                  |   
       |                                                                    |   
  3000 |-+                                                                  |   
       |                                                                    |   
  2000 |-+                                                                  |   
       |                                                                    |   
  1000 |-+                                                                  |   
       |                                                                    |   
     0 +--------------------------------------------------------------------+   
                                                                                
                                                                                
[+] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Oliver Sang


View attachment "config-5.10.0-g06c0444290ce" of type "text/plain" (171082 bytes)

View attachment "job-script" of type "text/plain" (7928 bytes)

View attachment "job.yaml" of type "text/plain" (5526 bytes)

View attachment "reproduce" of type "text/plain" (390 bytes)
