Date:   Thu, 3 Dec 2020 14:45:36 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     David Howells <dhowells@...hat.com>
Cc:     lkp@...ts.01.org, lkp@...el.com, ying.huang@...el.com,
        feng.tang@...el.com, zhengjun.xing@...el.com,
        Pavel Begunkov <asml.silence@...il.com>,
        Matthew Wilcox <willy@...radead.org>,
        Jens Axboe <axboe@...nel.dk>,
        Alexander Viro <viro@...iv.linux.org.uk>, dhowells@...hat.com,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-fsdevel@...r.kernel.org, linux-block@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: [iov_iter]  9bd0e337c6:  will-it-scale.per_process_ops -4.8%
 regression


Greetings,

FYI, we noticed a -4.8% regression of will-it-scale.per_process_ops due to commit:


commit: 9bd0e337c633aed3e8ec3c7397b7ae0b8436f163 ("[PATCH 01/29] iov_iter: Switch to using a table of operations")
url: https://github.com/0day-ci/linux/commits/David-Howells/RFC-iov_iter-Switch-to-using-an-ops-table/20201121-222344
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 27bba9c532a8d21050b94224ffd310ad0058c353

in testcase: will-it-scale
on test machine: 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 112G memory
with the following parameters:

	nr_task: 50%
	mode: process
	test: pwrite1
	cpufreq_governor: performance
	ucode: 0x42e

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process-based and a thread-based version of each test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
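
For readers unfamiliar with the workload, the following is a minimal single-process sketch of the kind of loop a pwrite-style test exercises: repeatedly overwriting the start of a temporary file and counting completed calls. It is not the will-it-scale source; the function name `pwrite_loop`, the 4096-byte buffer size and the use of Python are all assumptions made for illustration.

```python
import os
import tempfile

BUF_LEN = 4096  # assumed buffer size; this report does not state it


def pwrite_loop(iterations: int, buf_len: int = BUF_LEN) -> int:
    """Overwrite offset 0 of an anonymous temp file `iterations` times.

    Returns the number of completed pwrite() calls, which is the kind of
    per-process operation count a benchmark harness would aggregate.
    """
    buf = bytes(buf_len)  # zero-filled payload
    completed = 0
    # TemporaryFile creates a file that is removed when it is closed.
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        for _ in range(iterations):
            # Always write at offset 0, so the file never grows past buf_len.
            written = os.pwrite(fd, buf, 0)
            if written != buf_len:
                raise OSError("short write")
            completed += 1
    return completed
```

Each call here goes through the kernel write path once, which is why a small per-call cost in that path can show up directly in an ops/sec metric.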



If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/process/50%/debian-10.4-x86_64-20200603.cgz/lkp-ivb-2ep1/pwrite1/will-it-scale/0x42e

commit: 
  27bba9c532 ("Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi")
  9bd0e337c6 ("iov_iter: Switch to using a table of operations")

27bba9c532a8d210 9bd0e337c633aed3e8ec3c7397b 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  28443113            -4.8%   27064036        will-it-scale.24.processes
   1185129            -4.8%    1127667        will-it-scale.per_process_ops
  28443113            -4.8%   27064036        will-it-scale.workload
     13.84            +1.0%      13.98        boot-time.dhcp
      0.00 ±  9%     -13.5%       0.00 ±  3%  sched_debug.cpu.next_balance.stddev
      1251 ±  9%     -17.2%       1035 ± 10%  slabinfo.dmaengine-unmap-16.active_objs
      1251 ±  9%     -17.2%       1035 ± 10%  slabinfo.dmaengine-unmap-16.num_objs
     24623 ±  5%     -18.0%      20184 ± 15%  softirqs.CPU0.RCU
     28877 ± 10%     -30.6%      20051 ± 15%  softirqs.CPU19.RCU
      5693 ± 31%    +402.3%      28595 ± 22%  softirqs.CPU19.SCHED
     21142 ± 15%     -26.5%      15533 ± 11%  softirqs.CPU27.RCU
     20776 ± 38%     -50.5%      10290 ± 58%  softirqs.CPU3.SCHED
     26618 ± 11%     -35.3%      17214 ±  6%  softirqs.CPU37.RCU
     10894 ± 48%    +175.5%      30012 ± 34%  softirqs.CPU37.SCHED
     17015 ±  4%     +39.2%      23681 ±  7%  softirqs.CPU43.RCU
    411.75 ± 58%     +76.8%     728.00 ± 32%  numa-vmstat.node0.nr_active_anon
     34304 ±  2%     -35.6%      22103 ± 48%  numa-vmstat.node0.nr_anon_pages
     36087 ±  2%     -31.0%      24915 ± 43%  numa-vmstat.node0.nr_inactive_anon
      2233 ± 51%     +60.4%       3582 ±  7%  numa-vmstat.node0.nr_shmem
    411.75 ± 58%     +76.8%     728.00 ± 32%  numa-vmstat.node0.nr_zone_active_anon
     36087 ±  2%     -31.0%      24915 ± 43%  numa-vmstat.node0.nr_zone_inactive_anon
     24265 ±  3%     +51.3%      36707 ± 29%  numa-vmstat.node1.nr_anon_pages
     25441 ±  2%     +44.9%      36858 ± 29%  numa-vmstat.node1.nr_inactive_anon
    537.25 ± 20%     +22.8%     659.50 ± 10%  numa-vmstat.node1.nr_page_table_pages
     25441 ±  2%     +44.9%      36858 ± 29%  numa-vmstat.node1.nr_zone_inactive_anon
      1649 ± 58%     +76.7%       2913 ± 32%  numa-meminfo.node0.Active
      1649 ± 58%     +76.7%       2913 ± 32%  numa-meminfo.node0.Active(anon)
    137223 ±  2%     -35.6%      88410 ± 48%  numa-meminfo.node0.AnonPages
    164997 ±  9%     -28.4%     118095 ± 42%  numa-meminfo.node0.AnonPages.max
    144353 ±  2%     -31.0%      99656 ± 43%  numa-meminfo.node0.Inactive
    144353 ±  2%     -31.0%      99656 ± 43%  numa-meminfo.node0.Inactive(anon)
      8937 ± 51%     +60.3%      14328 ±  7%  numa-meminfo.node0.Shmem
     97072 ±  3%     +51.3%     146858 ± 29%  numa-meminfo.node1.AnonPages
    127410 ±  5%     +43.2%     182468 ± 16%  numa-meminfo.node1.AnonPages.max
    101822 ±  2%     +44.9%     147521 ± 29%  numa-meminfo.node1.Inactive
    101822 ±  2%     +44.9%     147521 ± 29%  numa-meminfo.node1.Inactive(anon)
      2148 ± 20%     +22.9%       2639 ± 10%  numa-meminfo.node1.PageTables
      3431 ± 89%     -85.1%     512.25 ±109%  interrupts.38:PCI-MSI.2621444-edge.eth0-TxRx-3
    348.50 ± 62%    +152.7%     880.75 ± 27%  interrupts.40:PCI-MSI.2621446-edge.eth0-TxRx-5
      1697 ± 63%     -53.1%     796.75 ± 13%  interrupts.CPU13.CAL:Function_call_interrupts
     89.75 ± 36%    +220.3%     287.50 ± 20%  interrupts.CPU13.RES:Rescheduling_interrupts
    745.75 ±  3%    +104.6%       1526 ± 69%  interrupts.CPU19.CAL:Function_call_interrupts
    293.00 ±  5%     -60.0%     117.25 ± 47%  interrupts.CPU19.RES:Rescheduling_interrupts
    778.50 ±  9%    +123.7%       1741 ± 64%  interrupts.CPU22.CAL:Function_call_interrupts
      6450 ± 29%     -38.0%       4000 ±  4%  interrupts.CPU24.NMI:Non-maskable_interrupts
      6450 ± 29%     -38.0%       4000 ±  4%  interrupts.CPU24.PMI:Performance_monitoring_interrupts
      2012 ± 56%     -57.6%     852.75 ±  6%  interrupts.CPU26.CAL:Function_call_interrupts
    184.25 ± 37%     -47.9%      96.00 ± 49%  interrupts.CPU27.RES:Rescheduling_interrupts
      0.50 ±100%  +64250.0%     321.75 ±170%  interrupts.CPU28.TLB:TLB_shootdowns
      3431 ± 89%     -85.1%     512.25 ±109%  interrupts.CPU29.38:PCI-MSI.2621444-edge.eth0-TxRx-3
    348.50 ± 62%    +152.7%     880.75 ± 27%  interrupts.CPU31.40:PCI-MSI.2621446-edge.eth0-TxRx-5
    156.50 ± 51%     -51.3%      76.25 ± 59%  interrupts.CPU33.RES:Rescheduling_interrupts
    883.50 ± 18%     -23.8%     673.25 ± 22%  interrupts.CPU36.CAL:Function_call_interrupts
      7492 ± 13%     -45.6%       4073 ± 63%  interrupts.CPU37.NMI:Non-maskable_interrupts
      7492 ± 13%     -45.6%       4073 ± 63%  interrupts.CPU37.PMI:Performance_monitoring_interrupts
    250.50 ± 19%     -52.5%     119.00 ± 50%  interrupts.CPU37.RES:Rescheduling_interrupts
      4688 ± 27%     +63.5%       7667 ± 15%  interrupts.CPU40.NMI:Non-maskable_interrupts
      4688 ± 27%     +63.5%       7667 ± 15%  interrupts.CPU40.PMI:Performance_monitoring_interrupts
     96.75 ± 92%    +135.1%     227.50 ± 22%  interrupts.CPU43.RES:Rescheduling_interrupts
      2932 ± 36%     +73.4%       5084 ± 21%  interrupts.CPU47.NMI:Non-maskable_interrupts
      2932 ± 36%     +73.4%       5084 ± 21%  interrupts.CPU47.PMI:Performance_monitoring_interrupts
     57.50 ± 78%    +250.4%     201.50 ± 42%  interrupts.CPU47.RES:Rescheduling_interrupts
      4207 ± 61%     +86.0%       7827 ± 11%  interrupts.CPU8.NMI:Non-maskable_interrupts
      4207 ± 61%     +86.0%       7827 ± 11%  interrupts.CPU8.PMI:Performance_monitoring_interrupts
 1.089e+10            -2.3%  1.064e+10        perf-stat.i.branch-instructions
      1.62            +0.7        2.34        perf-stat.i.branch-miss-rate%
 1.741e+08           +42.3%  2.476e+08        perf-stat.i.branch-misses
      1.36            +3.3%       1.41        perf-stat.i.cpi
 1.233e+08 ±  3%      -7.1%  1.146e+08        perf-stat.i.dTLB-load-misses
  2.38e+10            -3.3%  2.302e+10        perf-stat.i.dTLB-loads
  57501510            -4.9%   54711717        perf-stat.i.dTLB-store-misses
 1.828e+10            -3.7%  1.761e+10        perf-stat.i.dTLB-stores
     98.97            -2.9       96.02 ±  2%  perf-stat.i.iTLB-load-miss-rate%
  29795797 ±  4%      -5.0%   28320171        perf-stat.i.iTLB-load-misses
    299268 ±  2%    +298.1%    1191476 ± 50%  perf-stat.i.iTLB-loads
 5.335e+10            -3.7%  5.138e+10        perf-stat.i.instructions
      0.74            -3.7%       0.71        perf-stat.i.ipc
      0.20 ±  8%     +12.1%       0.23        perf-stat.i.major-faults
      1104            -3.2%       1069        perf-stat.i.metric.M/sec
     72308            +2.3%      73975 ±  2%  perf-stat.i.node-stores
      0.10            +7.9%       0.11 ±  8%  perf-stat.overall.MPKI
      1.60            +0.7        2.33        perf-stat.overall.branch-miss-rate%
      1.35            +4.1%       1.41        perf-stat.overall.cpi
     99.00            -3.0       95.98 ±  2%  perf-stat.overall.iTLB-load-miss-rate%
      0.74            -3.9%       0.71        perf-stat.overall.ipc
 1.085e+10            -2.3%   1.06e+10        perf-stat.ps.branch-instructions
 1.735e+08           +42.3%  2.468e+08        perf-stat.ps.branch-misses
 1.229e+08 ±  3%      -7.1%  1.142e+08        perf-stat.ps.dTLB-load-misses
 2.372e+10            -3.3%  2.294e+10        perf-stat.ps.dTLB-loads
  57306258            -4.9%   54525679        perf-stat.ps.dTLB-store-misses
 1.822e+10            -3.7%  1.755e+10        perf-stat.ps.dTLB-stores
  29695158 ±  4%      -5.0%   28224049        perf-stat.ps.iTLB-load-misses
    298257 ±  2%    +298.1%    1187498 ± 50%  perf-stat.ps.iTLB-loads
 5.317e+10            -3.7%   5.12e+10        perf-stat.ps.instructions
      0.20 ±  7%     +12.0%       0.23 ±  2%  perf-stat.ps.major-faults
 1.613e+13            -3.9%   1.55e+13        perf-stat.total.instructions
      8.00 ± 14%      -8.0        0.00        perf-profile.calltrace.cycles-pp.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.new_sync_write
      7.38 ± 14%      -7.4        0.00        perf-profile.calltrace.cycles-pp.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
      7.27 ± 14%      -7.3        0.00        perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter
      0.69 ± 14%      -0.4        0.29 ±100%  perf-profile.calltrace.cycles-pp.up_write.generic_file_write_iter.new_sync_write.vfs_write.ksys_pwrite64
      0.62 ± 15%      -0.3        0.30 ±101%  perf-profile.calltrace.cycles-pp.unlock_page.shmem_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
      0.85 ±  8%      -0.2        0.66 ± 15%  perf-profile.calltrace.cycles-pp.__fget_light.ksys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
      0.91 ± 11%      -0.1        0.79 ± 12%  perf-profile.calltrace.cycles-pp.file_update_time.__generic_file_write_iter.generic_file_write_iter.new_sync_write.vfs_write
      0.00            +1.0        1.01 ± 13%  perf-profile.calltrace.cycles-pp.__get_user_nocheck_1.xxx_fault_in_readable.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
      0.00            +1.4        1.42 ± 12%  perf-profile.calltrace.cycles-pp.xxx_advance.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.new_sync_write
      0.00            +2.1        2.15 ± 13%  perf-profile.calltrace.cycles-pp.xxx_fault_in_readable.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.new_sync_write
      0.00            +6.8        6.82 ± 13%  perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin.xxx_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter
      0.00            +6.9        6.92 ± 13%  perf-profile.calltrace.cycles-pp.copyin.xxx_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
      0.00            +8.1        8.09 ± 14%  perf-profile.calltrace.cycles-pp.xxx_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.new_sync_write
      8.03 ± 14%      -8.0        0.00        perf-profile.children.cycles-pp.iov_iter_copy_from_user_atomic
      0.85 ±  8%      -0.2        0.66 ± 15%  perf-profile.children.cycles-pp.__fget_light
      0.69 ± 14%      -0.2        0.52 ± 15%  perf-profile.children.cycles-pp.up_write
      0.62 ± 13%      -0.2        0.46 ± 14%  perf-profile.children.cycles-pp.apparmor_file_permission
      0.94 ± 11%      -0.1        0.82 ± 13%  perf-profile.children.cycles-pp.file_update_time
      0.51 ± 12%      -0.1        0.40 ± 14%  perf-profile.children.cycles-pp.balance_dirty_pages_ratelimited
      0.55 ± 12%      -0.1        0.47 ± 12%  perf-profile.children.cycles-pp.current_time
      0.62 ± 14%      -0.1        0.55 ± 13%  perf-profile.children.cycles-pp.unlock_page
      0.24 ± 13%      -0.0        0.20 ± 16%  perf-profile.children.cycles-pp.timestamp_truncate
      0.18 ± 11%      -0.0        0.14 ± 15%  perf-profile.children.cycles-pp.file_remove_privs
      0.55 ± 14%      +0.3        0.87 ± 15%  perf-profile.children.cycles-pp.__x86_retpoline_rax
      0.00            +1.4        1.42 ± 12%  perf-profile.children.cycles-pp.xxx_advance
      0.00            +2.2        2.22 ± 13%  perf-profile.children.cycles-pp.xxx_fault_in_readable
      0.00            +8.1        8.12 ± 14%  perf-profile.children.cycles-pp.xxx_copy_from_user_atomic
      1.02 ± 16%      -0.2        0.82 ± 12%  perf-profile.self.cycles-pp.shmem_getpage_gfp
      0.82 ±  8%      -0.2        0.63 ± 15%  perf-profile.self.cycles-pp.__fget_light
      0.66 ± 14%      -0.2        0.49 ± 15%  perf-profile.self.cycles-pp.up_write
      0.54 ± 15%      -0.2        0.39 ± 14%  perf-profile.self.cycles-pp.apparmor_file_permission
      0.59 ± 13%      -0.1        0.46 ± 13%  perf-profile.self.cycles-pp.ksys_pwrite64
      0.50 ± 12%      -0.1        0.40 ± 13%  perf-profile.self.cycles-pp.balance_dirty_pages_ratelimited
      0.24 ± 15%      -0.0        0.19 ± 15%  perf-profile.self.cycles-pp.timestamp_truncate
      0.20 ± 13%      -0.0        0.17 ± 12%  perf-profile.self.cycles-pp.current_time
      0.12 ± 14%      +0.1        0.19 ± 14%  perf-profile.self.cycles-pp.__x86_indirect_thunk_rax
      0.43 ± 14%      +0.3        0.68 ± 15%  perf-profile.self.cycles-pp.__x86_retpoline_rax
      0.00            +1.1        1.14 ± 15%  perf-profile.self.cycles-pp.xxx_copy_from_user_atomic
      0.00            +1.2        1.21 ± 12%  perf-profile.self.cycles-pp.xxx_fault_in_readable
      0.00            +1.3        1.28 ± 12%  perf-profile.self.cycles-pp.xxx_advance
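
As a sanity check on the comparison table above, the headline figures can be recomputed from the reported values. The sketch below is illustrative only: the helper name `pct_change` is an assumption, the inputs are copied from the table (so they carry its rounding), and the divisor of 24 is taken from the metric name will-it-scale.24.processes.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


# (base commit 27bba9c532, tested commit 9bd0e337c6), copied from the table
workload = (28443113, 27064036)        # will-it-scale.workload
per_process_ops = (1185129, 1127667)   # will-it-scale.per_process_ops
branch_instr = (1.085e10, 1.06e10)     # perf-stat.ps.branch-instructions
branch_misses = (1.735e8, 2.468e8)     # perf-stat.ps.branch-misses
cpi = (1.35, 1.41)                     # perf-stat.overall.cpi

workload_change = pct_change(*workload)
per_process_change = pct_change(*per_process_ops)

# Branch miss rate in percent: misses divided by branch instructions.
miss_rate_base = branch_misses[0] / branch_instr[0] * 100.0
miss_rate_new = branch_misses[1] / branch_instr[1] * 100.0

# IPC is the reciprocal of CPI.
ipc_base = 1.0 / cpi[0]
ipc_new = 1.0 / cpi[1]

# Total workload spread across the 24 processes of this run.
ops_per_process_base = workload[0] / 24
ops_per_process_new = workload[1] / 24
```

With these inputs both throughput metrics come out at about -4.8%, and the derived miss rates and IPC agree with the perf-stat.overall rows to the precision shown in the table.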


                                                                                
                              will-it-scale.24.processes                        
                                                                                
  2.88e+07 +----------------------------------------------------------------+   
  2.86e+07 |-+                       +.+.+..+.                              |   
           |             +.         +         +. .+.                        |   
  2.84e+07 |.+.+.+.+.   +  +.+.+.+.+            +   +                       |   
  2.82e+07 |-+       +.+                                                    |   
           |                                                                |   
   2.8e+07 |-+                                                              |   
  2.78e+07 |-+                                                              |   
  2.76e+07 |-+                                                              |   
           |                                                                |   
  2.74e+07 |-+                                                              |   
  2.72e+07 |-O O O   O O O O O O O O O O                  O O   O           |   
           |       O                     O    O O O   O O     O   O O O O O |   
   2.7e+07 |-+                              O       O                       |   
  2.68e+07 +----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                             will-it-scale.per_process_ops                      
                                                                                
   1.2e+06 +----------------------------------------------------------------+   
           |                         +.+.+..+.                              |   
  1.19e+06 |-+           +.         +         +. .+.                        |   
  1.18e+06 |.+.+.+.+    +  +.+.+.+.+            +   +                       |   
           |        + .+                                                    |   
  1.17e+06 |-+       +                                                      |   
           |                                                                |   
  1.16e+06 |-+                                                              |   
           |                                                                |   
  1.15e+06 |-+                                                              |   
  1.14e+06 |-+                                                              |   
           |   O       O             O                    O                 |   
  1.13e+06 |-O   O O O   O O O O O O   O O    O O O   O O   O O O O O O     |   
           |                                O       O                   O O |   
  1.12e+06 +----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                will-it-scale.workload                          
                                                                                
  2.88e+07 +----------------------------------------------------------------+   
  2.86e+07 |-+                       +.+.+..+.                              |   
           |             +.         +         +. .+.                        |   
  2.84e+07 |.+.+.+.+.   +  +.+.+.+.+            +   +                       |   
  2.82e+07 |-+       +.+                                                    |   
           |                                                                |   
   2.8e+07 |-+                                                              |   
  2.78e+07 |-+                                                              |   
  2.76e+07 |-+                                                              |   
           |                                                                |   
  2.74e+07 |-+                                                              |   
  2.72e+07 |-O O O   O O O O O O O O O O                  O O   O           |   
           |       O                     O    O O O   O O     O   O O O O O |   
   2.7e+07 |-+                              O       O                       |   
  2.68e+07 +----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Oliver Sang


View attachment "config-5.10.0-rc4-00369-g9bd0e337c633" of type "text/plain" (170131 bytes)

View attachment "job-script" of type "text/plain" (8009 bytes)

View attachment "job.yaml" of type "text/plain" (5479 bytes)

View attachment "reproduce" of type "text/plain" (339 bytes)
