lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Date:   Tue, 3 Aug 2021 10:19:35 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Ard Biesheuvel <ardb@...nel.org>
Cc:     0day robot <lkp@...el.com>, Dave Hansen <dave.hansen@...el.com>,
        syzbot <syzbot+5d1bad8042a8f0e8117a@...kaller.appspotmail.com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        ying.huang@...el.com, feng.tang@...el.com,
        zhengjun.xing@...ux.intel.com, linux-crypto@...r.kernel.org,
        herbert@...dor.apana.org.au, Ard Biesheuvel <ardb@...nel.org>
Subject: [crypto]  ed610fb786:  will-it-scale.per_thread_ops 4.6% improvement



Greeting,

FYI, we noticed a 4.6% improvement of will-it-scale.per_thread_ops due to commit:


commit: ed610fb7867eff838da286444016ae58fa4f86b2 ("[PATCH] crypto: x86/aes-ni - add missing error checks in XTS code")
url: https://github.com/0day-ci/linux/commits/Ard-Biesheuvel/crypto-x86-aes-ni-add-missing-error-checks-in-XTS-code/20210718-100125
base: https://git.kernel.org/cgit/linux/kernel/git/herbert/cryptodev-2.6.git master

in testcase: will-it-scale
on test machine: 88 threads 2 sockets Intel(R) Xeon(R) Gold 6238M CPU @ 2.10GHz with 128G memory
with following parameters:

	nr_task: 100%
	mode: thread
	test: getppid1
	cpufreq_governor: performance
	ucode: 0x5003006

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale





Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install                job.yaml  # job file is attached in this email
        bin/lkp split-job --compatible job.yaml  # generate the yaml file for lkp run
        bin/lkp run                    generated-yaml-file

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/thread/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2sp9/getppid1/will-it-scale/0x5003006

commit: 
  66192b2e3f ("crypto: hisilicon/sec - fix the process of disabling sva prefetching")
  ed610fb786 ("crypto: x86/aes-ni - add missing error checks in XTS code")

66192b2e3fd8ab97 ed610fb7867eff838da28644401 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  7.27e+08            +4.6%  7.601e+08        will-it-scale.88.threads
   8261617            +4.6%    8637651        will-it-scale.per_thread_ops
  7.27e+08            +4.6%  7.601e+08        will-it-scale.workload
      2267 ± 19%     -19.8%       1818 ± 16%  interrupts.CPU33.CAL:Function_call_interrupts
    425.50 ± 26%     -21.0%     336.29 ±  9%  interrupts.CPU37.RES:Rescheduling_interrupts
    622.50 ± 53%     -43.9%     349.43 ± 20%  interrupts.CPU63.RES:Rescheduling_interrupts
      6505 ± 28%     -14.7%       5546 ± 34%  interrupts.CPU75.NMI:Non-maskable_interrupts
      6505 ± 28%     -14.7%       5546 ± 34%  interrupts.CPU75.PMI:Performance_monitoring_interrupts
    829.33 ± 54%     -56.0%     365.29 ± 27%  interrupts.CPU77.RES:Rescheduling_interrupts
      2423 ±  3%     -21.5%       1903 ± 42%  perf-sched.wait_and_delay.avg.ms.do_syslog.part.0.kmsg_read.vfs_read
      7265 ±  3%     -21.5%       5706 ± 42%  perf-sched.wait_and_delay.max.ms.do_syslog.part.0.kmsg_read.vfs_read
     23.15 ± 26%   +4566.5%       1080 ±237%  perf-sched.wait_and_delay.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
      6202 ±  8%     -17.1%       5139 ± 10%  perf-sched.wait_and_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork
      2421 ±  3%     -21.4%       1903 ± 42%  perf-sched.wait_time.avg.ms.do_syslog.part.0.kmsg_read.vfs_read
      0.16 ± 11%     -17.4%       0.13 ± 19%  perf-sched.wait_time.avg.ms.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt.[unknown]
      7261 ±  3%     -21.4%       5705 ± 42%  perf-sched.wait_time.max.ms.do_syslog.part.0.kmsg_read.vfs_read
     22.98 ± 27%   +4602.1%       1080 ±237%  perf-sched.wait_time.max.ms.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe
      6201 ±  8%     -17.1%       5139 ± 10%  perf-sched.wait_time.max.ms.smpboot_thread_fn.kthread.ret_from_fork
 3.779e+10            +4.6%  3.954e+10        perf-stat.i.branch-instructions
      0.49            -0.0        0.47        perf-stat.i.branch-miss-rate%
    434873 ±  2%     +10.3%     479520 ±  4%  perf-stat.i.cache-misses
      1.38            -4.5%       1.31        perf-stat.i.cpi
    741687 ±  2%     -12.1%     652145 ±  6%  perf-stat.i.cycles-between-cache-misses
 6.022e+10            +4.6%  6.299e+10        perf-stat.i.dTLB-loads
      0.00            -0.0        0.00        perf-stat.i.dTLB-store-miss-rate%
    149137            -1.6%     146697        perf-stat.i.dTLB-store-misses
  4.21e+10            +4.5%  4.401e+10        perf-stat.i.dTLB-stores
 2.354e+08 ±  2%      -4.9%  2.238e+08        perf-stat.i.iTLB-load-misses
 1.759e+11            +4.6%  1.841e+11        perf-stat.i.instructions
    755.18 ±  2%      +9.5%     826.97        perf-stat.i.instructions-per-iTLB-miss
      0.73            +4.7%       0.76        perf-stat.i.ipc
      1592            +4.6%       1665        perf-stat.i.metric.M/sec
      0.49            -0.0        0.47        perf-stat.overall.branch-miss-rate%
      1.38            -4.5%       1.32        perf-stat.overall.cpi
    549393 ±  2%      -9.0%     500080 ±  4%  perf-stat.overall.cycles-between-cache-misses
      0.00            -0.0        0.00        perf-stat.overall.dTLB-store-miss-rate%
    748.01 ±  2%     +10.0%     822.71        perf-stat.overall.instructions-per-iTLB-miss
      0.73            +4.7%       0.76        perf-stat.overall.ipc
 3.767e+10            +4.6%  3.941e+10        perf-stat.ps.branch-instructions
    439954 ±  2%      +9.9%     483612 ±  4%  perf-stat.ps.cache-misses
 6.002e+10            +4.6%  6.278e+10        perf-stat.ps.dTLB-loads
    148856            -1.6%     146461        perf-stat.ps.dTLB-store-misses
 4.196e+10            +4.5%  4.386e+10        perf-stat.ps.dTLB-stores
 2.346e+08 ±  2%      -4.9%  2.231e+08        perf-stat.ps.iTLB-load-misses
 1.754e+11            +4.6%  1.835e+11        perf-stat.ps.instructions
 5.289e+13            +4.6%  5.534e+13        perf-stat.total.instructions
     13.13            -2.6       10.51 ±  3%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
     32.70            -2.4       30.28        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
     40.62            -2.4       38.22        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.getppid
      9.17            -1.6        7.55 ±  4%  perf-profile.calltrace.cycles-pp.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
      2.42 ±  2%      -0.9        1.50 ±  3%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
     94.55            -0.3       94.26        perf-profile.calltrace.cycles-pp.getppid
      3.00            +0.1        3.12        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_safe_stack.getppid
      2.98            +0.2        3.20        perf-profile.calltrace.cycles-pp.syscall_enter_from_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.getppid
      8.91            +0.5        9.37        perf-profile.calltrace.cycles-pp.testcase
     42.46            +2.1       44.56        perf-profile.calltrace.cycles-pp.__entry_text_start.getppid
     14.20            -2.6       11.64 ±  3%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode
     33.60            -2.4       31.19        perf-profile.children.cycles-pp.do_syscall_64
     41.30            -2.3       38.96        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      9.50            -1.6        7.92 ±  4%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
      2.60 ±  2%      -0.8        1.81 ±  2%  perf-profile.children.cycles-pp.syscall_exit_to_user_mode_prepare
      1.21            -0.1        1.07        perf-profile.children.cycles-pp.rcu_read_unlock_strict
      3.20            +0.1        3.34        perf-profile.children.cycles-pp.entry_SYSCALL_64_safe_stack
      3.06            +0.2        3.28        perf-profile.children.cycles-pp.syscall_enter_from_user_mode
      5.28            +0.2        5.52        perf-profile.children.cycles-pp.testcase
     22.43            +0.7       23.10        perf-profile.children.cycles-pp.syscall_return_via_sysret
     27.43            +1.3       28.77        perf-profile.children.cycles-pp.__entry_text_start
      8.32            -1.5        6.81 ±  4%  perf-profile.self.cycles-pp.exit_to_user_mode_prepare
      2.24 ±  3%      -0.8        1.43 ±  2%  perf-profile.self.cycles-pp.syscall_exit_to_user_mode_prepare
      3.90            -0.2        3.69        perf-profile.self.cycles-pp.__x64_sys_getppid
      0.87            -0.1        0.73        perf-profile.self.cycles-pp.rcu_read_unlock_strict
      2.43            -0.1        2.31        perf-profile.self.cycles-pp.syscall_exit_to_user_mode
      0.81 ±  3%      -0.1        0.72 ±  3%  perf-profile.self.cycles-pp.rcu_nocb_flush_deferred_wakeup
      7.82            +0.1        7.90        perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      3.17            +0.1        3.31        perf-profile.self.cycles-pp.entry_SYSCALL_64_safe_stack
      2.36            +0.2        2.55        perf-profile.self.cycles-pp.syscall_enter_from_user_mode
     12.36            +0.6       12.96        perf-profile.self.cycles-pp.__entry_text_start
     22.14            +0.6       22.77        perf-profile.self.cycles-pp.syscall_return_via_sysret
     19.17            +1.0       20.14        perf-profile.self.cycles-pp.getppid


                                                                                
                             will-it-scale.per_thread_ops                       
                                                                                
  8.8e+06 +-----------------------------------------------------------------+   
          |     O O OO O                                                    |   
  8.7e+06 |O+O O   O                                                        |   
          |                 O        O                                      |   
          |             OO O OO OOOO  OOO                                   |   
  8.6e+06 |-+                                                               |   
          |   O                                                             |   
  8.5e+06 |-+                                                               |   
          |+.++                    +.++++.+                                 |   
  8.4e+06 |-+  +      .++    ++.   :      :                                 |   
          |     +.++++   +.++   +++       :                                 |   
          |                                :                                |   
  8.3e+06 |-+                              :   +  .+ +    +.           +    |   
          |                                + .+ :+  + +.++  ++++.++++.+ ++.+|   
  8.2e+06 +-----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang


View attachment "config-5.13.0-rc1-00150-ged610fb7867e" of type "text/plain" (173972 bytes)

View attachment "job-script" of type "text/plain" (8115 bytes)

View attachment "job.yaml" of type "text/plain" (5352 bytes)

View attachment "reproduce" of type "text/plain" (339 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ