Message-ID: <20210519142742.GA5275@xsang-OptiPlex-9020>
Date:   Wed, 19 May 2021 22:27:42 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Nadav Amit <namit@...are.com>
Cc:     Ingo Molnar <mingo@...nel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
        zhengjun.xing@...el.com
Subject: [smp]  a32a4d8a81:  netperf.Throughput_tps -2.1% regression



Greetings,

FYI, we noticed a -2.1% regression of netperf.Throughput_tps due to commit:


commit: a32a4d8a815c4eb6dc64b8962dc13a9dfae70868 ("smp: Run functions concurrently in smp_call_function_many_cond()")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: netperf
on test machine: 192 threads 4 sockets Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with the following parameters:

	ip: ipv4
	runtime: 300s
	nr_threads: 1
	cluster: cs-localhost
	test: UDP_RR
	cpufreq_governor: performance
	ucode: 0x5003006

test-description: Netperf is a benchmark that can be used to measure various aspects of networking performance.
test-url: http://www.netperf.org/netperf/



If you fix the issue, kindly add the following tag
Reported-by: kernel test robot <oliver.sang@...el.com>


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install                job.yaml  # job file is attached in this email
        bin/lkp split-job --compatible job.yaml  # generate the yaml file for lkp run
        bin/lkp run                    generated-yaml-file

=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase/ucode:
  cs-localhost/gcc-9/performance/ipv4/x86_64-rhel-8.3/1/debian-10.4-x86_64-20200603.cgz/300s/lkp-csl-2ap3/UDP_RR/netperf/0x5003006

commit: 
  v5.12-rc2
  a32a4d8a81 ("smp: Run functions concurrently in smp_call_function_many_cond()")

       v5.12-rc2 a32a4d8a815c4eb6dc64b8962dc 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    116903            -2.1%     114404        netperf.Throughput_total_tps
    116903            -2.1%     114404        netperf.Throughput_tps
  35066769            -2.1%   34317990        netperf.time.voluntary_context_switches
  35071059            -2.1%   34321258        netperf.workload
     67295            +1.5%      68333        proc-vmstat.nr_anon_pages
    463520            -2.1%     453603        vmstat.system.cs
    535.28 ±  6%      -8.3%     490.97 ± 10%  sched_debug.cfs_rq:/.util_est_enqueued.max
      0.02 ±  8%     -10.8%       0.02 ±  4%  sched_debug.cpu.nr_running.avg
  76309820 ±  4%    +320.0%  3.205e+08 ±158%  cpuidle.C1.time
  23409116 ±  3%     +31.0%   30676822 ± 20%  cpuidle.C1.usage
  46720133 ±  2%     -12.9%   40709940 ±  2%  cpuidle.POLL.usage
      5282 ±110%    +317.0%      22029 ± 58%  numa-vmstat.node3.nr_anon_pages
     11998 ± 55%    +138.7%      28637 ± 45%  numa-vmstat.node3.nr_inactive_anon
     11998 ± 55%    +138.7%      28637 ± 45%  numa-vmstat.node3.nr_zone_inactive_anon
      8397 ±136%    +588.7%      57827 ± 75%  numa-meminfo.node3.AnonHugePages
     21162 ±110%    +316.7%      88189 ± 58%  numa-meminfo.node3.AnonPages
     48780 ± 54%    +136.8%     115533 ± 45%  numa-meminfo.node3.Inactive
     48780 ± 54%    +136.8%     115533 ± 45%  numa-meminfo.node3.Inactive(anon)
    467040            -2.1%     457094        perf-stat.i.context-switches
      0.01 ±138%      +0.0        0.03 ± 73%  perf-stat.i.dTLB-store-miss-rate%
 9.415e+08            -2.4%  9.188e+08 ±  2%  perf-stat.i.dTLB-stores
      0.01 ±137%      +0.0        0.03 ± 73%  perf-stat.overall.dTLB-store-miss-rate%
    465472            -2.1%     455557        perf-stat.ps.context-switches
 9.385e+08            -2.4%  9.158e+08 ±  2%  perf-stat.ps.dTLB-stores
      1.21 ± 14%      +0.2        1.41 ±  5%  perf-profile.calltrace.cycles-pp.__ip_append_data.ip_make_skb.udp_sendmsg.sock_sendmsg.__sys_sendto
      2.05 ± 10%      +0.3        2.33 ±  4%  perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      0.06 ±  7%      +0.0        0.08 ± 14%  perf-profile.children.cycles-pp.__calc_delta
      0.08 ± 19%      +0.0        0.10 ±  9%  perf-profile.children.cycles-pp._copy_to_user
      0.09 ± 22%      +0.0        0.12 ±  8%  perf-profile.children.cycles-pp._copy_from_user
      0.12 ± 20%      +0.0        0.17 ± 13%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
      0.14 ± 11%      +0.1        0.19 ±  9%  perf-profile.children.cycles-pp.skb_release_data
      1.21 ± 14%      +0.2        1.41 ±  5%  perf-profile.children.cycles-pp.__ip_append_data
      2.07 ± 11%      +0.3        2.33 ±  4%  perf-profile.children.cycles-pp.schedule_idle
      0.06 ±  7%      +0.0        0.08 ± 11%  perf-profile.self.cycles-pp.__calc_delta
      0.19 ±  8%      +0.0        0.24 ±  6%  perf-profile.self.cycles-pp.__softirqentry_text_start
      0.24 ±  8%      +0.1        0.29 ±  4%  perf-profile.self.cycles-pp.__skb_recv_udp
      0.14 ± 11%      +0.1        0.19 ±  9%  perf-profile.self.cycles-pp.skb_release_data
      0.02 ±142%      +0.1        0.08 ± 17%  perf-profile.self.cycles-pp.sock_alloc_send_pskb
      0.11 ± 17%      +0.1        0.19 ± 13%  perf-profile.self.cycles-pp.__ip_append_data
      0.12 ± 34%      +0.1        0.26 ± 22%  perf-profile.self.cycles-pp.perf_mux_hrtimer_handler
      0.87 ± 13%      +0.2        1.05 ±  6%  perf-profile.self.cycles-pp._raw_spin_lock
      1287 ± 42%     +75.3%       2256 ± 14%  interrupts.CPU111.CAL:Function_call_interrupts
      1326 ± 43%     +71.0%       2267 ± 13%  interrupts.CPU119.CAL:Function_call_interrupts
      1300 ± 45%     +75.9%       2287 ± 37%  interrupts.CPU120.CAL:Function_call_interrupts
      1299 ± 45%     +60.1%       2081 ± 28%  interrupts.CPU128.CAL:Function_call_interrupts
      1305 ± 45%     +61.7%       2110 ± 29%  interrupts.CPU131.CAL:Function_call_interrupts
      1299 ± 45%     +61.8%       2102 ± 28%  interrupts.CPU139.CAL:Function_call_interrupts
     66.67 ±133%     -97.2%       1.83 ±155%  interrupts.CPU14.TLB:TLB_shootdowns
      1299 ± 45%    +107.8%       2700 ± 33%  interrupts.CPU142.CAL:Function_call_interrupts
    301.83 ±128%     -95.6%      13.17 ±140%  interrupts.CPU149.RES:Rescheduling_interrupts
    389.17 ± 89%     -73.5%     103.17 ± 35%  interrupts.CPU164.NMI:Non-maskable_interrupts
    389.17 ± 89%     -73.5%     103.17 ± 35%  interrupts.CPU164.PMI:Performance_monitoring_interrupts
      1299 ± 45%     +60.2%       2081 ± 28%  interrupts.CPU35.CAL:Function_call_interrupts
      1244 ± 50%     +66.8%       2076 ± 27%  interrupts.CPU45.CAL:Function_call_interrupts
      1300 ± 44%     +59.5%       2075 ± 28%  interrupts.CPU46.CAL:Function_call_interrupts
      1.50 ± 63%   +1422.2%      22.83 ±167%  interrupts.CPU47.RES:Rescheduling_interrupts
    467.33 ± 85%     -64.6%     165.67 ± 74%  interrupts.CPU58.NMI:Non-maskable_interrupts
    467.33 ± 85%     -64.6%     165.67 ± 74%  interrupts.CPU58.PMI:Performance_monitoring_interrupts
    306.67 ± 75%     -59.9%     122.83 ± 16%  interrupts.CPU68.NMI:Non-maskable_interrupts
    306.67 ± 75%     -59.9%     122.83 ± 16%  interrupts.CPU68.PMI:Performance_monitoring_interrupts
      1131 ± 27%     +61.2%       1822 ± 35%  interrupts.CPU85.CAL:Function_call_interrupts
      1180 ± 31%     +79.6%       2119 ± 24%  interrupts.CPU86.CAL:Function_call_interrupts
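
For readers checking the table by hand: the %change column for count-like
metrics appears to be the relative difference between the two commits. Below
is a minimal sketch that recomputes a few rows from the values above. The
helper name pct_change is hypothetical and not part of lkp-tests. Rows whose
metric is already a percentage (the perf-profile.* and *-rate% lines) look
like plain differences instead, for example 1.21 to 1.41 is shown as +0.2.

```python
def pct_change(base: float, new: float) -> float:
    """Relative change of `new` versus `base`, in percent (hypothetical helper)."""
    return (new - base) / base * 100.0


# (v5.12-rc2 value, a32a4d8a81 value), copied from the comparison table.
rows = {
    "netperf.Throughput_tps": (116903, 114404),
    "vmstat.system.cs": (463520, 453603),
    "cpuidle.POLL.usage": (46720133, 40709940),
}

for name, (base, new) in rows.items():
    # One signed decimal place, matching the report's layout.
    print(f"{pct_change(base, new):+.1f}%  {name}")
```

This only reproduces the arithmetic; it makes no assumption about how the
±%stddev columns are derived.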


                                                                                
                               netperf.Throughput_tps                           
                                                                                
  121000 +------------------------------------------------------------------+   
  120000 |-+      :+                                                        |   
         |       :  +                                                       |   
  119000 |-+     :   +      +                                               |   
  118000 |-+     :   :      :+       +     +    +    +                      |   
         |.+     :    :    :  +     + +   ::   + +  :                       |   
  117000 |-++ +.:     :  +.+   +   +   +. : :.+   + :                       |   
  116000 |-+ +  +      :+       +.+      +  +      +                        |   
  115000 |-+  O        +                                         O      O   |   
         | O O    O                  O        O O  O O        O O    O O    |   
  114000 |-+         O        O    O   O   O                       O      O |   
  113000 |-+             O      O O               O    O                    |   
         |          O  O   O             O  O                               |   
  112000 |-+                O                               O               |   
  111000 +------------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang


View attachment "config-5.12.0-rc2-00001-ga32a4d8a815c" of type "text/plain" (172872 bytes)

View attachment "job-script" of type "text/plain" (8036 bytes)

View attachment "job.yaml" of type "text/plain" (5441 bytes)

View attachment "reproduce" of type "text/plain" (325 bytes)
