Message-ID: <20201122150415.GJ2390@xsang-OptiPlex-9020>
Date:   Sun, 22 Nov 2020 23:04:15 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Mel Gorman <mgorman@...hsingularity.net>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Valentin Schneider <valentin.schneider@....com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        0day robot <lkp@...el.com>, lkp@...ts.01.org,
        ying.huang@...el.com, feng.tang@...el.com, zhengjun.xing@...el.com,
        aubrey.li@...ux.intel.com, yu.c.chen@...el.com
Subject: [sched/numa]  e7f28850ea:  unixbench.score 1.5% improvement


Greetings,

FYI, we noticed a 1.5% improvement in unixbench.score due to commit:


commit: e7f28850eadc14c0976f7872f2ddfef7a0a1d9f4 ("[PATCH 3/3] sched/numa: Limit the amount of imbalance that can exist at fork time")
url: https://github.com/0day-ci/linux/commits/Mel-Gorman/Revisit-NUMA-imbalance-tolerance-and-fork-balancing/20201117-214609
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git dc824eb898534cd8e34582874dae3bb7cf2fa008

in testcase: unixbench
on test machine: 96 threads Intel(R) Xeon(R) CPU @ 2.30GHz with 128G memory
with the following parameters:

	runtime: 300s
	nr_task: 30%
	test: pipe
	cpufreq_governor: performance
	ucode: 0x4003003

test-description: UnixBench is the original BYTE UNIX benchmark suite; it aims to test the performance of Unix-like systems.
test-url: https://github.com/kdlucas/byte-unixbench





Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached to this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/nr_task/rootfs/runtime/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/30%/debian-10.4-x86_64-20200603.cgz/300s/lkp-csl-2sp4/pipe/unixbench/0x4003003

commit: 
  b619be42c0 ("sched/numa: Allow a floating imbalance between NUMA nodes")
  e7f28850ea ("sched/numa: Limit the amount of imbalance that can exist at fork time")

b619be42c0eab221 e7f28850eadc14c0976f7872f2d 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
     41663            +1.5%      42289        unixbench.score
     17225            -2.7%      16754        unixbench.time.involuntary_context_switches
 2.025e+10            +1.4%  2.054e+10        unixbench.workload
      0.30 ±101%      +0.3        0.65 ±  8%  perf-profile.calltrace.cycles-pp.__entry_text_start.read
      0.38 ± 13%      +0.1        0.46 ± 10%  perf-profile.self.cycles-pp.write
      7969 ±  6%     +13.7%       9064 ±  7%  numa-vmstat.node0.nr_kernel_stack
      9007 ±  5%     -11.6%       7966 ±  8%  numa-vmstat.node1.nr_kernel_stack
     11370 ± 14%     -18.2%       9295 ± 12%  numa-vmstat.node1.nr_slab_reclaimable
      7971 ±  6%     +13.7%       9063 ±  7%  numa-meminfo.node0.KernelStack
     45482 ± 14%     -18.2%      37186 ± 12%  numa-meminfo.node1.KReclaimable
      9000 ±  5%     -11.4%       7974 ±  8%  numa-meminfo.node1.KernelStack
     45482 ± 14%     -18.2%      37186 ± 12%  numa-meminfo.node1.SReclaimable
     27.83 ±  5%     -20.4%      22.16 ± 12%  sched_debug.cfs_rq:/.load_avg.avg
    260.86 ±  7%     -13.0%     226.86 ± 12%  sched_debug.cfs_rq:/.load_avg.max
     64.01 ± 30%     -77.2%      14.62 ±173%  sched_debug.cfs_rq:/.removed.runnable_avg.max
     64.01 ± 30%     -77.2%      14.62 ±173%  sched_debug.cfs_rq:/.removed.util_avg.max
    140029 ±  6%     -15.5%     118380 ±  9%  sched_debug.cpu.avg_idle.stddev
      6.87 ±117%     -70.0%       2.06 ±  5%  perf-stat.i.MPKI
 2.058e+10            +1.6%  2.092e+10        perf-stat.i.branch-instructions
 1.776e+08            +1.5%  1.802e+08        perf-stat.i.branch-misses
 3.084e+10            +1.6%  3.134e+10        perf-stat.i.dTLB-loads
 1.937e+10            +1.6%  1.968e+10        perf-stat.i.dTLB-stores
 1.831e+08            +2.0%  1.868e+08        perf-stat.i.iTLB-load-misses
 1.025e+11            +1.6%  1.042e+11        perf-stat.i.instructions
      1.28            +2.0%       1.31        perf-stat.i.ipc
    737.53            +1.6%     749.52        perf-stat.i.metric.M/sec
      0.67            -1.3%       0.66        perf-stat.overall.cpi
      1.49            +1.3%       1.51        perf-stat.overall.ipc
 2.056e+10            +1.6%  2.089e+10        perf-stat.ps.branch-instructions
 1.774e+08            +1.4%    1.8e+08        perf-stat.ps.branch-misses
  3.08e+10            +1.6%   3.13e+10        perf-stat.ps.dTLB-loads
 1.935e+10            +1.6%  1.965e+10        perf-stat.ps.dTLB-stores
 1.829e+08            +2.0%  1.865e+08        perf-stat.ps.iTLB-load-misses
 1.024e+11            +1.6%   1.04e+11        perf-stat.ps.instructions
 4.017e+13            +1.7%  4.085e+13        perf-stat.total.instructions
     40627 ±  4%      +9.9%      44633 ±  2%  softirqs.CPU10.SCHED
     40722            +7.9%      43959 ±  3%  softirqs.CPU11.SCHED
     14454 ±  5%     +16.4%      16827 ±  8%  softirqs.CPU15.RCU
     14800 ±  8%     +21.4%      17968 ± 10%  softirqs.CPU16.RCU
     15254 ±  7%     +16.9%      17835 ±  7%  softirqs.CPU17.RCU
     14676 ± 11%     +19.3%      17502 ±  8%  softirqs.CPU18.RCU
     15098 ±  6%     +15.7%      17472 ±  8%  softirqs.CPU19.RCU
     14311 ±  5%     +23.0%      17595 ±  6%  softirqs.CPU21.RCU
     15728 ±  3%     +14.2%      17965 ± 10%  softirqs.CPU22.RCU
     15758 ±  6%     +14.3%      18005 ±  8%  softirqs.CPU23.RCU
     15700           +14.8%      18018 ±  6%  softirqs.CPU49.RCU
     15386 ±  3%     +15.4%      17757 ±  8%  softirqs.CPU50.RCU
     16064 ±  3%     +14.9%      18455 ±  8%  softirqs.CPU52.RCU
     16072 ±  3%     +19.5%      19200 ±  4%  softirqs.CPU54.RCU
     16371 ±  4%     +12.9%      18479 ±  5%  softirqs.CPU58.RCU
     15825 ±  3%     +14.5%      18116 ±  6%  softirqs.CPU59.RCU
     16359 ±  5%     +13.6%      18592 ±  7%  softirqs.CPU60.RCU
     16020 ±  7%     +14.7%      18370 ±  7%  softirqs.CPU62.RCU
     15940 ±  6%     +17.6%      18740 ±  8%  softirqs.CPU63.RCU
     15520 ±  4%     +25.0%      19403 ±  7%  softirqs.CPU64.RCU
     16212 ±  8%     +19.4%      19354 ± 11%  softirqs.CPU65.RCU
     16164 ±  7%     +19.1%      19247 ±  9%  softirqs.CPU67.RCU
     16678 ±  6%     +17.5%      19592 ±  9%  softirqs.CPU68.RCU
     16328 ±  6%     +19.7%      19551 ±  6%  softirqs.CPU69.RCU
     16351 ±  4%     +17.2%      19155 ±  7%  softirqs.CPU70.RCU
     15636 ±  2%     +11.1%      17370 ±  6%  softirqs.CPU72.RCU
     15764           +13.9%      17949 ±  7%  softirqs.CPU75.RCU
     15899 ±  3%     +13.3%      18015 ±  7%  softirqs.CPU76.RCU
     16157 ±  4%     +11.2%      17967 ±  8%  softirqs.CPU77.RCU
     15480 ±  2%     +14.4%      17716 ± 10%  softirqs.CPU91.RCU
     16142           +10.9%      17893 ±  7%  softirqs.CPU93.RCU
     16424 ±  3%     +12.7%      18503 ±  7%  softirqs.CPU95.RCU
     38301 ±  7%     -12.0%      33723 ±  8%  softirqs.CPU95.SCHED
   1550393 ±  2%     +10.6%    1714970 ±  6%  softirqs.RCU
     56868            -4.8%      54162        interrupts.CAL:Function_call_interrupts
    788.75 ± 25%     -30.4%     548.75 ± 15%  interrupts.CPU1.CAL:Function_call_interrupts
    120.75 ± 37%     -56.5%      52.50 ± 41%  interrupts.CPU10.RES:Rescheduling_interrupts
    498.50 ±  9%     -12.7%     435.25 ±  2%  interrupts.CPU12.CAL:Function_call_interrupts
     94.50 ± 14%     -51.6%      45.75 ± 37%  interrupts.CPU12.RES:Rescheduling_interrupts
    658.25 ± 27%     -32.4%     445.25 ±  2%  interrupts.CPU14.CAL:Function_call_interrupts
    613.75 ± 24%     -33.6%     407.50 ± 22%  interrupts.CPU2.CAL:Function_call_interrupts
    626.25 ± 19%     -26.8%     458.25 ±  5%  interrupts.CPU23.CAL:Function_call_interrupts
    851.00 ± 19%     -30.7%     590.00 ±  6%  interrupts.CPU25.CAL:Function_call_interrupts
    474.75 ±  4%     +16.4%     552.50 ±  7%  interrupts.CPU32.CAL:Function_call_interrupts
     58.75 ± 10%     +59.6%      93.75 ± 13%  interrupts.CPU36.RES:Rescheduling_interrupts
     73.00 ± 29%    +111.0%     154.00 ± 59%  interrupts.CPU37.RES:Rescheduling_interrupts
      3002 ± 43%     -85.0%     449.25 ±112%  interrupts.CPU40.NMI:Non-maskable_interrupts
      3002 ± 43%     -85.0%     449.25 ±112%  interrupts.CPU40.PMI:Performance_monitoring_interrupts
      3355 ± 28%     -77.4%     757.50 ± 93%  interrupts.CPU42.NMI:Non-maskable_interrupts
      3355 ± 28%     -77.4%     757.50 ± 93%  interrupts.CPU42.PMI:Performance_monitoring_interrupts
      3557 ± 29%     -51.7%       1718 ± 61%  interrupts.CPU46.NMI:Non-maskable_interrupts
      3557 ± 29%     -51.7%       1718 ± 61%  interrupts.CPU46.PMI:Performance_monitoring_interrupts
      2004 ± 43%     -55.9%     884.75 ± 80%  interrupts.CPU61.NMI:Non-maskable_interrupts
      2004 ± 43%     -55.9%     884.75 ± 80%  interrupts.CPU61.PMI:Performance_monitoring_interrupts
    609.25 ± 88%    +496.3%       3632 ± 64%  interrupts.CPU62.NMI:Non-maskable_interrupts
    609.25 ± 88%    +496.3%       3632 ± 64%  interrupts.CPU62.PMI:Performance_monitoring_interrupts
     52.75 ± 56%    +133.2%     123.00 ± 42%  interrupts.CPU63.RES:Rescheduling_interrupts
    441.25 ± 78%    +744.6%       3727 ± 64%  interrupts.CPU69.NMI:Non-maskable_interrupts
    441.25 ± 78%    +744.6%       3727 ± 64%  interrupts.CPU69.PMI:Performance_monitoring_interrupts
     48.50 ± 58%    +152.6%     122.50 ± 37%  interrupts.CPU69.RES:Rescheduling_interrupts
    408.25 ± 74%    +610.8%       2901 ± 67%  interrupts.CPU70.NMI:Non-maskable_interrupts
    408.25 ± 74%    +610.8%       2901 ± 67%  interrupts.CPU70.PMI:Performance_monitoring_interrupts
     57.00 ± 68%     +95.6%     111.50 ± 32%  interrupts.CPU70.RES:Rescheduling_interrupts
     57.00 ± 45%    +120.2%     125.50 ± 30%  interrupts.CPU71.RES:Rescheduling_interrupts
    712.50 ± 24%     -34.3%     468.00 ±  7%  interrupts.CPU75.CAL:Function_call_interrupts
      1236 ±111%    +258.1%       4426 ± 20%  interrupts.CPU75.NMI:Non-maskable_interrupts
      1236 ±111%    +258.1%       4426 ± 20%  interrupts.CPU75.PMI:Performance_monitoring_interrupts
      1953 ± 42%     -84.3%     306.75 ±108%  interrupts.CPU88.NMI:Non-maskable_interrupts
      1953 ± 42%     -84.3%     306.75 ±108%  interrupts.CPU88.PMI:Performance_monitoring_interrupts
      1077 ± 39%     -47.5%     565.50 ± 11%  interrupts.CPU90.CAL:Function_call_interrupts
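
For readers who want to sanity-check the comparison table above, here is a
minimal Python sketch that parses one row and recomputes the change column
from the two averages. It is not part of lkp-tests; the names ROW, parse_row
and recomputed_change are hypothetical. It assumes, judging only from the
rows shown in this report, that a change value ending in "%" is relative to
the base commit, and that a change value without "%" (the perf-profile rows)
is an absolute difference. Because the table prints rounded averages, the
recomputed figure can differ from the printed one in the last digit.

```python
import re

# One table row: base value, optional "± N%" stddev, change, new value,
# optional "± N%" stddev, metric name. Layout inferred from this report.
ROW = re.compile(
    r"^\s*(?P<base>[\d.e+]+)(?:\s*±\s*\d+%)?"
    r"\s+(?P<change>[+-][\d.]+)(?P<pct>%?)"
    r"\s+(?P<new>[\d.e+]+)(?:\s*±\s*\d+%)?"
    r"\s+(?P<metric>\S+)\s*$"
)


def parse_row(line):
    """Return a dict for a table row, or None if the line is not a row."""
    m = ROW.match(line)
    if m is None:
        return None
    return {
        "metric": m["metric"],
        "base": float(m["base"]),
        "new": float(m["new"]),
        "change": float(m["change"]),
        # Assumption: "%" means relative change, no "%" means absolute delta.
        "is_percent": m["pct"] == "%",
    }


def recomputed_change(row):
    """Recompute the change column from the printed (rounded) averages."""
    delta = row["new"] - row["base"]
    if row["is_percent"]:
        return 100.0 * delta / row["base"]
    return delta


if __name__ == "__main__":
    sample = [
        "     41663            +1.5%      42289        unixbench.score",
        "      7969 ±  6%     +13.7%       9064 ±  7%  numa-vmstat.node0.nr_kernel_stack",
        "      0.30 ±101%      +0.3        0.65 ±  8%  perf-profile.calltrace.cycles-pp.__entry_text_start.read",
    ]
    for line in sample:
        row = parse_row(line)
        print(row["metric"], row["change"], recomputed_change(row))
```

Header, separator and commit lines do not match the pattern, so parse_row
returns None for them and a caller can skip those lines.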


                                                                                
                                   unixbench.score                              
                                                                                
  42600 +-------------------------------------------------------------------+   
        |                 O O O O O                                         |   
  42400 |-+                                                                 |   
        | O O O O O O O O                                                   |   
        |                             O O O O O O O O                       |   
  42200 |-+                                                                 |   
        |                                                                   |   
  42000 |-+                                                                 |   
        |                                                                   |   
  41800 |.+          .+.       .+.   .+.+                                   |   
        | +.+.+. .+.+   +.    +   +.+    :                   .+.+.       .+.|   
        |       +         +. +           :         .+.+.   .+     +.+.+.+   |   
  41600 |-+                 +             :.+. .+.+     +.+                 |   
        |                                 +   +                             |   
  41400 +-------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                  unixbench.workload                            
                                                                                
  2.08e+10 +----------------------------------------------------------------+   
           |                                                                |   
  2.07e+10 |-+                O       O                                     |   
           |                O     O O                                       |   
  2.06e+10 |-+ O     OO O O     O                                           |   
           | O   O O                    O    O O     O                      |   
  2.05e+10 |-+                            O O    O O                        |   
           |                                                                |   
  2.04e+10 |-+                                                              |   
           |                           .+.+                                 |   
  2.03e+10 |++  .+. .++.+.       .+.+.+    :                  .+.+        +.|   
           | +.+   +      +.    +          :         +.+   +.+    :  .+. +  |   
  2.02e+10 |-+              +. +            ++.+.+. +   + +       +.+   +   |   
           |                  +                    +     +                  |   
  2.01e+10 +----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Oliver Sang


View attachment "config-5.10.0-rc1-00071-ge7f28850eadc" of type "text/plain" (170389 bytes)

View attachment "job-script" of type "text/plain" (8021 bytes)

View attachment "job.yaml" of type "text/plain" (5442 bytes)

View attachment "reproduce" of type "text/plain" (290 bytes)
