Message-ID: <20170204070804.GC12121@yexl-desktop>
Date:   Sat, 4 Feb 2017 15:08:04 +0800
From:   kernel test robot <xiaolong.ye@...el.com>
To:     Manfred Spraul <manfred@...orfullife.com>
Cc:     Stephen Rothwell <sfr@...b.auug.org.au>,
        Peter Zijlstra <peterz@...radead.org>,
        Davidlohr Bueso <dave@...olabs.net>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...e.hu>, "H. Peter Anvin" <hpa@...or.com>,
        kernel test robot <xiaolong.ye@...el.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [lkp-robot] [ipc/sem.c]  f4b5bafaf7: aim9.shared_memory.ops_per_sec
 11.3% improvement


Greetings,

FYI, we noticed an 11.3% improvement in aim9.shared_memory.ops_per_sec due to commit:


commit: f4b5bafaf7c0a3b2f204e48c07b5335ed93266fa ("ipc/sem.c: avoid using spin_unlock_wait()")
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master

in testcase: aim9
on test machine: 48 threads Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz with 64G memory
with following parameters:

	testtime: 300s
	test: shared_memory
	cpufreq_governor: performance

test-description: Suite IX is the "AIM Independent Resource Benchmark", the famous synthetic benchmark.
test-url: https://sourceforge.net/projects/aimbench/files/aim-suite9/

In addition, the commit has a significant impact on the following tests:

+------------------+------------------------------------------------------------------+
| testcase: change | aim9: aim9.shared_memory.ops_per_sec 11.5% improvement           |
| test machine     | 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 4G memory |
| test parameters  | cpufreq_governor=performance                                     |
|                  | test=shared_memory                                               |
|                  | testtime=300s                                                    |
+------------------+------------------------------------------------------------------+


Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

testcase/path_params/tbox_group/run: aim9/300s-shared_memory-performance/ivb43

6487b8d2876d7d39  f4b5bafaf7c0a3b2f204e48c07
----------------  --------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
   1073533 ±  0%     +11.3%    1194345 ±  0%  aim9.shared_memory.ops_per_sec
   3221639 ±  0%     +11.2%    3584021 ±  0%  aim9.time.minor_page_faults
     28206 ±  8%     -12.5%      24690 ±  0%  meminfo.Active(file)
      3.56 ±  1%      -5.0%       3.38 ±  4%  turbostat.RAMWatt
     14128 ±  9%     -12.8%      12326 ±  1%  numa-meminfo.node0.Active(file)
     14081 ±  8%     -12.2%      12365 ±  0%  numa-meminfo.node1.Active(file)
      7051 ±  8%     -12.5%       6172 ±  0%  proc-vmstat.nr_active_file
      7051 ±  8%     -12.5%       6172 ±  0%  proc-vmstat.nr_zone_active_file
   3221639 ±  0%     +11.2%    3584021 ±  0%  time.minor_page_faults
     41.48 ±  1%      +9.7%      45.50 ±  1%  time.user_time
      3531 ±  9%     -12.7%       3081 ±  0%  numa-vmstat.node0.nr_active_file
      3531 ±  9%     -12.7%       3081 ±  0%  numa-vmstat.node0.nr_zone_active_file
      3520 ±  8%     -12.2%       3091 ±  0%  numa-vmstat.node1.nr_active_file
      3520 ±  8%     -12.2%       3091 ±  0%  numa-vmstat.node1.nr_zone_active_file
      1.26 ± 16%     -70.4%       0.37 ± 71%  perf-profile.calltrace.cycles-pp.pid_vnr.SYSC_semtimedop.sys_semop.entry_SYSCALL_64_fastpath
      1.38 ± 18%     -53.1%       0.65 ±  8%  perf-profile.children.cycles-pp.pid_vnr
      8.29 ±  8%     -37.2%       5.20 ± 10%  perf-profile.self.cycles-pp.SYSC_semtimedop
      1.37 ± 19%     -57.5%       0.58 ± 14%  perf-profile.self.cycles-pp.pid_vnr
     76641 ± 27%     +64.8%     126335 ± 25%  slabinfo.kmalloc-8.active_objs
     76927 ± 27%     +64.5%     126565 ± 25%  slabinfo.kmalloc-8.num_objs
    839.50 ±  4%     -13.0%     730.00 ±  8%  slabinfo.nsproxy.active_objs
    839.50 ±  4%     -13.0%     730.00 ±  8%  slabinfo.nsproxy.num_objs
     15877 ±  4%      -6.7%      14819 ±  4%  slabinfo.vm_area_struct.active_objs
     15877 ±  4%      -6.7%      14819 ±  4%  slabinfo.vm_area_struct.num_objs
      0.09 ±110%    +188.8%       0.26 ± 31%  sched_debug.cfs_rq:/.nr_spread_over.stddev
     12.61 ± 31%     -35.8%       8.10 ± 31%  sched_debug.cfs_rq:/.removed_util_avg.stddev
      7.42 ± 48%     -48.3%       3.83 ± 83%  sched_debug.cfs_rq:/.util_avg.min
    341584 ±  4%     +31.4%     448942 ±  5%  sched_debug.cpu.avg_idle.min
    138800 ±  3%      -8.5%     127032 ±  1%  sched_debug.cpu.avg_idle.stddev
      1134 ± 12%     -23.0%     873.83 ± 16%  sched_debug.cpu.nr_switches.min
    628.08 ± 39%     -66.7%     209.39 ± 59%  sched_debug.cpu.sched_count.min
    215.04 ± 61%     -91.7%      17.89 ± 88%  sched_debug.cpu.sched_goidle.min
    132.92 ± 30%     -43.5%      75.06 ± 39%  sched_debug.cpu.ttwu_count.min
 3.713e+11 ±  4%     +20.5%  4.476e+11 ±  7%  perf-stat.branch-instructions
      1.32 ±  5%      -9.1%       1.20 ±  3%  perf-stat.branch-miss-rate%
 4.887e+09 ±  5%      +9.4%  5.348e+09 ±  3%  perf-stat.branch-misses
      0.13 ±  6%     -16.6%       0.11 ±  5%  perf-stat.dTLB-load-miss-rate%
 4.368e+11 ±  6%     +17.7%  5.139e+11 ±  0%  perf-stat.dTLB-loads
      0.04 ± 10%     -13.3%       0.04 ±  0%  perf-stat.dTLB-store-miss-rate%
 2.071e+12 ±  4%     +20.4%  2.494e+12 ±  6%  perf-stat.instructions
     12681 ±  4%     +18.0%      14964 ±  5%  perf-stat.instructions-per-iTLB-miss
      0.92 ±  1%     +11.8%       1.03 ±  1%  perf-stat.ipc
   3784094 ±  0%      +9.5%    4145210 ±  0%  perf-stat.minor-faults
   3784100 ±  0%      +9.5%    4145210 ±  0%  perf-stat.page-faults
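
The change column in the table above appears to be the relative difference
between the mean for the parent commit (left) and the mean for the tested
commit (right). The sketch below parses one row and recomputes that figure.
The regular expression and the helper names parse_row and pct_change are
assumptions made for illustration, derived only from the rows shown in this
report; they are not part of lkp-tests. For rows whose means are printed with
few significant digits (for example perf-stat.ipc), the recomputed value can
differ slightly from the reported one, presumably because the report works
from unrounded data.

```python
import re

# Row layout assumed from the table in this report (not an official format):
#   <base mean> ± <stddev>%   <change>%   <new mean> ± <stddev>%   <metric>
ROW = re.compile(
    r"^\s*(?P<base>\S+)\s+±\s*(?P<base_sd>\d+)%"
    r"\s+(?P<change>[+-]\d+(?:\.\d+)?)%"
    r"\s+(?P<new>\S+)\s+±\s*(?P<new_sd>\d+)%"
    r"\s+(?P<metric>\S+)\s*$"
)


def pct_change(base, new):
    """Relative change from base to new, in percent."""
    return (new - base) / base * 100.0


def parse_row(line):
    """Parse one comparison row; return None if the line is not a data row."""
    m = ROW.match(line)
    if m is None:
        return None
    return {
        "metric": m.group("metric"),
        "base": float(m.group("base")),      # handles 1073533 and 3.713e+11
        "new": float(m.group("new")),
        "reported": float(m.group("change")),
    }


if __name__ == "__main__":
    line = ("   1073533 ±  0%     +11.3%    1194345 ±  0%"
            "  aim9.shared_memory.ops_per_sec")
    row = parse_row(line)
    recomputed = pct_change(row["base"], row["new"])
    print(f'{row["metric"]}: reported {row["reported"]:+.1f}%, '
          f'recomputed {recomputed:+.1f}%')
```

Header and separator lines in the table do not match the pattern, so
parse_row returns None for them and they can be skipped when processing the
whole block.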



                                 perf-stat.page-faults

  4.5e+06 ++----------------------------------------------------------------+
          O OO O OO O O OO O  O O O                                         |
    4e+06 *+**.*.**.*.*.**.*.**.*.**.*.**.*.*.**.*.**.*.**.*.**.*.*.**.*.**.*
  3.5e+06 ++                                                                |
          |                                                                 |
    3e+06 ++                                                                |
  2.5e+06 ++                                                                |
          |                                                                 |
    2e+06 ++                                                                |
  1.5e+06 ++                                                                |
          |                                                                 |
    1e+06 ++                                                                |
   500000 ++                                                                |
          |                                                                 |
        0 ++-----------------O----------------------------------------------+


                                perf-stat.minor-faults

  4.5e+06 ++----------------------------------------------------------------+
          O OO O OO O O OO O  O O O                                         |
    4e+06 *+**.*.**.*.*.**.*.**.*.**.*.**.*.*.**.*.**.*.**.*.**.*.*.**.*.**.*
  3.5e+06 ++                                                                |
          |                                                                 |
    3e+06 ++                                                                |
  2.5e+06 ++                                                                |
          |                                                                 |
    2e+06 ++                                                                |
  1.5e+06 ++                                                                |
          |                                                                 |
    1e+06 ++                                                                |
   500000 ++                                                                |
          |                                                                 |
        0 ++-----------------O----------------------------------------------+


                            aim9.shared_memory.ops_per_sec

  1.2e+06 O+OO-O-OO-O-O-OO-O--O-O-O-----------------------------------------+
          *.**.*.**.*.*.**.*.**.*.**.*.*       *.*.**. .**.*. *.     *.*. *.*
    1e+06 ++                            *.*.*.*       *      *  *.*.*    *  |
          |                                                                 |
          |                                                                 |
   800000 ++                                                                |
          |                                                                 |
   600000 ++                                                                |
          |                                                                 |
   400000 ++                                                                |
          |                                                                 |
          |                                                                 |
   200000 ++                                                                |
          |                                                                 |
        0 ++-----------------O----------------------------------------------+


                              aim9.time.minor_page_faults

    4e+06 ++----------------------------------------------------------------+
          |           O OO O  O O                                           |
  3.5e+06 O+OO O OO.O.*.          O*.*.                                     |
    3e+06 *+**.*.**     **.*.**.*.*    **.*.*.**.*.**.*.**.*.**.*.*.**.*.**.*
          |                                                                 |
  2.5e+06 ++                                                                |
          |                                                                 |
    2e+06 ++                                                                |
          |                                                                 |
  1.5e+06 ++                                                                |
    1e+06 ++                                                                |
          |                                                                 |
   500000 ++                                                                |
          |                                                                 |
        0 ++-----------------O----------------------------------------------+

	[*] bisect-good sample
	[O] bisect-bad  sample


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Xiaolong

View attachment "config-4.10.0-rc4-00447-gf4b5baf" of type "text/plain" (155600 bytes)

View attachment "job-script" of type "text/plain" (6468 bytes)

View attachment "job.yaml" of type "text/plain" (4138 bytes)

View attachment "reproduce" of type "text/plain" (103 bytes)
