Date:   Mon, 17 Oct 2016 10:25:04 +0800
From:   kernel test robot <xiaolong.ye@...el.com>
To:     Manfred Spraul <manfred@...orfullife.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        "H. Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Davidlohr Bueso <dave@...olabs.net>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...e.hu>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: [lkp] [ipc/sem.c]  5864a2fd30:  aim9.shared_memory.ops_per_sec
 -13.0% regression


FYI, we noticed a 13.0% regression in aim9.shared_memory.ops_per_sec due to commit:

commit 5864a2fd3088db73d47942370d0f7210a807b9bc ("ipc/sem.c: fix complex_count vs. simple op race")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

in testcase: aim9
on test machine: 144 threads Intel(R) Xeon(R) CPU E7-8890 v3 @ 2.50GHz with 512G memory
with the following parameters:

	testtime: 300s
	test: shared_memory
	cpufreq_governor: performance

Suite IX is the "AIM Independent Resource Benchmark", the famous synthetic benchmark.



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.

Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
  gcc-6/performance/x86_64-rhel-7.2/debian-x86_64-2016-08-31.cgz/lkp-hsx04/shared_memory/aim9/300s

commit: 
  65deb8af76 ("kcov: do not instrument lib/stackdepot.c")
  5864a2fd30 ("ipc/sem.c: fix complex_count vs. simple op race")

65deb8af76defeae 5864a2fd3088db73d47942370d 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   1450659 ±  0%     -13.0%    1262704 ±  0%  aim9.shared_memory.ops_per_sec
   4353113 ±  0%     -13.0%    3789220 ±  0%  aim9.time.minor_page_faults
    251.75 ±  0%      +2.5%     258.09 ±  0%  aim9.time.system_time
     48.25 ±  0%     -13.1%      41.91 ±  0%  aim9.time.user_time
    967475 ±  8%     -26.3%     713361 ±  9%  cpuidle.C1-HSW.time
   6351555 ±  7%     +10.8%    7036941 ±  0%  meminfo.DirectMap2M
   4856604 ±  0%     -40.6%    2882512 ± 65%  numa-numastat.node0.local_node
      7725 ± 74%     +80.1%      13913 ±  0%  numa-numastat.node0.numa_foreign
   4856610 ±  0%     -40.6%    2882518 ± 65%  numa-numastat.node0.numa_hit
      7725 ± 74%     +80.1%      13913 ±  0%  numa-numastat.node0.numa_miss
     16976 ±  2%     +15.2%      19550 ±  4%  slabinfo.kmalloc-512.active_objs
     17132 ±  2%     +15.1%      19712 ±  4%  slabinfo.kmalloc-512.num_objs
    133771 ±  3%     -13.6%     115576 ±  9%  slabinfo.kmalloc-8.active_objs
    134332 ±  3%     -13.4%     116281 ±  9%  slabinfo.kmalloc-8.num_objs
   5553071 ±  0%     -11.7%    4903071 ±  0%  proc-vmstat.numa_hit
   5553039 ±  0%     -11.7%    4903039 ±  0%  proc-vmstat.numa_local
   7787767 ±  0%     -11.2%    6913807 ±  0%  proc-vmstat.pgalloc_normal
   5088685 ±  0%     -11.0%    4527573 ±  0%  proc-vmstat.pgfault
   7812098 ±  0%     -11.3%    6930089 ±  0%  proc-vmstat.pgfree
     39721 ± 35%     -46.2%      21378 ± 72%  numa-meminfo.node1.Active
     32736 ± 42%     -54.8%      14793 ±108%  numa-meminfo.node1.Active(anon)
     25797 ± 50%     -61.0%      10062 ±127%  numa-meminfo.node1.AnonHugePages
     32721 ± 43%     -54.8%      14777 ±108%  numa-meminfo.node1.AnonPages
    916.00 ± 14%     -50.5%     453.67 ± 18%  numa-meminfo.node1.PageTables
     15945 ± 11%     +23.8%      19737 ± 10%  numa-meminfo.node2.SReclaimable
    647.67 ± 18%     +60.8%       1041 ± 15%  numa-meminfo.node3.PageTables
      2195 ± 80%    +128.1%       5008 ± 14%  numa-meminfo.node3.Shmem
 5.152e+09 ±  5%      +6.4%  5.481e+09 ±  4%  perf-stat.branch-misses
      0.19 ±  6%      +9.0%       0.20 ±  2%  perf-stat.dTLB-load-miss-rate%
  1.28e+09 ±  1%      -3.7%  1.233e+09 ±  0%  perf-stat.dTLB-load-misses
 6.934e+11 ±  7%     -12.0%  6.101e+11 ±  2%  perf-stat.dTLB-loads
 3.348e+08 ±  0%      -2.3%   3.27e+08 ±  0%  perf-stat.dTLB-store-misses
     84.70 ±  1%      -4.4%      81.00 ±  2%  perf-stat.iTLB-load-miss-rate%
 1.219e+09 ±  1%     -16.5%  1.018e+09 ± 14%  perf-stat.iTLB-load-misses
 2.202e+08 ±  4%      +6.6%  2.348e+08 ±  2%  perf-stat.iTLB-loads
      2102 ± 29%     +39.9%       2941 ± 13%  perf-stat.instructions-per-iTLB-miss
   5064989 ±  0%     -11.1%    4500438 ±  0%  perf-stat.minor-faults
  57745625 ± 10%     -44.9%   31799796 ± 53%  perf-stat.node-store-misses
   9693133 ±  9%     -20.9%    7668803 ± 23%  perf-stat.node-stores
   5064525 ±  0%     -11.1%    4500444 ±  0%  perf-stat.page-faults
      2.35 ± 23%     -24.4%       1.78 ± 14%  perf-profile.calltrace.cycles-pp.__tick_nohz_idle_enter.tick_nohz_irq_exit.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt
      3.00 ± 23%     -26.4%       2.21 ± 14%  perf-profile.calltrace.cycles-pp.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter.call_cpuidle
      2.42 ± 23%     -24.3%       1.83 ± 14%  perf-profile.calltrace.cycles-pp.tick_nohz_irq_exit.irq_exit.smp_apic_timer_interrupt.apic_timer_interrupt.cpuidle_enter
      2.44 ± 23%     -24.5%       1.84 ± 14%  perf-profile.children.cycles-pp.__tick_nohz_idle_enter
      0.96 ± 17%     -24.9%       0.72 ± 19%  perf-profile.children.cycles-pp.copy_user_enhanced_fast_string
      0.72 ± 28%     -27.2%       0.53 ± 14%  perf-profile.children.cycles-pp.hrtimer_start_range_ns
      3.13 ± 23%     -25.1%       2.34 ± 14%  perf-profile.children.cycles-pp.irq_exit
      1.01 ± 76%     -65.2%       0.35 ± 24%  perf-profile.children.cycles-pp.rest_init
      1.01 ± 76%     -65.2%       0.35 ± 24%  perf-profile.children.cycles-pp.start_kernel
      2.49 ± 24%     -24.3%       1.88 ± 13%  perf-profile.children.cycles-pp.tick_nohz_irq_exit
      1.01 ± 76%     -65.2%       0.35 ± 24%  perf-profile.children.cycles-pp.x86_64_start_kernel
      1.01 ± 76%     -65.2%       0.35 ± 24%  perf-profile.children.cycles-pp.x86_64_start_reservations
      2.34 ± 16%     +48.1%       3.47 ± 16%  perf-profile.self.cycles-pp.SYSC_semtimedop
      0.96 ± 17%     -24.9%       0.72 ± 19%  perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
     33058 ± 17%     +19.8%      39614 ±  1%  numa-vmstat.node0.numa_foreign
   2553587 ±  0%     -38.9%    1560692 ± 59%  numa-vmstat.node0.numa_hit
   2553582 ±  0%     -38.9%    1560686 ± 59%  numa-vmstat.node0.numa_local
     33058 ± 17%     +19.8%      39614 ±  1%  numa-vmstat.node0.numa_miss
      8185 ± 42%     -54.8%       3699 ±108%  numa-vmstat.node1.nr_active_anon
      8179 ± 43%     -54.8%       3693 ±108%  numa-vmstat.node1.nr_anon_pages
    226.33 ± 15%     -50.7%     111.67 ± 19%  numa-vmstat.node1.nr_page_table_pages
      8186 ± 42%     -54.8%       3699 ±108%  numa-vmstat.node1.nr_zone_active_anon
      3986 ± 11%     +23.8%       4933 ± 10%  numa-vmstat.node2.nr_slab_reclaimable
    161.00 ± 18%     +61.1%     259.33 ± 15%  numa-vmstat.node3.nr_page_table_pages
    548.33 ± 80%    +128.3%       1251 ± 14%  numa-vmstat.node3.nr_shmem
     88775 ±  4%      -7.1%      82488 ±  4%  numa-vmstat.node3.numa_foreign
    177250 ±  9%    +378.1%     847409 ±109%  numa-vmstat.node3.numa_hit
    177245 ±  9%    +378.1%     847405 ±109%  numa-vmstat.node3.numa_local
     88775 ±  4%      -7.1%      82488 ±  4%  numa-vmstat.node3.numa_miss
      0.00 ±  0%  +3.7e+09%      36.92 ± 38%  sched_debug.cfs_rq:/.MIN_vruntime.avg
      0.00 ±  0%  +3.6e+11%       3602 ± 15%  sched_debug.cfs_rq:/.MIN_vruntime.max
      0.00 ±  0%  +1.8e+25%     345.99 ± 12%  sched_debug.cfs_rq:/.MIN_vruntime.stddev
    191.05 ±  2%      +9.4%     208.96 ±  4%  sched_debug.cfs_rq:/.load_avg.avg
    573.00 ± 16%     +53.3%     878.22 ± 11%  sched_debug.cfs_rq:/.load_avg.max
     49.42 ± 14%     +51.3%      74.77 ±  9%  sched_debug.cfs_rq:/.load_avg.stddev
      0.00 ±  0%  +3.7e+09%      36.92 ± 38%  sched_debug.cfs_rq:/.max_vruntime.avg
      0.00 ±  0%  +3.6e+11%       3602 ± 15%  sched_debug.cfs_rq:/.max_vruntime.max
      0.00 ±  0%  +1.8e+25%     345.99 ± 12%  sched_debug.cfs_rq:/.max_vruntime.stddev
    849009 ±  9%     -36.3%     540564 ± 25%  sched_debug.cfs_rq:/.min_vruntime.max
     69571 ±  9%     -36.5%      44149 ± 26%  sched_debug.cfs_rq:/.min_vruntime.stddev
      0.19 ± 40%     -57.0%       0.08 ± 89%  sched_debug.cfs_rq:/.nr_spread_over.stddev
      2.85 ± 29%     +71.1%       4.87 ± 11%  sched_debug.cfs_rq:/.runnable_load_avg.avg
    313.78 ± 36%     +96.8%     617.61 ± 14%  sched_debug.cfs_rq:/.runnable_load_avg.max
     27.10 ± 35%     +90.5%      51.62 ± 13%  sched_debug.cfs_rq:/.runnable_load_avg.stddev
    813659 ±  9%     -38.0%     504715 ± 27%  sched_debug.cfs_rq:/.spread0.max
     69571 ±  9%     -36.5%      44151 ± 26%  sched_debug.cfs_rq:/.spread0.stddev
     65.19 ±  9%     -62.2%      24.62 ± 27%  sched_debug.cpu.clock.stddev
     65.19 ±  9%     -62.2%      24.62 ± 27%  sched_debug.cpu.clock_task.stddev
      2.84 ± 29%     +70.4%       4.85 ± 11%  sched_debug.cpu.cpu_load[0].avg
    313.78 ± 36%     +96.8%     617.61 ± 14%  sched_debug.cpu.cpu_load[0].max
     27.10 ± 35%     +90.5%      51.61 ± 13%  sched_debug.cpu.cpu_load[0].stddev
      3.00 ± 34%     +73.6%       5.21 ±  7%  sched_debug.cpu.cpu_load[1].avg
    312.39 ± 36%    +102.1%     631.44 ± 12%  sched_debug.cpu.cpu_load[1].max
     27.24 ± 36%     +94.8%      53.07 ± 11%  sched_debug.cpu.cpu_load[1].stddev
      2.88 ± 34%     +78.3%       5.13 ±  8%  sched_debug.cpu.cpu_load[2].avg
    310.06 ± 37%    +104.0%     632.61 ± 12%  sched_debug.cpu.cpu_load[2].max
     26.75 ± 36%     +98.8%      53.18 ± 11%  sched_debug.cpu.cpu_load[2].stddev
      2.68 ± 34%     +86.8%       5.01 ±  9%  sched_debug.cpu.cpu_load[3].avg
    303.11 ± 37%    +110.0%     636.39 ± 12%  sched_debug.cpu.cpu_load[3].max
     25.82 ± 36%    +106.2%      53.24 ± 11%  sched_debug.cpu.cpu_load[3].stddev
      2.44 ± 33%     +99.3%       4.87 ±  9%  sched_debug.cpu.cpu_load[4].avg
    289.67 ± 36%    +119.5%     635.78 ± 12%  sched_debug.cpu.cpu_load[4].max
     24.41 ± 35%    +117.4%      53.07 ± 11%  sched_debug.cpu.cpu_load[4].stddev
      0.00 ± 12%     -54.4%       0.00 ± 19%  sched_debug.cpu.next_balance.stddev
     44337 ± 46%     +66.1%      73663 ±  4%  sched_debug.cpu.nr_switches.max
      5304 ± 31%     +34.0%       7105 ±  5%  sched_debug.cpu.nr_switches.stddev
    -21.06 ±-17%     -38.5%     -12.94 ±-35%  sched_debug.cpu.nr_uninterruptible.min
     20897 ± 48%     +72.1%      35970 ±  4%  sched_debug.cpu.sched_goidle.max
      2524 ± 33%     +37.8%       3478 ±  5%  sched_debug.cpu.sched_goidle.stddev
     21067 ± 45%     +88.3%      39667 ± 11%  sched_debug.cpu.ttwu_count.max
      2771 ± 34%     +39.0%       3851 ±  8%  sched_debug.cpu.ttwu_count.stddev
     10886 ± 41%    +117.4%      23669 ± 14%  sched_debug.cpu.ttwu_local.max
      1101 ± 31%     +84.7%       2033 ± 15%  sched_debug.cpu.ttwu_local.stddev
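
As a sanity check on the comparison table above, the %change column can be recomputed from the two per-commit means. The following is a minimal Python sketch and is not part of lkp-tests; the names ROW, parse_row and percent_change are hypothetical, and the regular expression assumes every row has the "<base> ± <stddev>%  <change>%  <new> ± <stddev>%  <metric>" layout shown above. The four rows are copied verbatim from the table.

```python
import re

# One comparison row: "<base> ± <sd>%   <change>%   <new> ± <sd>%   <metric>".
# Assumption: metric names contain no whitespace, as in the table above.
ROW = re.compile(
    r"^\s*(-?[\d.e+]+)\s*±\s*-?\d+%"   # base mean and its %stddev
    r"\s+([+-][\d.e+]+)%"              # reported %change
    r"\s+(-?[\d.e+]+)\s*±\s*-?\d+%"    # new mean and its %stddev
    r"\s+(\S+)\s*$"                    # metric name
)


def parse_row(line):
    """Return (metric, base_mean, reported_change_pct, new_mean) for one row."""
    m = ROW.match(line)
    if m is None:
        raise ValueError(f"not a comparison row: {line!r}")
    base, change, new, metric = m.groups()
    return metric, float(base), float(change), float(new)


def percent_change(base, new):
    """Relative change of new against base, in percent."""
    return (new - base) / base * 100.0


# Rows copied from the report.
ROWS = """\
   1450659 ±  0%     -13.0%    1262704 ±  0%  aim9.shared_memory.ops_per_sec
   4353113 ±  0%     -13.0%    3789220 ±  0%  aim9.time.minor_page_faults
    251.75 ±  0%      +2.5%     258.09 ±  0%  aim9.time.system_time
     48.25 ±  0%     -13.1%      41.91 ±  0%  aim9.time.user_time
"""

table = {}
for line in ROWS.splitlines():
    metric, base, reported, new = parse_row(line)
    table[metric] = (base, new)
    recomputed = percent_change(base, new)
    print(f"{metric}: reported {reported:+.1f}%, recomputed {recomputed:+.1f}%")

# Derived ratios, one value per commit (index 0 = base, index 1 = new).
ops = table["aim9.shared_memory.ops_per_sec"]
faults = table["aim9.time.minor_page_faults"]
user = table["aim9.time.user_time"]
system = table["aim9.time.system_time"]
for i, label in enumerate(("65deb8af76", "5864a2fd30")):
    print(f"{label}: faults per op = {faults[i] / ops[i]:.3f}, "
          f"user + system = {user[i] + system[i]:.2f}s")
```

For these four rows the recomputed values agree with the reported ones to one decimal place. The derived ratios come out at about 3.0 minor faults per operation and 300 s of user plus system time on both commits, which suggests (without proving it) that the per-operation work is unchanged and that the lost throughput corresponds to time moving from user to system.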



                                perf-stat.page-faults

  6e+06 ++------------------------------------------------------------------+
        |                                                                   |
  5e+06 **.****.***.****.***.****.***.****.****.***.****.***.****.***.** *.**
        OO OO O OOO OOOO OOO OOOO OOO OOOO  OOO OOO OO O   O           : :  |
        |    O                             O          O   O            : :  |
  4e+06 ++                                                             : :  |
        |                                                              : :  |
  3e+06 ++                                                             : :  |
        |                                                               ::  |
  2e+06 ++                                                              :   |
        |                                                               :   |
        |                                                               :   |
  1e+06 ++                                                              :   |
        |                                                               :   |
      0 ++-----------------------------------------------O--------------*---+


                               perf-stat.minor-faults

  6e+06 ++------------------------------------------------------------------+
        |                                                                   |
  5e+06 **.****.***.****.***.****.***.****.****.***.****.***.****.***.** *.**
        OO OO O OOO OOOO OOO OOOO OOO OOOO  OOO OOO OO O   O           : :  |
        |    O                             O          O   O            : :  |
  4e+06 ++                                                             : :  |
        |                                                              : :  |
  3e+06 ++                                                             : :  |
        |                                                               ::  |
  2e+06 ++                                                              :   |
        |                                                               :   |
        |                                                               :   |
  1e+06 ++                                                              :   |
        |                                                               :   |
      0 ++-----------------------------------------------O--------------*---+


                            aim9.shared_memory.ops_per_sec

  1.6e+06 ++----------------------------------------------------------------+
          ** .**  .*** .** *.* **. ***.****.****.* **. ** .** *.****.* *  ***
  1.4e+06 ++*   **    *   *   *   *               *   *  *   *        *:  : |
          OOO OOOO OOOO OOOO OOOO OOOO OOOO OOOO OOOO OOO  OO          :  : |
  1.2e+06 ++                                                           :  : |
    1e+06 ++                                                           : :  |
          |                                                            : :  |
   800000 ++                                                           : :  |
          |                                                             ::  |
   600000 ++                                                            ::  |
   400000 ++                                                            ::  |
          |                                                             :   |
   200000 ++                                                            :   |
          |                                                             :   |
        0 ++---------------------------------------------O--------------*---+


                              aim9.time.minor_page_faults

  4.5e+06 ++-----------------------*----***-**-------------------*-*-*-*--***
          ***.****.****.****.****.* **.*      **.****.****.****.* *   *:  : |
    4e+06 OOO OOOO OOOO OOOO OOOO OOOO OOOO OOOO OOOO OOO  OO          :  : |
  3.5e+06 ++                                                           :  : |
          |                                                            : :  |
    3e+06 ++                                                           : :  |
  2.5e+06 ++                                                           : :  |
          |                                                             ::  |
    2e+06 ++                                                            ::  |
  1.5e+06 ++                                                            ::  |
          |                                                             ::  |
    1e+06 ++                                                            :   |
   500000 ++                                                            :   |
          |                                                             :   |
        0 ++---------------------------------------------O--------------*---+

	[*] bisect-good sample
	[O] bisect-bad  sample





Thanks,
Xiaolong

View attachment "config-4.8.0-11974-g5864a2f" of type "text/plain" (153666 bytes)

View attachment "job-script" of type "text/plain" (6516 bytes)

View attachment "job.yaml" of type "text/plain" (4084 bytes)

View attachment "reproduce" of type "text/plain" (103 bytes)
