Message-ID: <202211281010.d5c055e1-oliver.sang@intel.com>
Date:   Mon, 28 Nov 2022 16:33:42 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     John Ogness <john.ogness@...utronix.de>
CC:     <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
        Petr Mladek <pmladek@...e.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Miguel Ojeda <ojeda@...nel.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Linux Memory Management List <linux-mm@...ck.org>,
        <linux-kernel@...r.kernel.org>, <ying.huang@...el.com>,
        <feng.tang@...el.com>, <zhengjun.xing@...ux.intel.com>,
        <fengwei.yin@...el.com>
Subject: [linux-next:master] [printk]  8bdbdd7f43:
 will-it-scale.per_process_ops 3.9% improvement


Please note that we have not figured out how the profiling data connects to this
improvement, but in our tests the data is very stable.

For this commit:
  "will-it-scale.per_process_ops": [
    75636,
    75623,
    75642,
    75613,
    75592,
    75628,
    74016,
    75637,
    75622,
    75617,
    75605,
    74013,
    75629,
    75618,
    75595,
    75614,
    75611,
    75619,
    75629,
    75619
  ],

For the parent commit:
  "will-it-scale.per_process_ops": [
    72665,
    72628,
    72665,
    72656,
    72668,
    72642,
    72660,
    72642,
    72648,
    72661,
    72650,
    72648,
    72651,
    72639,
    72630,
    72641,
    72655,
    72650,
    72624,
    72649
  ],
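
(For reference, the means of the 20 samples above are 75458.9 ops for this
commit and 72648.6 ops for the parent; 75458.9 / 72648.6 - 1 ≈ +3.9%, matching
the headline improvement and the comparison table below.)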


Many thanks to Fengwei (CCed), who helped review and commented:
"This patch could bring a performance improvement. It uses an RCU lock to
replace a mutex for some consoles, so I expect some reduction in lock
contention. I didn't find it in the perf call-trace profiling, but could see
some in the perf self profiling; without the call trace, we don't know who
the owner of the lock is."
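
For context, this commit is preparation for switching console-list readers
from the console_lock mutex to SRCU, which matches Fengwei's observation
above. Below is a minimal sketch of the SRCU reader-side pattern the series
moves toward; console_list and console_srcu are internal to
kernel/printk/printk.c, so treat this as an illustration of the locking
pattern, not the patch itself.

/*
 * Illustrative sketch (not the actual patch): walking the console
 * hlist under SRCU instead of holding console_lock. The names
 * console_list, console_srcu, and the hlist member "node" are
 * assumed from the upstream series.
 */
#include <linux/console.h>
#include <linux/rculist.h>
#include <linux/srcu.h>

static void example_walk_consoles(void)
{
        struct console *con;
        int cookie;

        /* Enter an SRCU read-side critical section; readers do not
         * contend with each other or block on a mutex holder. */
        cookie = srcu_read_lock(&console_srcu);

        hlist_for_each_entry_srcu(con, &console_list, node,
                                  srcu_read_lock_held(&console_srcu)) {
                if (con->flags & CON_ENABLED)
                        pr_debug("console %s%d is enabled\n",
                                 con->name, con->index);
        }

        srcu_read_unlock(&console_srcu, cookie);
}

Because SRCU readers only mark entry and exit of the critical section rather
than sleeping on a contended mutex, this pattern is also plausibly consistent
with the drop in context switches (vmstat.system.cs) reported below.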


So we made the report below, FYI.


Greetings,

FYI, we noticed a 3.9% improvement in will-it-scale.per_process_ops due to commit:


commit: 8bdbdd7f43cd74c7faca6add8a62d541503ae21d ("printk: Prepare for SRCU console list protection")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master

in testcase: will-it-scale
on test machine: 128 threads, 4 sockets, Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake), with 256G memory
with the following parameters:

	nr_task: 50%
	mode: process
	test: open1
	cpufreq_governor: performance

test-description: Will It Scale takes a test case and runs it from 1 through n parallel copies to see whether it scales. It builds both process- and thread-based variants of each test case in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
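
For orientation, the open1 test case is essentially a tight open(2)/close(2)
loop whose iteration rate becomes will-it-scale.per_process_ops. The sketch
below is a simplified, self-contained reconstruction of that loop; the
temp-file handling and fixed iteration count are assumptions for illustration
(see the test-url above for the real harness, which runs for a fixed time and
reports ops/sec per process).

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
        /* Create a scratch file to open/close repeatedly. */
        char path[] = "/tmp/willitscale.XXXXXX";
        unsigned long ops;
        int fd = mkstemp(path);

        if (fd < 0) {
                perror("mkstemp");
                return 1;
        }
        close(fd);

        /* Each iteration is one "op": a full open/close round trip
         * through the do_sys_openat2()/do_dentry_open() path seen in
         * the perf profiles below. */
        for (ops = 0; ops < 1000000; ops++) {
                fd = open(path, O_RDWR);
                if (fd < 0)
                        break;
                close(fd);
        }

        unlink(path);
        printf("ops completed: %lu\n", ops);
        return 0;
}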





Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # If you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp directories to run from a clean state.

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-11/performance/x86_64-rhel-8.3/process/50%/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp2/open1/will-it-scale

commit: 
  318eb6d938 ("printk: Convert console_drivers list to hlist")
  8bdbdd7f43 ("printk: Prepare for SRCU console list protection")

318eb6d938484a5a 8bdbdd7f43cd74c7faca6add8a6 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   4649546            +3.9%    4829404        will-it-scale.64.processes
     72648            +3.9%      75458        will-it-scale.per_process_ops
   4649546            +3.9%    4829404        will-it-scale.workload
    346119           +12.7%     390049        meminfo.SUnreclaim
      3480           -22.8%       2688 ±  2%  vmstat.system.cs
     86566 ±  7%     +31.1%     113460 ±  5%  numa-meminfo.node0.SUnreclaim
     65513 ± 13%     +40.6%      92118 ±  6%  numa-meminfo.node1.SUnreclaim
     86566           +12.7%      97547        proc-vmstat.nr_slab_unreclaimable
  20789147            +6.0%   22031599        proc-vmstat.numa_hit
  20615296            +6.0%   21857885        proc-vmstat.numa_local
  80423862            +6.2%   85382185        proc-vmstat.pgalloc_normal
  80433923            +6.2%   85390274        proc-vmstat.pgfree
   8319772 ±  6%     -20.4%    6621861 ±  9%  sched_debug.cfs_rq:/.min_vruntime.max
   1246763 ± 12%     -46.0%     673096 ± 12%  sched_debug.cfs_rq:/.min_vruntime.stddev
  -3067572           -58.9%   -1261852        sched_debug.cfs_rq:/.spread0.min
   1246909 ± 12%     -46.0%     673159 ± 12%  sched_debug.cfs_rq:/.spread0.stddev
      6632 ±  3%     -15.0%       5638 ±  5%  sched_debug.cpu.nr_switches.avg
      2796 ±  9%     -22.2%       2176 ±  9%  sched_debug.cpu.nr_switches.min
   1594604          +206.3%    4884774        numa-numastat.node0.local_node
   1644243          +199.8%    4929305        numa-numastat.node0.numa_hit
   1592483          +206.3%    4878540        numa-numastat.node1.local_node
   1638420          +200.1%    4916699        numa-numastat.node1.numa_hit
   8690370           -30.6%    6028530        numa-numastat.node2.local_node
   8729078           -30.4%    6074883        numa-numastat.node2.numa_hit
   8734995           -30.6%    6063318        numa-numastat.node3.local_node
   8774563           -30.4%    6107989        numa-numastat.node3.numa_hit
     21691 ±  7%     +32.4%      28715 ±  5%  numa-vmstat.node0.nr_slab_unreclaimable
   1644113          +199.8%    4929234        numa-vmstat.node0.numa_hit
   1594474          +206.4%    4884703        numa-vmstat.node0.numa_local
     16427 ± 12%     +42.0%      23321 ±  7%  numa-vmstat.node1.nr_slab_unreclaimable
   1638283          +200.1%    4916648        numa-vmstat.node1.numa_hit
   1592346          +206.4%    4878490        numa-vmstat.node1.numa_local
   8728930           -30.4%    6074798        numa-vmstat.node2.numa_hit
   8690222           -30.6%    6028445        numa-vmstat.node2.numa_local
   8774491           -30.4%    6108017        numa-vmstat.node3.numa_hit
   8734922           -30.6%    6063345        numa-vmstat.node3.numa_local
     43.31            -0.9       42.42        perf-profile.calltrace.cycles-pp.security_file_open.do_dentry_open.do_open.path_openat.do_filp_open
     43.27            -0.9       42.38        perf-profile.calltrace.cycles-pp.apparmor_file_open.security_file_open.do_dentry_open.do_open.path_openat
     43.85            -0.9       42.97        perf-profile.calltrace.cycles-pp.do_dentry_open.do_open.path_openat.do_filp_open.do_sys_openat2
     44.30            -0.9       43.41        perf-profile.calltrace.cycles-pp.do_open.path_openat.do_filp_open.do_sys_openat2.__x64_sys_openat
      0.63 ±  6%      +0.2        0.80 ± 28%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.apparmor_file_open.security_file_open.do_dentry_open.do_open
      0.59 ±  6%      +0.2        0.77 ± 29%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.apparmor_file_open.security_file_open.do_dentry_open
     43.33            -0.9       42.43        perf-profile.children.cycles-pp.security_file_open
     43.31            -0.9       42.41        perf-profile.children.cycles-pp.apparmor_file_open
     43.89            -0.9       43.00        perf-profile.children.cycles-pp.do_dentry_open
     44.32            -0.9       43.43        perf-profile.children.cycles-pp.do_open
      0.24 ±  9%      -0.1        0.16 ± 11%  perf-profile.children.cycles-pp.menu_select
      0.19 ± 10%      -0.1        0.12 ± 15%  perf-profile.children.cycles-pp.tick_nohz_get_sleep_length
      0.14 ±  9%      -0.1        0.08 ± 11%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
      0.08 ± 10%      -0.0        0.03 ± 82%  perf-profile.children.cycles-pp.get_next_timer_interrupt
      0.11 ± 13%      -0.0        0.08 ± 18%  perf-profile.children.cycles-pp.tick_nohz_next_event
      0.12 ±  7%      +0.0        0.15 ±  4%  perf-profile.children.cycles-pp.shuffle_freelist
      0.14 ±  7%      +0.0        0.18 ±  2%  perf-profile.children.cycles-pp.allocate_slab
      0.20 ±  5%      +0.0        0.25 ±  3%  perf-profile.children.cycles-pp.___slab_alloc
     42.69            -1.1       41.62 ±  2%  perf-profile.self.cycles-pp.apparmor_file_open
      0.66 ± 11%      -0.5        0.16 ± 10%  perf-profile.self.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.14 ±  9%      -0.1        0.08 ± 13%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
      0.10 ±  7%      +0.0        0.13 ±  3%  perf-profile.self.cycles-pp.shuffle_freelist
      2.99            +3.5%       3.09        perf-stat.i.MPKI
 5.498e+09            +3.8%  5.706e+09        perf-stat.i.branch-instructions
     34.60            -1.7       32.95        perf-stat.i.cache-miss-rate%
  28569086            +2.3%   29217344        perf-stat.i.cache-misses
  82561656            +7.4%   88636951        perf-stat.i.cache-references
      3383           -23.5%       2587 ±  2%  perf-stat.i.context-switches
    224.62 ±  3%      -7.1%     208.68 ±  3%  perf-stat.i.cpu-migrations
 8.243e+09            +3.7%  8.545e+09        perf-stat.i.dTLB-loads
 4.682e+09            +3.6%  4.849e+09        perf-stat.i.dTLB-stores
 2.764e+10            +3.8%  2.868e+10        perf-stat.i.instructions
    785.42            +5.8%     831.18        perf-stat.i.metric.K/sec
    143.88            +3.7%     149.17        perf-stat.i.metric.M/sec
   6893040            +4.8%    7226385        perf-stat.i.node-load-misses
    206202 ±  2%     +10.1%     227052        perf-stat.i.node-loads
     76.95            -2.8       74.10        perf-stat.i.node-store-miss-rate%
   8358863            -7.7%    7714585        perf-stat.i.node-store-misses
   2506806 ±  2%      +7.6%    2698200 ±  2%  perf-stat.i.node-stores
      2.99            +3.5%       3.09        perf-stat.overall.MPKI
     34.61            -1.6       32.99        perf-stat.overall.cache-miss-rate%
     76.93            -2.8       74.09        perf-stat.overall.node-store-miss-rate%
 5.479e+09            +3.8%  5.687e+09        perf-stat.ps.branch-instructions
  28481165            +2.3%   29144556        perf-stat.ps.cache-misses
  82289248            +7.4%   88361058        perf-stat.ps.cache-references
      3371           -23.6%       2577 ±  2%  perf-stat.ps.context-switches
    223.97 ±  3%      -7.0%     208.27 ±  3%  perf-stat.ps.cpu-migrations
 8.214e+09            +3.7%  8.517e+09        perf-stat.ps.dTLB-loads
 4.665e+09            +3.6%  4.833e+09        perf-stat.ps.dTLB-stores
 2.754e+10            +3.8%  2.858e+10        perf-stat.ps.instructions
   6871767            +4.9%    7207616        perf-stat.ps.node-load-misses
    205592 ±  2%     +10.1%     226439        perf-stat.ps.node-loads
   8330952            -7.7%    7692044        perf-stat.ps.node-store-misses
   2498719 ±  2%      +7.7%    2690666 ±  2%  perf-stat.ps.node-stores
  8.33e+12            +3.9%  8.654e+12        perf-stat.total.instructions




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://01.org/lkp



View attachment "config-6.1.0-rc1-00015-g8bdbdd7f43cd" of type "text/plain" (166072 bytes)

View attachment "job-script" of type "text/plain" (7978 bytes)

View attachment "job.yaml" of type "text/plain" (5554 bytes)

View attachment "reproduce" of type "text/plain" (345 bytes)
