Message-ID: <20200721001536.GE19262@shao2-debian>
Date:   Tue, 21 Jul 2020 08:15:36 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Mel Gorman <mgorman@...hsingularity.net>
Cc:     Jan Kara <jack@...e.cz>, Amir Goldstein <amir73il@...il.com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: [fsnotify] 71d734103e: will-it-scale.per_process_ops 4.2% improvement

Greetings,

FYI, we noticed a 4.2% improvement in will-it-scale.per_process_ops due to commit:


commit: 71d734103edfa2b4c6657578a3082ee0e51d767e ("fsnotify: Rearrange fast path to minimise overhead when there is no watcher")
https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
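
For context on what a fast-path rearrangement like this buys: the win comes from testing cheap interest masks (superblock, mount, inode) before doing any other notification work, so the common no-watcher case returns immediately. Below is a minimal user-space sketch of that pattern only; the struct, masks, and function names are illustrative stand-ins, not the kernel's actual fsnotify code:

    #include <stdio.h>

    /* Hypothetical event bit, loosely modelled on fsnotify's FS_* masks. */
    #define EV_MODIFY 0x02u

    struct object {
        unsigned int sb_mask;    /* events watched at the superblock level */
        unsigned int mnt_mask;   /* events watched at the mount level */
        unsigned int inode_mask; /* events watched on the inode itself */
    };

    /* Stand-in for the slow path; the kernel would walk mark lists here. */
    static int deliver_to_watchers(struct object *obj, unsigned int mask)
    {
        (void)obj;
        printf("delivering event 0x%x\n", mask);
        return 0;
    }

    static int notify_event(struct object *obj, unsigned int mask)
    {
        /*
         * Fast path: one OR and one AND decide whether anyone is
         * interested. With no watcher registered anywhere, return
         * before taking references or walking any lists -- the
         * common case for the eventfd reads/writes in this benchmark.
         */
        if (!((obj->sb_mask | obj->mnt_mask | obj->inode_mask) & mask))
            return 0;

        return deliver_to_watchers(obj, mask);
    }

    int main(void)
    {
        struct object quiet   = { 0, 0, 0 };         /* no watchers */
        struct object watched = { 0, 0, EV_MODIFY }; /* inode watcher */

        notify_event(&quiet, EV_MODIFY);   /* early return, prints nothing */
        notify_event(&watched, EV_MODIFY); /* takes the slow path */
        return 0;
    }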


in testcase: will-it-scale
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with following parameters:

	nr_task: 50%
	mode: process
	test: eventfd1
	cpufreq_governor: performance
	ucode: 0x5002f01

test-description: Will It Scale takes a testcase and runs it from 1 through n parallel copies to see whether the testcase will scale. It builds both process-based and thread-based variants of each test in order to expose any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
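
Every read() and write() crosses the VFS, where the fsnotify_access()/fsnotify_modify() hooks sit, which is why this eventfd microbenchmark is sensitive to the fsnotify fast path at all. Below is a minimal standalone sketch of the kind of loop the eventfd1 testcase runs; it is simplified from the testcase in the will-it-scale repository, and the fixed iteration bound and local counter are illustrative (the real harness samples a shared counter while the loop runs indefinitely):

    #include <stdint.h>
    #include <stdio.h>
    #include <sys/eventfd.h>
    #include <unistd.h>

    int main(void)
    {
        /* Each write()/read() pair goes through the VFS and hence
         * through the fsnotify hooks that this commit makes cheaper
         * when no watcher is registered. */
        int fd = eventfd(0, 0);
        if (fd < 0) {
            perror("eventfd");
            return 1;
        }

        uint64_t val = 1;
        unsigned long iterations = 0;

        /* Bounded here for demonstration only. */
        while (iterations < 1000000) {
            if (write(fd, &val, sizeof(val)) != (ssize_t)sizeof(val))
                break;
            if (read(fd, &val, sizeof(val)) != (ssize_t)sizeof(val))
                break;
            iterations++;
        }

        printf("%lu iterations\n", iterations);
        close(fd);
        return 0;
    }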

In addition, the commit has a significant impact on the following test:

+------------------+---------------------------------------------------------------------------+
| testcase: change | will-it-scale: will-it-scale.per_process_ops 6.4% improvement             |
| test machine     | 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory |
| test parameters  | cpufreq_governor=performance                                              |
|                  | mode=process                                                              |
|                  | nr_task=100%                                                              |
|                  | test=eventfd1                                                             |
|                  | ucode=0x5002f01                                                           |
+------------------+---------------------------------------------------------------------------+




Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/process/50%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap3/eventfd1/will-it-scale/0x5002f01

commit: 
  47aaabdedf ("fanotify: Avoid softlockups when reading many events")
  71d734103e ("fsnotify: Rearrange fast path to minimise overhead when there is no watcher")

47aaabdedf366ac5 71d734103edfa2b4c6657578a30 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   2791175            +4.2%    2909227        will-it-scale.per_process_ops
  2.68e+08            +4.2%  2.793e+08        will-it-scale.workload
     30001            -1.7%      29506        proc-vmstat.nr_kernel_stack
     23.08 ± 10%     -16.0%      19.39 ± 17%  sched_debug.cfs_rq:/.load_avg.stddev
    157.50 ±102%   +1470.3%       2473 ± 58%  numa-vmstat.node2.nr_inactive_anon
    205.75 ± 82%   +1149.5%       2570 ± 55%  numa-vmstat.node2.nr_shmem
    157.50 ±102%   +1470.3%       2473 ± 58%  numa-vmstat.node2.nr_zone_inactive_anon
      8361 ±  6%     -14.2%       7175 ±  7%  numa-vmstat.node3.nr_kernel_stack
    668.50 ± 92%   +1380.3%       9896 ± 58%  numa-meminfo.node2.Inactive
    631.50 ±101%   +1466.9%       9895 ± 58%  numa-meminfo.node2.Inactive(anon)
    773835 ±  5%     +12.9%     873656 ±  4%  numa-meminfo.node2.MemUsed
    826.00 ± 81%   +1145.2%      10285 ± 55%  numa-meminfo.node2.Shmem
      8359 ±  5%     -14.1%       7177 ±  7%  numa-meminfo.node3.KernelStack
    992.25 ± 74%     -56.0%     436.50 ± 92%  interrupts.33:PCI-MSI.524291-edge.eth0-TxRx-2
     15391 ±122%     -88.5%       1762 ±127%  interrupts.CPU116.LOC:Local_timer_interrupts
    992.25 ± 74%     -56.0%     436.50 ± 92%  interrupts.CPU12.33:PCI-MSI.524291-edge.eth0-TxRx-2
     84835 ±146%     -96.9%       2626 ±149%  interrupts.CPU146.LOC:Local_timer_interrupts
      3971 ±129%    +237.8%      13415 ± 24%  interrupts.CPU148.LOC:Local_timer_interrupts
     80671 ±156%     -98.9%     916.50 ± 81%  interrupts.CPU15.LOC:Local_timer_interrupts
      9204 ± 51%   +1633.6%     159576 ± 87%  interrupts.CPU168.LOC:Local_timer_interrupts
     82118 ±153%     -96.8%       2591 ±102%  interrupts.CPU172.LOC:Local_timer_interrupts
    154291 ± 94%     -96.8%       4902 ±142%  interrupts.CPU186.LOC:Local_timer_interrupts
     12237 ± 95%     -88.9%       1353 ±132%  interrupts.CPU19.LOC:Local_timer_interrupts
    467.00 ± 14%     -15.5%     394.50        interrupts.CPU190.CAL:Function_call_interrupts
      3878 ± 71%   +4172.3%     165689 ± 81%  interrupts.CPU55.LOC:Local_timer_interrupts
     80834 ±156%     -98.5%       1228 ± 77%  interrupts.CPU59.LOC:Local_timer_interrupts
    154405 ± 94%     -95.8%       6436 ±133%  interrupts.CPU61.LOC:Local_timer_interrupts
     14356 ± 47%     -33.7%       9516 ±  8%  softirqs.CPU116.TIMER
     35187 ±109%     -70.6%      10336 ± 22%  softirqs.CPU146.TIMER
      9840 ± 17%     +46.3%      14393 ± 21%  softirqs.CPU148.TIMER
      4744 ± 99%     -70.2%       1413 ±  3%  softirqs.CPU15.RCU
     35797 ±110%     -71.7%      10113 ±  2%  softirqs.CPU15.TIMER
     11678 ± 15%    +408.6%      59398 ± 74%  softirqs.CPU168.TIMER
      3735 ±123%     -81.8%     678.25 ± 34%  softirqs.CPU172.RCU
     35438 ±116%     -73.5%       9401 ± 13%  softirqs.CPU172.TIMER
      6229 ± 84%     -87.0%     810.00 ± 44%  softirqs.CPU186.RCU
     57680 ± 80%     -82.9%       9838 ± 22%  softirqs.CPU186.TIMER
     14204 ± 29%     -27.7%      10271 ±  6%  softirqs.CPU19.TIMER
     36486 ±102%     -65.0%      12755 ± 38%  softirqs.CPU21.TIMER
      5740 ±108%     -70.1%       1715 ± 35%  softirqs.CPU31.RCU
     38176 ±103%     -66.4%      12835 ± 31%  softirqs.CPU31.TIMER
      1353 ±  7%    +488.9%       7968 ± 65%  softirqs.CPU55.RCU
     10688 ±  7%    +483.8%      62396 ± 68%  softirqs.CPU55.TIMER
     36343 ±116%     -72.4%      10035 ±  2%  softirqs.CPU59.TIMER
      7423 ± 77%     -80.2%       1470 ± 31%  softirqs.CPU61.RCU
     58969 ± 78%     -80.4%      11528 ± 24%  softirqs.CPU61.TIMER
     11012 ±  7%     +29.5%      14256 ± 12%  softirqs.CPU87.TIMER


                                                                                
                             will-it-scale.per_process_ops                      
                                                                                
  2.96e+06 +----------------------------------------------------------------+   
  2.94e+06 |-+                                                  O  O        |   
           |                                                                |   
  2.92e+06 |-+                                                O      O O  O |   
   2.9e+06 |-+                                    O  O O O  O               |   
  2.88e+06 |-+                                O                             |   
  2.86e+06 |-O  O O O  O O O  O      O O O  O   O                           |   
           |                      O                                         |   
  2.84e+06 |-+                  O                                           |   
  2.82e+06 |-+                                                              |   
   2.8e+06 |-+                                                              |   
  2.78e+06 |.+..+.+.+..+.+.                           .+.+..+.+             |   
           |               +..+.+.+..            .+..+                      |   
  2.76e+06 |-+                       +.+.+..+.+.+                           |   
  2.74e+06 +----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample

***************************************************************************************************
lkp-csl-2ap4: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/process/100%/debian-10.4-x86_64-20200603.cgz/lkp-csl-2ap4/eventfd1/will-it-scale/0x5002f01

commit: 
  47aaabdedf ("fanotify: Avoid softlockups when reading many events")
  71d734103e ("fsnotify: Rearrange fast path to minimise overhead when there is no watcher")

47aaabdedf366ac5 71d734103edfa2b4c6657578a30 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   1651925            +6.4%    1758026        will-it-scale.per_process_ops
 3.172e+08            +6.4%  3.375e+08        will-it-scale.workload
      1730           -75.0%     431.75 ±173%  meminfo.Mlocked
     15.00            +6.7%      16.00        vmstat.cpu.us
    815508 ±  4%      -8.8%     744090 ±  3%  sched_debug.cfs_rq:/.spread0.max
      0.12 ± 11%     -19.7%       0.09 ± 15%  sched_debug.cpu.nr_running.stddev
    589.75 ±  4%     +18.3%     697.75 ±  6%  slabinfo.skbuff_fclone_cache.active_objs
    589.75 ±  4%     +18.3%     697.75 ±  6%  slabinfo.skbuff_fclone_cache.num_objs
    109.25 ± 61%     -78.9%      23.00 ± 87%  numa-numastat.node1.other_node
     72.00 ± 96%     -95.8%       3.00 ± 23%  numa-numastat.node2.other_node
    101.00 ± 84%     -90.6%       9.50 ±161%  numa-numastat.node3.other_node
      7551            -1.0%       7478        proc-vmstat.nr_mapped
    432.50           -75.1%     107.75 ±173%  proc-vmstat.nr_mlock
     88431            -1.1%      87446        proc-vmstat.nr_slab_unreclaimable
    560.25 ± 28%     -29.9%     393.00        interrupts.CPU105.CAL:Function_call_interrupts
      3962 ± 67%     -53.3%       1850 ± 34%  interrupts.CPU145.CAL:Function_call_interrupts
      1432 ± 49%    +149.3%       3572 ± 21%  interrupts.CPU49.CAL:Function_call_interrupts
    551.50 ± 35%     -28.6%     393.75        interrupts.CPU71.CAL:Function_call_interrupts
     12597 ±  6%     +12.5%      14177 ±  3%  softirqs.CPU130.RCU
     12676 ±  2%     +16.8%      14811 ± 10%  softirqs.CPU131.RCU
     13967            +8.0%      15088 ±  4%  softirqs.CPU146.RCU
     12317 ±  3%     +10.4%      13596 ±  5%  softirqs.CPU191.RCU
     14202 ±  3%     +11.1%      15777 ±  7%  softirqs.CPU24.RCU
     13554 ±  2%     +11.5%      15112 ±  4%  softirqs.CPU46.RCU
     92735 ± 61%     +71.3%     158890 ± 36%  numa-meminfo.node0.Active
     90536 ± 62%     +73.9%     157416 ± 38%  numa-meminfo.node0.Active(anon)
     37320 ±102%    +143.2%      90748 ± 51%  numa-meminfo.node0.AnonHugePages
     89851 ± 62%     +75.0%     157225 ± 38%  numa-meminfo.node0.AnonPages
      7340 ± 53%     -64.7%       2592 ±148%  numa-meminfo.node0.Inactive
      7157 ± 54%     -65.7%       2452 ±150%  numa-meminfo.node0.Inactive(anon)
      8363 ± 23%     -29.0%       5936 ± 26%  numa-meminfo.node0.Mapped
    101341 ±  7%      -9.3%      91953 ±  3%  numa-meminfo.node0.SUnreclaim
      7970 ± 51%     -66.0%       2708 ±148%  numa-meminfo.node0.Shmem
    138005 ±  7%     -11.1%     122678 ±  6%  numa-meminfo.node0.Slab
     22635 ± 62%     +73.9%      39357 ± 38%  numa-vmstat.node0.nr_active_anon
     22463 ± 62%     +75.0%      39309 ± 38%  numa-vmstat.node0.nr_anon_pages
      1789 ± 54%     -65.7%     612.75 ±150%  numa-vmstat.node0.nr_inactive_anon
      2090 ± 23%     -29.0%       1483 ± 26%  numa-vmstat.node0.nr_mapped
    106.50 ± 23%     -78.6%      22.75 ±173%  numa-vmstat.node0.nr_mlock
      1992 ± 51%     -66.0%     676.75 ±148%  numa-vmstat.node0.nr_shmem
     25335 ±  7%      -9.3%      22988 ±  3%  numa-vmstat.node0.nr_slab_unreclaimable
     22635 ± 62%     +73.9%      39357 ± 38%  numa-vmstat.node0.nr_zone_active_anon
      1789 ± 54%     -65.7%     612.75 ±150%  numa-vmstat.node0.nr_zone_inactive_anon
    107.25 ± 22%     -78.8%      22.75 ±173%  numa-vmstat.node1.nr_mlock
    108.00 ± 22%     -78.5%      23.25 ±173%  numa-vmstat.node3.nr_mlock





Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


Attachments:
  config-5.8.0-rc4-00084-g71d734103edfa (text/plain, 158415 bytes)
  job-script (text/plain, 7487 bytes)
  job.yaml (text/plain, 5097 bytes)
  reproduce (text/plain, 340 bytes)
