Date:   Mon, 18 Feb 2019 10:16:46 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     john.hubbard@...il.com
Cc:     Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        Al Viro <viro@...iv.linux.org.uk>,
        Christian Benvenuti <benve@...co.com>,
        Christoph Hellwig <hch@...radead.org>,
        Christopher Lameter <cl@...ux.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Dave Chinner <david@...morbit.com>,
        Dennis Dalessandro <dennis.dalessandro@...el.com>,
        Doug Ledford <dledford@...hat.com>, Jan Kara <jack@...e.cz>,
        Jason Gunthorpe <jgg@...pe.ca>,
        Jerome Glisse <jglisse@...hat.com>,
        Matthew Wilcox <willy@...radead.org>,
        Michal Hocko <mhocko@...nel.org>,
        Mike Rapoport <rppt@...ux.ibm.com>,
        Mike Marciniszyn <mike.marciniszyn@...el.com>,
        Ralph Campbell <rcampbell@...dia.com>,
        Tom Talpey <tom@...pey.com>,
        LKML <linux-kernel@...r.kernel.org>,
        linux-fsdevel@...r.kernel.org, John Hubbard <jhubbard@...dia.com>,
        lkp@...org
Subject: [LKP] [mm/gup]  e7ae097b0b:  will-it-scale.per_process_ops -5.0%
 regression

Greetings,

FYI, we noticed a 5.0% regression in will-it-scale.per_process_ops caused by the following commit:


commit: e7ae097b0bda3e7dfd224e2a960346c37aa42394 ("[PATCH 5/6] mm/gup: /proc/vmstat support for get/put user pages")
url: https://github.com/0day-ci/linux/commits/john-hubbard-gmail-com/RFC-v2-mm-gup-dma-tracking/20190205-001101


in testcase: will-it-scale
on test machine: 192 threads Skylake-4S with 704G memory
with the following parameters:

	nr_task: 100%
	mode: process
	test: futex1
	cpufreq_governor: performance

test-description: Will It Scale takes a testcase and runs it from 1 through n parallel copies to see whether the testcase scales. It builds both a process-based and a thread-based version of each test to expose any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale



Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase:
  gcc-7/performance/x86_64-rhel-7.2/process/100%/debian-x86_64-2018-04-03.cgz/lkp-skl-4sp1/futex1/will-it-scale

commit: 
  cdaa813278 ("mm/gup: track gup-pinned pages")
  e7ae097b0b ("mm/gup: /proc/vmstat support for get/put user pages")

cdaa813278ddc616 e7ae097b0bda3e7dfd224e2a96 
---------------- -------------------------- 
         %stddev     %change         %stddev
             \          |                \  
   1808394            -5.0%    1717524        will-it-scale.per_process_ops
 3.472e+08            -5.0%  3.298e+08        will-it-scale.workload
      7936            +3.0%       8174        vmstat.system.cs
    546083 ± 30%     -41.7%     318102 ± 68%  numa-numastat.node2.local_node
    576076 ± 27%     -40.3%     343781 ± 60%  numa-numastat.node2.numa_hit
     27067            -4.5%      25852 ±  2%  proc-vmstat.nr_shmem
     30283            -6.3%      28378 ±  3%  proc-vmstat.pgactivate
     18458 ±  5%      +9.0%      20120 ±  5%  slabinfo.kmalloc-96.active_objs
     19055 ±  4%      +7.7%      20521 ±  5%  slabinfo.kmalloc-96.num_objs
      1654 ±  5%     +13.5%       1877 ±  3%  slabinfo.pool_workqueue.active_objs
      1747 ±  5%     +10.9%       1937 ±  3%  slabinfo.pool_workqueue.num_objs
     17.13 ±  3%      +2.3       19.45 ±  2%  perf-profile.calltrace.cycles-pp.gup_pgd_range.get_user_pages_fast.get_futex_key.futex_wake.do_futex
      0.00            +2.4        2.40 ±  9%  perf-profile.calltrace.cycles-pp.mod_node_page_state.gup_pgd_range.get_user_pages_fast.get_futex_key.futex_wake
     19.62 ±  3%      +2.4       22.06 ±  2%  perf-profile.calltrace.cycles-pp.get_user_pages_fast.get_futex_key.futex_wake.do_futex.__x64_sys_futex
     26.50 ±  3%      +2.8       29.33 ±  2%  perf-profile.calltrace.cycles-pp.get_futex_key.futex_wake.do_futex.__x64_sys_futex.do_syscall_64
     32.48 ±  3%      +3.0       35.52 ±  2%  perf-profile.calltrace.cycles-pp.futex_wake.do_futex.__x64_sys_futex.do_syscall_64.entry_SYSCALL_64_after_hwframe
     46.45 ±100%     -67.5%      15.08 ± 22%  sched_debug.cfs_rq:/.load_avg.stddev
      0.04 ±  5%     -11.4%       0.04 ±  6%  sched_debug.cfs_rq:/.nr_running.stddev
     13.08 ±  4%     +18.8%      15.54 ±  5%  sched_debug.cfs_rq:/.runnable_load_avg.max
      1.14 ±  4%     +18.6%       1.35 ±  5%  sched_debug.cfs_rq:/.runnable_load_avg.stddev
     13543 ±  7%      +9.5%      14825 ±  4%  sched_debug.cfs_rq:/.runnable_weight.max
      1267 ±  5%     +13.4%       1436 ±  6%  sched_debug.cfs_rq:/.runnable_weight.stddev
      1089 ±  3%     +14.4%       1246 ±  7%  sched_debug.cfs_rq:/.util_avg.max
    533.25 ±  7%      +7.7%     574.21 ±  5%  sched_debug.cfs_rq:/.util_est_enqueued.max
      1.17 ±  6%     +18.1%       1.39 ±  6%  sched_debug.cpu.cpu_load[0].stddev
      3321            +9.1%       3622 ±  4%  sched_debug.cpu.curr->pid.min
      1355 ±  5%     +10.1%       1493 ±  6%  sched_debug.cpu.load.stddev
      1399 ±  6%     -11.5%       1239 ±  3%  sched_debug.cpu.nr_load_updates.stddev
    270.33 ±  8%     +52.7%     412.92 ± 14%  sched_debug.cpu.nr_switches.min
    113.58 ±  3%     +75.1%     198.88 ±  6%  sched_debug.cpu.sched_count.min
      6331 ± 17%     +41.5%       8956 ±  9%  sched_debug.cpu.sched_goidle.max
    545.75 ±  5%     +24.9%     681.80 ±  9%  sched_debug.cpu.sched_goidle.stddev
     89.38           +48.7%     132.92 ±  3%  sched_debug.cpu.ttwu_count.min
     48.21 ±  2%     +82.6%      88.04 ±  6%  sched_debug.cpu.ttwu_local.min
     47917 ± 29%     -71.3%      13757 ± 83%  numa-vmstat.node0.nr_active_anon
     30393 ± 45%     -71.0%       8807 ±111%  numa-vmstat.node0.nr_anon_pages
     78776 ± 10%     -24.6%      59392 ±  7%  numa-vmstat.node0.nr_file_pages
      9592 ±  8%     -10.3%       8602 ±  7%  numa-vmstat.node0.nr_kernel_stack
      2706 ± 37%     -32.6%       1824 ± 38%  numa-vmstat.node0.nr_mapped
     18243 ± 45%     -90.1%       1803 ±118%  numa-vmstat.node0.nr_shmem
      9147 ± 19%     -42.9%       5227 ± 42%  numa-vmstat.node0.nr_slab_reclaimable
     18959 ±  7%     -17.8%      15594 ± 14%  numa-vmstat.node0.nr_slab_unreclaimable
     47917 ± 29%     -71.3%      13758 ± 83%  numa-vmstat.node0.nr_zone_active_anon
     34497 ± 41%     +54.6%      53318 ± 19%  numa-vmstat.node1.nr_active_anon
     60479 ±  6%      +9.6%      66283 ±  9%  numa-vmstat.node1.nr_file_pages
      1419 ±122%    +162.7%       3729 ± 45%  numa-vmstat.node1.nr_inactive_anon
      1571 ±112%    +441.7%       8510 ± 73%  numa-vmstat.node1.nr_shmem
     34497 ± 41%     +54.6%      53318 ± 19%  numa-vmstat.node1.nr_zone_active_anon
      1419 ±122%    +162.7%       3729 ± 45%  numa-vmstat.node1.nr_zone_inactive_anon
      9145 ±104%    +188.9%      26423 ± 48%  numa-vmstat.node3.nr_active_anon
      1476 ± 52%    +432.0%       7853 ±101%  numa-vmstat.node3.nr_anon_pages
      3918 ± 20%     +78.8%       7005 ± 20%  numa-vmstat.node3.nr_slab_reclaimable
     12499 ±  7%     +22.7%      15336 ± 12%  numa-vmstat.node3.nr_slab_unreclaimable
      9145 ±104%    +188.9%      26423 ± 48%  numa-vmstat.node3.nr_zone_active_anon
    194801 ± 28%     -71.7%      55131 ± 83%  numa-meminfo.node0.Active
    191699 ± 28%     -71.2%      55131 ± 83%  numa-meminfo.node0.Active(anon)
     38555 ± 84%    -100.0%       0.00        numa-meminfo.node0.AnonHugePages
    121677 ± 45%     -70.9%      35356 ±111%  numa-meminfo.node0.AnonPages
    315090 ± 10%     -24.6%     237568 ±  7%  numa-meminfo.node0.FilePages
     36584 ± 19%     -42.8%      20912 ± 42%  numa-meminfo.node0.KReclaimable
      9594 ±  8%     -10.3%       8603 ±  7%  numa-meminfo.node0.KernelStack
     10625 ± 34%     -32.6%       7165 ± 36%  numa-meminfo.node0.Mapped
    892127 ± 11%     -25.0%     668837 ± 13%  numa-meminfo.node0.MemUsed
     36584 ± 19%     -42.8%      20912 ± 42%  numa-meminfo.node0.SReclaimable
     75840 ±  7%     -17.8%      62378 ± 14%  numa-meminfo.node0.SUnreclaim
     72956 ± 45%     -90.1%       7212 ±118%  numa-meminfo.node0.Shmem
    112426 ± 11%     -25.9%      83290 ± 21%  numa-meminfo.node0.Slab
    138083 ± 41%     +54.5%     213397 ± 19%  numa-meminfo.node1.Active(anon)
    241919 ±  6%      +9.6%     265074 ±  9%  numa-meminfo.node1.FilePages
      5679 ±122%    +162.6%      14916 ± 45%  numa-meminfo.node1.Inactive(anon)
      6284 ±112%    +440.7%      33980 ± 73%  numa-meminfo.node1.Shmem
     36547 ±104%    +189.0%     105608 ± 48%  numa-meminfo.node3.Active
     36547 ±104%    +189.0%     105608 ± 48%  numa-meminfo.node3.Active(anon)
      5876 ± 52%    +433.8%      31369 ±101%  numa-meminfo.node3.AnonPages
     15671 ± 20%     +78.8%      28021 ± 20%  numa-meminfo.node3.KReclaimable
    592511 ±  8%     +15.3%     683394 ±  8%  numa-meminfo.node3.MemUsed
     15671 ± 20%     +78.8%      28021 ± 20%  numa-meminfo.node3.SReclaimable
     49997 ±  7%     +22.7%      61344 ± 12%  numa-meminfo.node3.SUnreclaim
     65670 ±  6%     +36.1%      89366 ± 14%  numa-meminfo.node3.Slab
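
Note on reading the table: the %change column appears to be the relative difference between the two commits' means, while the perf-profile rows (printed without a % sign) look like absolute differences in percentage points of cycles. The following is a minimal Python sketch under that assumption, using values copied from the table above. The function names are illustrative only and are not part of lkp-tests.

```python
def relative_change(base, new):
    """Relative change in percent (assumed meaning of the %change column)."""
    return (new - base) / base * 100.0


def point_delta(base, new):
    """Absolute difference in percentage points (assumed for perf-profile rows)."""
    return new - base


# will-it-scale.per_process_ops: cdaa813278 -> e7ae097b0b
ops_change = relative_change(1808394, 1717524)

# vmstat.system.cs
cs_change = relative_change(7936, 8174)

# perf-profile.calltrace.cycles-pp.gup_pgd_range.get_user_pages_fast...
gup_delta = point_delta(17.13, 19.45)

print(f"per_process_ops: {ops_change:+.1f}%")
print(f"vmstat.system.cs: {cs_change:+.1f}%")
print(f"gup_pgd_range cycles: {gup_delta:+.1f} points")
```

Rounded to one decimal place, these reproduce the -5.0%, +3.0% and +2.3 entries reported above.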


                                                                                
                             will-it-scale.per_process_ops                      
                                                                                
  1.84e+06 +-+--------------------------------------------------------------+   
           |                                                                |   
  1.82e+06 +-+.. .+.+..   .+..+.+.+..+.+.+.. .+.+.+..   .+..   .+..+.      .|   
   1.8e+06 +-+  +      +.+                  +        +.+    +.+      +.+..+ |   
           |                                                                |   
  1.78e+06 +-+                                                              |   
           |                                                                |   
  1.76e+06 +-+                                                              |   
           |                                                                |   
  1.74e+06 +-+                                                              |   
  1.72e+06 +-+                O O        O  O   O O  O   O    O             |   
           |               O      O  O O      O        O    O   O  O        |   
   1.7e+06 +-+         O O                                                  |   
           O O  O O O                                                       |   
  1.68e+06 +-+--------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                will-it-scale.workload                          
                                                                                
   3.5e+08 +-+--------------------------------------------------------------+   
           |.+..+.+       +   +      +.+    +.+   +..+.+.+..+.+.+..+.+.  .+.|   
  3.45e+08 +-+         +.+                                             +.   |   
           |                                                                |   
           |                                                                |   
   3.4e+08 +-+                                                              |   
           |                                                                |   
  3.35e+08 +-+                                                              |   
           |                                                                |   
   3.3e+08 +-+                O O      O O  O O O O  O O O  O O O           |   
           |               O      O  O                             O        |   
           |           O O                                                  |   
  3.25e+08 O-O  O O O                                                       |   
           |                                                                |   
   3.2e+08 +-+--------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample
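
The two plotted series track each other closely. As a sanity check, the reported will-it-scale.workload values are consistent with per_process_ops multiplied by the number of tasks, assuming nr_task 100% means one task per hardware thread on the 192-thread machine. This is an observation from the numbers in this report, not a documented definition of the metric. A short Python sketch of the check:

```python
# Assumption: nr_task 100% on the 192-thread test machine means 192 tasks.
NR_TASKS = 192

# Values copied from the comparison table in this report.
per_process_ops = {"cdaa813278": 1808394, "e7ae097b0b": 1717524}
reported_workload = {"cdaa813278": 3.472e8, "e7ae097b0b": 3.298e8}

for commit, ops in per_process_ops.items():
    estimated = ops * NR_TASKS
    ratio = estimated / reported_workload[commit]
    # A ratio close to 1.0 means the two metrics agree under the assumption.
    print(f"{commit}: estimated workload {estimated:.4g}, ratio {ratio:.4f}")
```

If the assumption holds, both metrics describe the same 5.0% drop at different scales, so either plot can be used to identify the bisect-bad samples.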



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen

View attachment "config-5.0.0-rc4-00005-ge7ae097" of type "text/plain" (169058 bytes)

View attachment "job-script" of type "text/plain" (7394 bytes)

View attachment "job.yaml" of type "text/plain" (4815 bytes)

View attachment "reproduce" of type "text/plain" (311 bytes)
