Date:   Sat, 21 Sep 2019 23:25:22 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Alexander Duyck <alexander.duyck@...il.com>
Cc:     virtio-dev@...ts.oasis-open.org, kvm@...r.kernel.org,
        mst@...hat.com, david@...hat.com, dave.hansen@...el.com,
        linux-kernel@...r.kernel.org, willy@...radead.org,
        mhocko@...nel.org, linux-mm@...ck.org, vbabka@...e.cz,
        akpm@...ux-foundation.org, mgorman@...hsingularity.net,
        linux-arm-kernel@...ts.infradead.org, osalvador@...e.de,
        yang.zhang.wz@...il.com, pagupta@...hat.com,
        konrad.wilk@...cle.com, nitesh@...hat.com, riel@...riel.com,
        lcapitulino@...hat.com, wei.w.wang@...el.com, aarcange@...hat.com,
        pbonzini@...hat.com, dan.j.williams@...el.com,
        alexander.h.duyck@...ux.intel.com, lkp@...org
Subject: [mm]  0f5b256b2c:  will-it-scale.per_process_ops -1.2% regression

Greetings,

FYI, we noticed a -1.2% regression of will-it-scale.per_process_ops due to commit:


commit: 0f5b256b2c35bf7d0faf874ed01227b4b7cb0118 ("[PATCH v10 3/6] mm: Introduce Reported pages")
url: https://github.com/0day-ci/linux/commits/Alexander-Duyck/mm-virtio-Provide-support-for-unused-page-reporting/20190919-015544


in testcase: will-it-scale
on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 128G memory
with the following parameters:

	nr_task: 100%
	mode: process
	test: page_fault2
	cpufreq_governor: performance
	ucode: 0xb000036

test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see whether the testcase scales. It builds both a process-based and a thread-based variant of each test in order to see any differences between the two; a rough sketch of the page_fault2 workload used here follows below.
test-url: https://github.com/antonblanchard/will-it-scale
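
For orientation, here is a minimal sketch of what the page_fault2 workload does in process mode, assuming the shape of the upstream test ("separate file shared mapping page fault"): each process repeatedly mmaps a shared file-backed region, stores to every page to force a minor fault, then unmaps it, counting one op per page; with nr_task: 100% one copy runs per CPU and the per-process ops rate is reported as will-it-scale.per_process_ops. The mapping size, temp-file path and iteration cap below are illustrative, not the upstream values; see tests/page_fault2.c at the test-url above for the real thing. This shape matches the profile below, where cycles split between the fault path (handle_mm_fault/shmem_fault) and the munmap free path (free_unref_page_list/free_pcppages_bulk under _raw_spin_lock), which is where most of the post-commit delta shows up.

/*
 * Rough sketch of the page_fault2 workload, for orientation only.
 * The authoritative version is tests/page_fault2.c in the will-it-scale
 * repository linked above; size, path template and the fixed iteration
 * count here are illustrative.
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define MEMSIZE (128UL * 1024 * 1024)	/* region faulted in per iteration */

int main(void)
{
	char path[] = "/tmp/wis-page_fault2-XXXXXX";
	long pgsize = sysconf(_SC_PAGESIZE);
	unsigned long long ops = 0;		/* corresponds to per_process_ops */
	int fd = mkstemp(path);

	if (fd < 0) {
		perror("mkstemp");
		return 1;
	}
	unlink(path);				/* file lives only as long as the fd */
	if (ftruncate(fd, MEMSIZE) < 0) {
		perror("ftruncate");
		return 1;
	}

	/* The real harness loops until the run ends; cap it here. */
	for (int iter = 0; iter < 16; iter++) {
		char *c = mmap(NULL, MEMSIZE, PROT_READ | PROT_WRITE,
			       MAP_SHARED, fd, 0);
		if (c == MAP_FAILED) {
			perror("mmap");
			return 1;
		}

		/* One store per page: each takes a minor fault
		 * (shmem_fault when /tmp is tmpfs, as in the profile below). */
		for (unsigned long off = 0; off < MEMSIZE; off += pgsize) {
			c[off] = 0;
			ops++;
		}

		/* Unmapping frees the pages again, hitting the zone-lock
		 * heavy free path (free_unref_page_list/free_pcppages_bulk). */
		munmap(c, MEMSIZE);
	}

	printf("~%llu faults\n", ops);
	close(fd);
	return 0;
}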



If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <rong.a.chen@...el.com>


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-7/performance/x86_64-rhel-7.6/process/100%/debian-x86_64-2019-05-14.cgz/lkp-bdw-ep6/page_fault2/will-it-scale/0xb000036

commit: 
  e10e2ab29d ("mm: Use zone and order instead of free area in free_list manipulators")
  0f5b256b2c ("mm: Introduce Reported pages")

e10e2ab29d6d4ee2 0f5b256b2c35bf7d0faf874ed01 
---------------- --------------------------- 
       fail:runs  %reproduction    fail:runs
           |             |             |    
          1:4          -25%            :4     dmesg.WARNING:at_ip___perf_sw_event/0x
          1:4          -25%            :4     dmesg.WARNING:at_ip__fsnotify_parent/0x
          3:4            1%           3:4     perf-profile.calltrace.cycles-pp.error_entry.testcase
          3:4            1%           3:4     perf-profile.children.cycles-pp.error_entry
          2:4            1%           2:4     perf-profile.self.cycles-pp.error_entry
         %stddev     %change         %stddev
             \          |                \  
     83249            -1.2%      82239        will-it-scale.per_process_ops
   7325970            -1.2%    7237126        will-it-scale.workload
      1137 ±  2%      -2.8%       1105 ±  2%  vmstat.system.cs
      9785           +11.8%      10942 ±  5%  softirqs.CPU0.SCHED
      7282 ± 30%     +30.5%       9504 ±  9%  softirqs.CPU2.RCU
 2.211e+09            -1.2%  2.185e+09        proc-vmstat.numa_hit
 2.211e+09            -1.2%  2.185e+09        proc-vmstat.numa_local
 2.213e+09            -1.2%  2.186e+09        proc-vmstat.pgalloc_normal
 2.204e+09            -1.2%  2.178e+09        proc-vmstat.pgfault
 2.212e+09            -1.2%  2.185e+09        proc-vmstat.pgfree
    232.75 ± 21%    +359.8%       1070 ± 66%  interrupts.37:IR-PCI-MSI.1572868-edge.eth0-TxRx-3
    232.75 ± 21%    +359.8%       1070 ± 66%  interrupts.CPU16.37:IR-PCI-MSI.1572868-edge.eth0-TxRx-3
     34.00 ± 76%    +447.1%     186.00 ±114%  interrupts.CPU18.RES:Rescheduling_interrupts
    318.00 ± 42%     -65.8%     108.75 ± 77%  interrupts.CPU28.RES:Rescheduling_interrupts
    173.50 ± 32%    +143.7%     422.75 ± 28%  interrupts.CPU36.RES:Rescheduling_interrupts
     70.75 ± 73%    +726.1%     584.50 ± 60%  interrupts.CPU39.RES:Rescheduling_interrupts
     66.75 ± 38%     +78.7%     119.25 ± 19%  interrupts.CPU83.RES:Rescheduling_interrupts
    286.00 ± 93%     -88.0%      34.25 ± 97%  interrupts.CPU84.RES:Rescheduling_interrupts
  41205135            -3.7%   39666469        perf-stat.i.branch-misses
      1096 ±  2%      -2.9%       1064 ±  2%  perf-stat.i.context-switches
      3.60            +3.8%       3.74 ±  3%  perf-stat.i.cpi
     34.38 ±  2%      -4.6%      32.80 ±  2%  perf-stat.i.cpu-migrations
    547.67          +323.8%       2321 ± 80%  perf-stat.i.cycles-between-cache-misses
  71391394            -3.5%   68859821 ±  2%  perf-stat.i.dTLB-store-misses
  14877534            -3.1%   14415629 ±  2%  perf-stat.i.iTLB-load-misses
   7256992            -3.0%    7036423 ±  2%  perf-stat.i.minor-faults
 1.272e+08            -3.3%  1.231e+08        perf-stat.i.node-loads
      2.62            +1.0        3.64 ± 25%  perf-stat.i.node-store-miss-rate%
    847863            +4.6%     887118        perf-stat.i.node-store-misses
  31585215            -3.4%   30501990 ±  2%  perf-stat.i.node-stores
   7256096            -3.0%    7035925 ±  2%  perf-stat.i.page-faults
      0.33 ±  3%      +0.0        0.34 ±  2%  perf-stat.overall.node-load-miss-rate%
      2.61            +0.2        2.83 ±  2%  perf-stat.overall.node-store-miss-rate%
   2791987            +1.1%    2822374        perf-stat.overall.path-length
  41058236            -3.7%   39547998        perf-stat.ps.branch-misses
     34.24 ±  2%      -4.5%      32.69        perf-stat.ps.cpu-migrations
  71150671            -3.5%   68679628 ±  2%  perf-stat.ps.dTLB-store-misses
  14827264            -3.0%   14377324        perf-stat.ps.iTLB-load-misses
   7230962            -3.0%    7016784 ±  2%  perf-stat.ps.minor-faults
 1.268e+08            -3.2%  1.228e+08        perf-stat.ps.node-loads
    844990            +4.7%     884796        perf-stat.ps.node-store-misses
  31478503            -3.4%   30422235 ±  2%  perf-stat.ps.node-stores
   7230628            -3.0%    7016690 ±  2%  perf-stat.ps.page-faults
      4.10            -0.7        3.43        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.pagevec_lru_move_fn.__lru_cache_add.alloc_set_pte
      4.13            -0.7        3.47        perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.pagevec_lru_move_fn.__lru_cache_add.alloc_set_pte.finish_fault
      5.35            -0.7        4.69        perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.__lru_cache_add.alloc_set_pte.finish_fault.__handle_mm_fault
      5.44            -0.7        4.78        perf-profile.calltrace.cycles-pp.__lru_cache_add.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault
      7.50            -0.6        6.87        perf-profile.calltrace.cycles-pp.finish_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
      7.41            -0.6        6.78        perf-profile.calltrace.cycles-pp.alloc_set_pte.finish_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault
     53.51            -0.3       53.17        perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
     53.93            -0.3       53.62        perf-profile.calltrace.cycles-pp.handle_mm_fault.__do_page_fault.do_page_fault.page_fault.testcase
      1.90            -0.3        1.60 ±  2%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.release_pages.tlb_flush_mmu.unmap_page_range
      1.92            -0.3        1.62 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.release_pages.tlb_flush_mmu.unmap_page_range.unmap_vmas
     54.98            -0.3       54.69        perf-profile.calltrace.cycles-pp.__do_page_fault.do_page_fault.page_fault.testcase
     55.33            -0.3       55.04        perf-profile.calltrace.cycles-pp.do_page_fault.page_fault.testcase
     61.72            -0.3       61.44        perf-profile.calltrace.cycles-pp.testcase
     59.26            -0.2       59.03        perf-profile.calltrace.cycles-pp.page_fault.testcase
      0.74 ±  2%      -0.2        0.52 ±  3%  perf-profile.calltrace.cycles-pp.__list_del_entry_valid.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.__handle_mm_fault
      4.05            +0.0        4.09        perf-profile.calltrace.cycles-pp.release_pages.tlb_flush_mmu.tlb_finish_mmu.unmap_region.__do_munmap
      4.08            +0.0        4.12        perf-profile.calltrace.cycles-pp.tlb_flush_mmu.tlb_finish_mmu.unmap_region.__do_munmap.__vm_munmap
      4.10            +0.0        4.14        perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
      1.59            +0.1        1.64        perf-profile.calltrace.cycles-pp.shmem_fault.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault
      1.73            +0.1        1.78        perf-profile.calltrace.cycles-pp.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault
      1.21            +0.1        1.28        perf-profile.calltrace.cycles-pp.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault.__handle_mm_fault
      0.97            +0.1        1.03        perf-profile.calltrace.cycles-pp.find_get_entry.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault
      1.38            +0.1        1.44        perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_fault.__do_fault.__handle_mm_fault.handle_mm_fault
      3.71            +0.1        3.79        perf-profile.calltrace.cycles-pp.free_unref_page_list.release_pages.tlb_flush_mmu.tlb_finish_mmu.unmap_region
      3.64            +0.1        3.73        perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_flush_mmu.tlb_finish_mmu
     33.13            +0.2       33.33        perf-profile.calltrace.cycles-pp.unmap_vmas.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
     33.12            +0.2       33.32        perf-profile.calltrace.cycles-pp.unmap_page_range.unmap_vmas.unmap_region.__do_munmap.__vm_munmap
     31.65            +0.2       31.87        perf-profile.calltrace.cycles-pp.release_pages.tlb_flush_mmu.unmap_page_range.unmap_vmas.unmap_region
     31.88            +0.2       32.11        perf-profile.calltrace.cycles-pp.tlb_flush_mmu.unmap_page_range.unmap_vmas.unmap_region.__do_munmap
     37.24            +0.2       37.48        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.munmap
     37.24            +0.2       37.48        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
     37.24            +0.2       37.49        perf-profile.calltrace.cycles-pp.munmap
     37.23            +0.2       37.48        perf-profile.calltrace.cycles-pp.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
     37.23            +0.2       37.48        perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
     37.23            +0.2       37.48        perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
     37.23            +0.2       37.48        perf-profile.calltrace.cycles-pp.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
     28.96            +0.6       29.55        perf-profile.calltrace.cycles-pp.free_unref_page_list.release_pages.tlb_flush_mmu.unmap_page_range.unmap_vmas
     28.48            +0.6       29.07        perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_flush_mmu.unmap_page_range
     30.94            +0.7       31.65        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.free_pcppages_bulk.free_unref_page_list.release_pages
     31.01            +0.7       31.72        perf-profile.calltrace.cycles-pp._raw_spin_lock.free_pcppages_bulk.free_unref_page_list.release_pages.tlb_flush_mmu
      6.32            -1.0        5.30        perf-profile.children.cycles-pp._raw_spin_lock_irqsave
      5.45            -0.7        4.79        perf-profile.children.cycles-pp.__lru_cache_add
      5.36            -0.7        4.70        perf-profile.children.cycles-pp.pagevec_lru_move_fn
      7.46            -0.6        6.82        perf-profile.children.cycles-pp.alloc_set_pte
      7.51            -0.6        6.88        perf-profile.children.cycles-pp.finish_fault
     53.54            -0.3       53.20        perf-profile.children.cycles-pp.__handle_mm_fault
     53.96            -0.3       53.65        perf-profile.children.cycles-pp.handle_mm_fault
     55.00            -0.3       54.71        perf-profile.children.cycles-pp.__do_page_fault
     55.34            -0.3       55.05        perf-profile.children.cycles-pp.do_page_fault
     57.38            -0.3       57.12        perf-profile.children.cycles-pp.page_fault
     62.64            -0.3       62.38        perf-profile.children.cycles-pp.testcase
      0.99            -0.2        0.76        perf-profile.children.cycles-pp.__list_del_entry_valid
      0.39            -0.0        0.36 ±  2%  perf-profile.children.cycles-pp.__mod_lruvec_state
      4.11            +0.0        4.14        perf-profile.children.cycles-pp.tlb_finish_mmu
      1.60            +0.0        1.65        perf-profile.children.cycles-pp.shmem_fault
      1.73            +0.1        1.79        perf-profile.children.cycles-pp.__do_fault
      1.40            +0.1        1.46        perf-profile.children.cycles-pp.shmem_getpage_gfp
      1.23            +0.1        1.29        perf-profile.children.cycles-pp.find_lock_entry
      0.97            +0.1        1.03        perf-profile.children.cycles-pp.find_get_entry
     33.13            +0.2       33.33        perf-profile.children.cycles-pp.unmap_vmas
     33.13            +0.2       33.33        perf-profile.children.cycles-pp.unmap_page_range
     37.24            +0.2       37.49        perf-profile.children.cycles-pp.munmap
     37.23            +0.2       37.48        perf-profile.children.cycles-pp.__do_munmap
     37.23            +0.2       37.48        perf-profile.children.cycles-pp.__x64_sys_munmap
     37.23            +0.2       37.48        perf-profile.children.cycles-pp.__vm_munmap
     37.23            +0.2       37.48        perf-profile.children.cycles-pp.unmap_region
     37.33            +0.2       37.57        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     37.33            +0.2       37.57        perf-profile.children.cycles-pp.do_syscall_64
     35.97            +0.3       36.22        perf-profile.children.cycles-pp.tlb_flush_mmu
     35.83            +0.3       36.09        perf-profile.children.cycles-pp.release_pages
     32.70            +0.7       33.38        perf-profile.children.cycles-pp.free_unref_page_list
     32.15            +0.7       32.84        perf-profile.children.cycles-pp.free_pcppages_bulk
     64.33            +0.9       65.19        perf-profile.children.cycles-pp._raw_spin_lock
      0.98            -0.2        0.75        perf-profile.self.cycles-pp.__list_del_entry_valid
      0.95            -0.0        0.92        perf-profile.self.cycles-pp.free_pcppages_bulk
      0.14 ±  3%      -0.0        0.13 ±  3%  perf-profile.self.cycles-pp.__mod_lruvec_state
      0.26            +0.0        0.27        perf-profile.self.cycles-pp.handle_mm_fault
      0.17 ±  4%      +0.0        0.18 ±  2%  perf-profile.self.cycles-pp.__count_memcg_events
      0.62            +0.1        0.68        perf-profile.self.cycles-pp.find_get_entry
      0.89            +0.2        1.13        perf-profile.self.cycles-pp.get_page_from_freelist


                                                                                
                            will-it-scale.per_process_ops                       
                                                                                
  86000 +-+-----------------------------------------------------------------+   
        |.+.+.+..+.+.+.+.+.+.+.+..+.+.+.+.+.+.+.+..+.+.+.+.+.+.+.+..+       |   
  84000 +-+                                                          +      |   
        |                                                             +.+.+.|   
        |                                                    O   O  O       |   
  82000 +-+            O O O O O  O O O O O O O O  O O O O O   O            |   
        |                                                                   |   
  80000 +-+                                                                 |   
        |                                                                   |   
  78000 +-+        O                                                        |   
        | O          O                                                      |   
        |   O    O                                                          |   
  76000 +-+                                                                 |   
        O     O                                                             |   
  74000 +-+-----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.3.0-03842-g0f5b256b2c35b" of type "text/plain" (199885 bytes)

View attachment "job-script" of type "text/plain" (7560 bytes)

View attachment "job.yaml" of type "text/plain" (5178 bytes)

View attachment "reproduce" of type "text/plain" (315 bytes)
