Message-ID: <20200415051512.GS8179@shao2-debian>
Date: Wed, 15 Apr 2020 13:15:12 +0800
From: kernel test robot <rong.a.chen@...el.com>
To: liliangleo <liliang.opensource@...il.com>
Cc: Alexander Duyck <alexander.h.duyck@...ux.intel.com>,
Mel Gorman <mgorman@...hsingularity.net>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
Andrea Arcangeli <aarcange@...hat.com>,
Dan Williams <dan.j.williams@...el.com>,
Dave Hansen <dave.hansen@...el.com>,
David Hildenbrand <david@...hat.com>,
Michal Hocko <mhocko@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Alex Williamson <alex.williamson@...hat.com>, lkp@...ts.01.org
Subject: [mm] 5ae8a9d7c8: will-it-scale.per_thread_ops -2.1% regression
Greetings,
FYI, we noticed a -2.1% regression of will-it-scale.per_thread_ops due to commit:
commit: 5ae8a9d7c84e7e6fa64ccaa357a1351015f1457c ("[RFC PATCH 4/4] mm: Add PG_zero support")
url: https://github.com/0day-ci/linux/commits/liliangleo/mm-Add-PG_zero-support/20200412-172834
in testcase: will-it-scale
on test machine: 8 threads Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz with 16G memory
with following parameters:
nr_task: 100%
mode: thread
test: page_fault1
cpufreq_governor: performance
ucode: 0x21
test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both process-based and thread-based variants of each test in order to see any differences between the two.
test-url: https://github.com/antonblanchard/will-it-scale
If you fix the issue, kindly add the following tag
Reported-by: kernel test robot <rong.a.chen@...el.com>
Details are as below:
-------------------------------------------------------------------------------------------------->
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml
=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
gcc-7/performance/x86_64-rhel-7.6/thread/100%/debian-x86_64-20191114.cgz/lkp-ivb-d01/page_fault1/will-it-scale/0x21
commit:
0801ffd19f ("mm: add sys fs configuration for page reporting")
5ae8a9d7c8 ("mm: Add PG_zero support")
0801ffd19fa82207 5ae8a9d7c84e7e6fa64ccaa357a
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:4 25% 1:4 dmesg.RIP:__mnt_want_write
:4 25% 1:4 dmesg.RIP:loop
1:4 -25% :4 dmesg.RIP:poll_idle
1:4 -25% :4 kmsg.b44449d>]usb_hcd_irq
:4 25% 1:4 kmsg.b5a1a>]usb_hcd_irq
1:4 -25% :4 kmsg.c3ed91f>]usb_hcd_irq
:4 25% 1:4 kmsg.c6fae9f>]usb_hcd_irq
1:4 -25% :4 kmsg.c8b7ca>]usb_hcd_irq
%stddev %change %stddev
\ | \
536081 -2.1% 524633 will-it-scale.per_thread_ops
2523521 -2.1% 2470098 will-it-scale.time.minor_page_faults
107.85 -4.2% 103.29 will-it-scale.time.user_time
511850 -3.3% 495188 will-it-scale.time.voluntary_context_switches
4288652 -2.1% 4197068 will-it-scale.workload
4.87 +0.8 5.63 mpstat.cpu.all.idle%
3991 +10.2% 4397 ± 5% slabinfo.anon_vma.num_objs
142485 ± 8% -15.3% 120695 ± 3% softirqs.CPU7.TIMER
3005147 ± 5% +129.0% 6881725 ± 28% cpuidle.C1.time
66503 ± 2% +55.0% 103053 ± 22% cpuidle.C1.usage
32502781 ± 14% +26.6% 41156255 cpuidle.C3.time
23839328 ± 16% +45.3% 34642724 ± 8% cpuidle.C6.time
41675 ± 15% +58.6% 66107 ± 10% cpuidle.C6.usage
246196 -4.0% 236260 interrupts.CAL:Function_call_interrupts
8406 ± 32% +27.0% 10673 ± 27% interrupts.CPU4.NMI:Non-maskable_interrupts
8406 ± 32% +27.0% 10673 ± 27% interrupts.CPU4.PMI:Performance_monitoring_interrupts
11617 ± 21% -41.5% 6801 interrupts.CPU5.NMI:Non-maskable_interrupts
11617 ± 21% -41.5% 6801 interrupts.CPU5.PMI:Performance_monitoring_interrupts
320147 ± 22% -26.9% 233880 ± 2% sched_debug.cfs_rq:/.load.max
18580 ± 24% -32.6% 12525 ± 28% sched_debug.cpu.nr_switches.stddev
18333 ± 27% -34.9% 11930 ± 26% sched_debug.cpu.sched_count.stddev
8807 ± 28% -31.8% 6009 ± 19% sched_debug.cpu.ttwu_count.stddev
8775 ± 26% -31.9% 5973 ± 22% sched_debug.cpu.ttwu_local.stddev
5412715 -2.1% 5298836 proc-vmstat.numa_hit
5412715 -2.1% 5298836 proc-vmstat.numa_local
1.291e+09 -2.1% 1.265e+09 proc-vmstat.pgalloc_normal
2908824 -1.7% 2858835 proc-vmstat.pgfault
1.291e+09 -2.1% 1.265e+09 proc-vmstat.pgfree
2516340 -2.1% 2464229 proc-vmstat.thp_fault_alloc
86.60 -27.4 59.23 perf-profile.calltrace.cycles-pp.clear_page_erms.clear_subpage.clear_huge_page.do_huge_pmd_anonymous_page.__handle_mm_fault
95.16 -0.6 94.56 perf-profile.calltrace.cycles-pp.page_fault
94.97 -0.6 94.38 perf-profile.calltrace.cycles-pp.handle_mm_fault.do_page_fault.page_fault
95.12 -0.6 94.53 perf-profile.calltrace.cycles-pp.do_page_fault.page_fault
94.88 -0.6 94.29 perf-profile.calltrace.cycles-pp.do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault.do_page_fault.page_fault
94.95 -0.6 94.36 perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_page_fault.page_fault
90.91 -0.4 90.56 perf-profile.calltrace.cycles-pp.clear_huge_page.do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault.do_page_fault
3.37 -0.2 3.18 perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages_nodemask.alloc_pages_vma.do_huge_pmd_anonymous_page.__handle_mm_fault
3.38 -0.2 3.19 perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.alloc_pages_vma.do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault
3.38 -0.2 3.20 perf-profile.calltrace.cycles-pp.alloc_pages_vma.do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault.do_page_fault
0.87 ± 2% -0.1 0.79 ± 4% perf-profile.calltrace.cycles-pp.rcu_all_qs._cond_resched.clear_huge_page.do_huge_pmd_anonymous_page.__handle_mm_fault
2.49 ± 4% +0.5 3.03 ± 4% perf-profile.calltrace.cycles-pp.munmap
2.48 ± 4% +0.5 3.02 ± 4% perf-profile.calltrace.cycles-pp.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
2.48 ± 4% +0.5 3.02 ± 4% perf-profile.calltrace.cycles-pp.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
2.48 ± 4% +0.5 3.03 ± 4% perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.munmap
2.48 ± 4% +0.5 3.03 ± 4% perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.munmap
2.44 ± 4% +0.5 2.99 ± 4% perf-profile.calltrace.cycles-pp.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64.entry_SYSCALL_64_after_hwframe
2.42 ± 4% +0.6 2.97 ± 4% perf-profile.calltrace.cycles-pp.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap.do_syscall_64
2.37 ± 4% +0.6 2.93 ± 4% perf-profile.calltrace.cycles-pp.tlb_finish_mmu.unmap_region.__do_munmap.__vm_munmap.__x64_sys_munmap
2.37 ± 4% +0.6 2.93 ± 4% perf-profile.calltrace.cycles-pp.tlb_flush_mmu.tlb_finish_mmu.unmap_region.__do_munmap.__vm_munmap
2.32 ± 4% +0.6 2.88 ± 4% perf-profile.calltrace.cycles-pp.release_pages.tlb_flush_mmu.tlb_finish_mmu.unmap_region.__do_munmap
2.19 ± 4% +0.6 2.77 ± 4% perf-profile.calltrace.cycles-pp.__free_pages_ok.release_pages.tlb_flush_mmu.tlb_finish_mmu.unmap_region
0.00 +2.0 2.04 ± 2% perf-profile.calltrace.cycles-pp.clear_zero_page_flag.__free_pages_ok.release_pages.tlb_flush_mmu.tlb_finish_mmu
87.03 -27.5 59.52 perf-profile.children.cycles-pp.clear_page_erms
95.19 -0.6 94.59 perf-profile.children.cycles-pp.page_fault
95.14 -0.6 94.55 perf-profile.children.cycles-pp.do_page_fault
94.97 -0.6 94.38 perf-profile.children.cycles-pp.__handle_mm_fault
94.88 -0.6 94.29 perf-profile.children.cycles-pp.do_huge_pmd_anonymous_page
94.99 -0.6 94.41 perf-profile.children.cycles-pp.handle_mm_fault
90.97 -0.3 90.63 perf-profile.children.cycles-pp.clear_huge_page
88.69 -0.3 88.41 perf-profile.children.cycles-pp.clear_subpage
3.60 -0.2 3.40 perf-profile.children.cycles-pp.__alloc_pages_nodemask
3.58 -0.2 3.38 perf-profile.children.cycles-pp.get_page_from_freelist
3.40 -0.2 3.21 perf-profile.children.cycles-pp.alloc_pages_vma
0.71 ± 14% -0.2 0.53 ± 5% perf-profile.children.cycles-pp.apic_timer_interrupt
0.97 ± 3% -0.1 0.87 ± 3% perf-profile.children.cycles-pp._cond_resched
0.89 ± 2% -0.1 0.80 ± 3% perf-profile.children.cycles-pp.rcu_all_qs
0.65 ± 2% -0.0 0.60 ± 2% perf-profile.children.cycles-pp.prep_new_page
0.25 ± 9% -0.0 0.21 ± 7% perf-profile.children.cycles-pp.__hrtimer_run_queues
0.50 -0.0 0.47 ± 2% perf-profile.children.cycles-pp.prep_compound_page
2.49 ± 4% +0.5 3.03 ± 4% perf-profile.children.cycles-pp.munmap
2.48 ± 4% +0.5 3.03 ± 4% perf-profile.children.cycles-pp.__vm_munmap
2.48 ± 4% +0.5 3.03 ± 4% perf-profile.children.cycles-pp.__x64_sys_munmap
2.38 ± 4% +0.5 2.93 ± 4% perf-profile.children.cycles-pp.tlb_flush_mmu
2.45 ± 4% +0.6 3.00 ± 4% perf-profile.children.cycles-pp.__do_munmap
2.42 ± 4% +0.6 2.98 ± 4% perf-profile.children.cycles-pp.unmap_region
2.38 ± 4% +0.6 2.94 ± 4% perf-profile.children.cycles-pp.tlb_finish_mmu
2.33 ± 4% +0.6 2.90 ± 4% perf-profile.children.cycles-pp.release_pages
2.78 ± 3% +0.6 3.35 ± 3% perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
2.78 ± 3% +0.6 3.35 ± 3% perf-profile.children.cycles-pp.do_syscall_64
2.20 ± 4% +0.6 2.77 ± 3% perf-profile.children.cycles-pp.__free_pages_ok
0.00 +2.0 2.04 ± 3% perf-profile.children.cycles-pp.clear_zero_page_flag
86.48 -27.2 59.24 perf-profile.self.cycles-pp.clear_page_erms
2.09 ± 4% -1.5 0.63 ± 7% perf-profile.self.cycles-pp.__free_pages_ok
0.58 ± 2% -0.1 0.43 ± 7% perf-profile.self.cycles-pp.rcu_all_qs
0.50 ± 2% -0.0 0.47 ± 3% perf-profile.self.cycles-pp.prep_compound_page
0.37 ± 5% +0.0 0.41 ± 2% perf-profile.self.cycles-pp._cond_resched
0.00 +2.0 2.03 ± 3% perf-profile.self.cycles-pp.clear_zero_page_flag
1.87 ± 2% +27.0 28.87 perf-profile.self.cycles-pp.clear_subpage
will-it-scale.per_thread_ops
538000 +------------------------------------------------------------------+
| ++ + .++++.++++ ++ + ++ ++: +|
536000 |-+ ++ : +++.++ +.+ |
534000 |-+ +.++++.+ + |
| |
532000 |-+ |
| |
530000 |-+ |
| |
528000 |-+ |
526000 |-+ |
| O OO O OO |
524000 |O+ OO OOO OO OOOO OOOO |
| |
522000 +------------------------------------------------------------------+
will-it-scale.workload
4.32e+06 +----------------------------------------------------------------+
| |
4.3e+06 |++.+++++.+ ++. +++. ++++.++++.+++++.+++++.++ |
| : : ++ + : .++ +.++|
4.28e+06 |-+ ++ :++.+++++ ++ |
| + |
4.26e+06 |-+ |
| |
4.24e+06 |-+ |
| |
4.22e+06 |-+ |
| |
4.2e+06 |-O OOOOO OOOOO OOOO OOO |
|O O O O |
4.18e+06 +----------------------------------------------------------------+
will-it-scale.time.minor_page_faults
2.54e+06 +----------------------------------------------------------------+
|+ .+ + +. + +. |
2.53e+06 |-+ ++++.+ ++.+++++. + + ++++.++ + +++++.++ |
2.52e+06 |-+ : : + : ++.++ +.++|
| ++ +++.+++ ++ |
2.51e+06 |-+ |
| |
2.5e+06 |-+ |
| |
2.49e+06 |-+ |
2.48e+06 |-+ |
| |
2.47e+06 |-O OOOOO OOOOO OOOO OOO |
|O O O O |
2.46e+06 +----------------------------------------------------------------+
[*] bisect-good sample
[O] bisect-bad sample
Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.
Thanks,
Rong Chen
View attachment "config-5.6.0-12710-g5ae8a9d7c84e7" of type "text/plain" (206185 bytes)
View attachment "job-script" of type "text/plain" (7456 bytes)
View attachment "job.yaml" of type "text/plain" (5121 bytes)
View attachment "reproduce" of type "text/plain" (313 bytes)