Date:   Wed, 3 Jan 2018 11:05:25 +0900
From:   Joonsoo Kim <iamjoonsoo.kim@....com>
To:     kernel test robot <xiaolong.ye@...el.com>
Cc:     Stephen Rothwell <sfr@...b.auug.org.au>,
        "Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
        Tony Lindgren <tony@...mide.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Johannes Weiner <hannes@...xchg.org>,
        Laura Abbott <lauraa@...eaurora.org>,
        Marek Szyprowski <m.szyprowski@...sung.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Michal Hocko <mhocko@...e.com>,
        Michal Nazarewicz <mina86@...a86.com>,
        Minchan Kim <minchan@...nel.org>,
        Rik van Riel <riel@...hat.com>,
        Russell King <linux@...linux.org.uk>,
        Will Deacon <will.deacon@....com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...org
Subject: Re: [lkp-robot] [mm/cma]  2b0f904a5a:  fio.read_bw_MBps -16.1%
 regression

Hello!

On Tue, Jan 02, 2018 at 02:35:28PM +0800, kernel test robot wrote:
> 
> Greeting,
> 
> FYI, we noticed a -16.1% regression of fio.read_bw_MBps due to commit:
> 
> 
> commit: 2b0f904a5a8781498417d67226fd12c5e56053ae ("mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> 
> in testcase: fio-basic
> on test machine: 56 threads Intel(R) Xeon(R) CPU E5-2695 v3 @ 2.30GHz with 256G memory
> with following parameters:
> 
> 	disk: 2pmem
> 	fs: ext4
> 	runtime: 200s
> 	nr_task: 50%
> 	time_based: tb
> 	rw: randread
> 	bs: 2M
> 	ioengine: mmap
> 	test_size: 200G
> 	cpufreq_governor: performance
> 
> test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
> test-url: https://github.com/axboe/fio
> 
> 
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> 
> To reproduce:
> 
>         git clone https://github.com/intel/lkp-tests.git
>         cd lkp-tests
>         bin/lkp install job.yaml  # job file is attached in this email
>         bin/lkp run     job.yaml
> 
> =========================================================================================
> bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/time_based:
>   2M/gcc-7/performance/2pmem/ext4/mmap/x86_64-rhel-7.2/50%/debian-x86_64-2016-08-31.cgz/200s/randread/lkp-hsw-ep6/200G/fio-basic/tb
> 
> commit: 
>   f6572f9cd2 ("mm/page_alloc: don't reserve ZONE_HIGHMEM for ZONE_MOVABLE request")
>   2b0f904a5a ("mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE")
> 
> f6572f9cd248df2c 2b0f904a5a8781498417d67226 
> ---------------- -------------------------- 
>          %stddev     %change         %stddev
>              \          |                \  
>      11451           -16.1%       9605        fio.read_bw_MBps
>       0.29 ±  5%      +0.1        0.40 ±  3%  fio.latency_1000us%
>      19.35 ±  5%      -4.7       14.69 ±  3%  fio.latency_10ms%
>       7.92 ±  3%     +12.2       20.15        fio.latency_20ms%
>       0.05 ± 11%      +0.0        0.09 ±  8%  fio.latency_2ms%
>      70.22            -8.9       61.36        fio.latency_4ms%
>       0.29 ± 13%      +0.0        0.33 ±  3%  fio.latency_500us%
>       0.45 ± 29%      +1.0        1.45 ±  4%  fio.latency_50ms%
>       1.37            +0.1        1.44        fio.latency_750us%
>       9792           +31.7%      12896        fio.read_clat_90%_us
>      10560           +33.0%      14048        fio.read_clat_95%_us
>      15376 ± 10%     +46.9%      22592        fio.read_clat_99%_us
>       4885           +19.2%       5825        fio.read_clat_mean_us
>       5725           -16.1%       4802        fio.read_iops
>  4.598e+09           -16.4%  3.845e+09        fio.time.file_system_inputs
>     453153            -8.4%     415215        fio.time.involuntary_context_switches
>  5.748e+08           -16.4%  4.806e+08        fio.time.major_page_faults
>    1822257           +23.7%    2254706        fio.time.maximum_resident_set_size
>       5089            +1.6%       5172        fio.time.system_time
>     514.50           -16.3%     430.48        fio.time.user_time

System time increased and user time decreased. There is a clue in the stats below.

>      24569 ±  2%      +9.6%      26917 ±  2%  fio.time.voluntary_context_switches
>   54443725           -14.9%   46353339        interrupts.CAL:Function_call_interrupts
>       0.00 ± 79%      -0.0        0.00 ± 17%  mpstat.cpu.iowait%
>       4.45            -0.7        3.71        mpstat.cpu.usr%
>    1467516           +21.3%    1779543 ±  3%  meminfo.Active
>    1276031           +23.7%    1578443 ±  4%  meminfo.Active(file)
>      25789 ±  3%     -76.7%       6013 ±  4%  meminfo.CmaFree
>  1.296e+08           -12.6%  1.133e+08        turbostat.IRQ
>      41.89            -3.4%      40.47        turbostat.RAMWatt
>      17444 ±  2%     -13.5%      15092 ±  3%  turbostat.SMI
>   10896428           -16.4%    9111830        vmstat.io.bi
>       6010            -6.2%       5637        vmstat.system.cs
>     317438           -12.1%     278980        vmstat.system.in
>    1072892 ±  3%     +21.5%    1303487        numa-meminfo.node0.Active
>     978318           +21.6%    1189809 ±  2%  numa-meminfo.node0.Active(file)
>     222968           -25.2%     166818        numa-meminfo.node0.PageTables
>      47374 ±  2%     +10.6%      52402 ±  7%  numa-meminfo.node0.SUnreclaim
>     165213           +31.9%     217870        numa-meminfo.node1.PageTables
>     222405           +10.4%     245633 ±  2%  numa-meminfo.node1.SReclaimable
>     102992 ± 46%     -80.8%      19812 ± 38%  numa-meminfo.node1.Shmem
>  2.475e+08 ±  2%     -24.0%  1.881e+08        numa-numastat.node0.local_node
>   39371795 ± 14%    +167.1%  1.052e+08 ±  2%  numa-numastat.node0.numa_foreign
>  2.475e+08 ±  2%     -24.0%  1.881e+08        numa-numastat.node0.numa_hit
>   31890417 ± 17%     +40.2%   44705135 ±  8%  numa-numastat.node0.numa_miss
>   31899482 ± 17%     +40.2%   44713255 ±  8%  numa-numastat.node0.other_node
>  2.566e+08 ±  2%     -44.2%  1.433e+08        numa-numastat.node1.local_node
>   31890417 ± 17%     +40.2%   44705135 ±  8%  numa-numastat.node1.numa_foreign
>  2.566e+08 ±  2%     -44.2%  1.433e+08        numa-numastat.node1.numa_hit
>   39371795 ± 14%    +167.1%  1.052e+08 ±  2%  numa-numastat.node1.numa_miss
>   39373660 ± 14%    +167.1%  1.052e+08 ±  2%  numa-numastat.node1.other_node
>       6047 ± 39%     -66.5%       2028 ± 63%  sched_debug.cfs_rq:/.exec_clock.min
>     461.37 ±  8%     +64.9%     760.74 ± 20%  sched_debug.cfs_rq:/.load_avg.avg
>       1105 ± 13%   +1389.3%      16467 ± 56%  sched_debug.cfs_rq:/.load_avg.max
>     408.99 ±  3%    +495.0%       2433 ± 49%  sched_debug.cfs_rq:/.load_avg.stddev
>      28746 ± 12%     -18.7%      23366 ± 14%  sched_debug.cfs_rq:/.min_vruntime.min
>     752426 ±  3%     -12.7%     656636 ±  4%  sched_debug.cpu.avg_idle.avg
>     144956 ± 61%     -85.4%      21174 ± 26%  sched_debug.cpu.avg_idle.min
>     245684 ± 11%     +44.6%     355257 ±  2%  sched_debug.cpu.avg_idle.stddev
>     236035 ± 15%     +51.8%     358264 ± 16%  sched_debug.cpu.nr_switches.max
>      42039 ± 22%     +34.7%      56616 ±  8%  sched_debug.cpu.nr_switches.stddev
>       3204 ± 24%     -48.1%       1663 ± 30%  sched_debug.cpu.sched_count.min
>       2132 ± 25%     +38.7%       2957 ± 11%  sched_debug.cpu.sched_count.stddev
>      90.67 ± 32%     -71.8%      25.58 ± 26%  sched_debug.cpu.sched_goidle.min
>       6467 ± 15%     +22.3%       7912 ± 15%  sched_debug.cpu.ttwu_count.max
>       1513 ± 27%     -55.7%     670.92 ± 22%  sched_debug.cpu.ttwu_count.min
>       1025 ± 20%     +68.4%       1727 ±  9%  sched_debug.cpu.ttwu_count.stddev
>       1057 ± 16%     -62.9%     391.85 ± 31%  sched_debug.cpu.ttwu_local.min
>     244876           +21.6%     297770 ±  2%  numa-vmstat.node0.nr_active_file
>      88.00 ±  5%     +19.3%     105.00 ±  5%  numa-vmstat.node0.nr_isolated_file
>      55778           -25.1%      41765        numa-vmstat.node0.nr_page_table_pages
>      11843 ±  2%     +10.6%      13100 ±  7%  numa-vmstat.node0.nr_slab_unreclaimable
>     159.25 ± 42%     -74.9%      40.00 ± 52%  numa-vmstat.node0.nr_vmscan_immediate_reclaim
>     244862           +21.6%     297739 ±  2%  numa-vmstat.node0.nr_zone_active_file
>   19364320 ± 19%    +187.2%   55617595 ±  2%  numa-vmstat.node0.numa_foreign
>     268155 ±  3%     +49.6%     401089 ±  4%  numa-vmstat.node0.workingset_activate
>  1.229e+08           -19.0%   99590617        numa-vmstat.node0.workingset_refault
>       6345 ±  3%     -76.5%       1489 ±  3%  numa-vmstat.node1.nr_free_cma
>      41335           +32.0%      54552        numa-vmstat.node1.nr_page_table_pages
>      25770 ± 46%     -80.8%       4956 ± 38%  numa-vmstat.node1.nr_shmem
>      55684           +10.4%      61475 ±  2%  numa-vmstat.node1.nr_slab_reclaimable
>  1.618e+08 ±  8%     -47.6%   84846798 ± 17%  numa-vmstat.node1.numa_hit
>  1.617e+08 ±  8%     -47.6%   84676284 ± 17%  numa-vmstat.node1.numa_local
>   19365342 ± 19%    +187.2%   55620100 ±  2%  numa-vmstat.node1.numa_miss
>   19534837 ± 19%    +185.6%   55790654 ±  2%  numa-vmstat.node1.numa_other
>  1.296e+08           -21.0%  1.024e+08        numa-vmstat.node1.workingset_refault
>  1.832e+12            -7.5%  1.694e+12        perf-stat.branch-instructions
>       0.25            -0.0        0.23        perf-stat.branch-miss-rate%
>  4.666e+09           -16.0%  3.918e+09        perf-stat.branch-misses
>      39.88            +1.1       40.98        perf-stat.cache-miss-rate%
>  2.812e+10           -11.6%  2.485e+10        perf-stat.cache-misses
>  7.051e+10           -14.0%  6.064e+10        perf-stat.cache-references
>    1260521            -6.1%    1183071        perf-stat.context-switches
>       1.87            +9.6%       2.05        perf-stat.cpi
>       6707 ±  2%      -5.2%       6359        perf-stat.cpu-migrations
>       1.04 ± 11%      -0.3        0.77 ±  4%  perf-stat.dTLB-load-miss-rate%
>  2.365e+10 ±  7%     -25.9%  1.751e+10 ±  9%  perf-stat.dTLB-load-misses
>   1.05e+12 ±  4%      -9.5%  9.497e+11 ±  2%  perf-stat.dTLB-stores
>      28.16            +2.2       30.35 ±  2%  perf-stat.iTLB-load-miss-rate%
>   2.56e+08           -10.4%  2.295e+08        perf-stat.iTLB-loads
>  8.974e+12            -9.2%  8.151e+12        perf-stat.instructions
>      89411            -8.8%      81529        perf-stat.instructions-per-iTLB-miss
>       0.54            -8.8%       0.49        perf-stat.ipc
>  5.748e+08           -16.4%  4.806e+08        perf-stat.major-faults
>      52.82            +5.8       58.61 ±  2%  perf-stat.node-load-miss-rate%
>  7.206e+09 ±  2%     -18.6%  5.867e+09 ±  3%  perf-stat.node-loads
>      17.96 ±  8%     +15.7       33.69 ±  2%  perf-stat.node-store-miss-rate%
>  2.055e+09 ±  8%     +65.1%  3.393e+09 ±  4%  perf-stat.node-store-misses
>  9.391e+09 ±  2%     -28.9%  6.675e+09        perf-stat.node-stores
>  5.753e+08           -16.4%  4.811e+08        perf-stat.page-faults
>     305865           -16.3%     256108        proc-vmstat.allocstall_movable
>       1923 ± 14%     -72.1%     537.00 ± 12%  proc-vmstat.allocstall_normal
>       0.00            +Inf%       1577 ± 67%  proc-vmstat.compact_isolated
>       1005 ±  4%     -65.8%     344.00 ±  7%  proc-vmstat.kswapd_low_wmark_hit_quickly
>     320062           +23.2%     394374 ±  4%  proc-vmstat.nr_active_file
>       6411 ±  2%     -76.4%       1511 ±  4%  proc-vmstat.nr_free_cma
>     277.00 ± 12%     -51.4%     134.75 ± 52%  proc-vmstat.nr_vmscan_immediate_reclaim
>     320049           +23.2%     394353 ±  4%  proc-vmstat.nr_zone_active_file
>   71262212 ± 15%    +110.3%  1.499e+08 ±  3%  proc-vmstat.numa_foreign
>  5.042e+08 ±  2%     -34.3%  3.314e+08        proc-vmstat.numa_hit
>  5.041e+08 ±  2%     -34.3%  3.314e+08        proc-vmstat.numa_local
>   71262212 ± 15%    +110.3%  1.499e+08 ±  3%  proc-vmstat.numa_miss
>   71273176 ± 15%    +110.3%  1.499e+08 ±  3%  proc-vmstat.numa_other
>       1007 ±  4%     -65.6%     346.25 ±  7%  proc-vmstat.pageoutrun
>   23070268           -16.0%   19386190        proc-vmstat.pgalloc_dma32
>  5.525e+08           -16.7%  4.603e+08        proc-vmstat.pgalloc_normal
>  5.753e+08           -16.4%  4.812e+08        proc-vmstat.pgfault
>  5.751e+08           -16.3%  4.813e+08        proc-vmstat.pgfree
>  5.748e+08           -16.4%  4.806e+08        proc-vmstat.pgmajfault
>  2.299e+09           -16.4%  1.923e+09        proc-vmstat.pgpgin
>  8.396e+08           -17.8%  6.901e+08        proc-vmstat.pgscan_direct
>  3.018e+08 ±  2%     -13.0%  2.627e+08        proc-vmstat.pgscan_kswapd
>    4.1e+08           -15.1%   3.48e+08        proc-vmstat.pgsteal_direct
>  1.542e+08 ±  3%     -20.9%   1.22e+08 ±  3%  proc-vmstat.pgsteal_kswapd
>      23514 ±  4%     -23.1%      18076 ± 16%  proc-vmstat.slabs_scanned
>     343040 ±  2%     +40.3%     481253 ±  2%  proc-vmstat.workingset_activate
>  2.525e+08           -20.1%  2.018e+08        proc-vmstat.workingset_refault
>      13.64 ±  3%      -1.7       11.96 ±  2%  perf-profile.calltrace.cycles-pp.ext4_mpage_readpages.filemap_fault.ext4_filemap_fault.__do_fault.__handle_mm_fault
>      11.67 ±  3%      -1.4       10.29 ±  2%  perf-profile.calltrace.cycles-pp.submit_bio.ext4_mpage_readpages.filemap_fault.ext4_filemap_fault.__do_fault
>      11.64 ±  3%      -1.4       10.25 ±  2%  perf-profile.calltrace.cycles-pp.generic_make_request.submit_bio.ext4_mpage_readpages.filemap_fault.ext4_filemap_fault
>      11.10 ±  3%      -1.3        9.82 ±  2%  perf-profile.calltrace.cycles-pp.pmem_make_request.generic_make_request.submit_bio.ext4_mpage_readpages.filemap_fault
>       9.21 ±  3%      -1.2        8.04 ±  3%  perf-profile.calltrace.cycles-pp.pmem_do_bvec.pmem_make_request.generic_make_request.submit_bio.ext4_mpage_readpages
>      27.33 ±  4%      -1.0       26.35 ±  5%  perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64
>      27.33 ±  4%      -1.0       26.35 ±  5%  perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
>      27.33 ±  4%      -1.0       26.35 ±  5%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64
>      27.33 ±  4%      -1.0       26.35 ±  5%  perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64
>      26.79 ±  4%      -0.8       25.98 ±  5%  perf-profile.calltrace.cycles-pp.intel_idle.cpuidle_enter_state.do_idle.cpu_startup_entry.start_secondary
>      27.98 ±  3%      -0.8       27.22 ±  4%  perf-profile.calltrace.cycles-pp.secondary_startup_64
>       5.36 ± 12%      -0.6        4.76 ±  7%  perf-profile.calltrace.cycles-pp.kswapd.kthread.ret_from_fork
>       5.36 ± 12%      -0.6        4.76 ±  7%  perf-profile.calltrace.cycles-pp.shrink_node.kswapd.kthread.ret_from_fork
>       5.30 ± 12%      -0.6        4.71 ±  7%  perf-profile.calltrace.cycles-pp.shrink_inactive_list.shrink_node_memcg.shrink_node.kswapd.kthread
>       5.35 ± 12%      -0.6        4.76 ±  7%  perf-profile.calltrace.cycles-pp.shrink_node_memcg.shrink_node.kswapd.kthread.ret_from_fork
>       5.43 ± 12%      -0.5        4.88 ±  7%  perf-profile.calltrace.cycles-pp.ret_from_fork
>       5.43 ± 12%      -0.5        4.88 ±  7%  perf-profile.calltrace.cycles-pp.kthread.ret_from_fork
>      11.04 ±  2%      -0.2       10.82 ±  2%  perf-profile.calltrace.cycles-pp.shrink_page_list.shrink_inactive_list.shrink_node_memcg.shrink_node.do_try_to_free_pages
>      62.44 ±  2%      +1.9       64.38        perf-profile.calltrace.cycles-pp.page_fault
>      62.38 ±  2%      +2.0       64.33        perf-profile.calltrace.cycles-pp.__do_page_fault.do_page_fault.page_fault
>      62.38 ±  2%      +2.0       64.34        perf-profile.calltrace.cycles-pp.do_page_fault.page_fault
>      61.52 ±  2%      +2.1       63.58        perf-profile.calltrace.cycles-pp.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
>      61.34 ±  2%      +2.1       63.44        perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
>      30.18 ±  3%      +2.3       32.45 ±  2%  perf-profile.calltrace.cycles-pp.shrink_inactive_list.shrink_node_memcg.shrink_node.do_try_to_free_pages.try_to_free_pages
>       7.98 ±  3%      +2.3       10.33 ±  2%  perf-profile.calltrace.cycles-pp.add_to_page_cache_lru.filemap_fault.ext4_filemap_fault.__do_fault.__handle_mm_fault
>      30.48 ±  3%      +2.4       32.83 ±  2%  perf-profile.calltrace.cycles-pp.try_to_free_pages.__alloc_pages_slowpath.__alloc_pages_nodemask.filemap_fault.ext4_filemap_fault
>      30.46 ±  3%      +2.4       32.81 ±  2%  perf-profile.calltrace.cycles-pp.do_try_to_free_pages.try_to_free_pages.__alloc_pages_slowpath.__alloc_pages_nodemask.filemap_fault
>      30.46 ±  3%      +2.4       32.81 ±  2%  perf-profile.calltrace.cycles-pp.shrink_node.do_try_to_free_pages.try_to_free_pages.__alloc_pages_slowpath.__alloc_pages_nodemask
>      30.37 ±  3%      +2.4       32.75 ±  2%  perf-profile.calltrace.cycles-pp.shrink_node_memcg.shrink_node.do_try_to_free_pages.try_to_free_pages.__alloc_pages_slowpath
>       5.58 ±  4%      +2.5        8.08 ±  2%  perf-profile.calltrace.cycles-pp.__lru_cache_add.add_to_page_cache_lru.filemap_fault.ext4_filemap_fault.__do_fault
>      32.88 ±  3%      +2.5       35.38 ±  2%  perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.filemap_fault.ext4_filemap_fault.__do_fault.__handle_mm_fault
>       5.51 ±  4%      +2.5        8.02 ±  2%  perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.__lru_cache_add.add_to_page_cache_lru.filemap_fault.ext4_filemap_fault
>       4.24 ±  4%      +2.5        6.76 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.pagevec_lru_move_fn.__lru_cache_add.add_to_page_cache_lru.filemap_fault
>       4.18 ±  4%      +2.5        6.70 ±  2%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.pagevec_lru_move_fn.__lru_cache_add.add_to_page_cache_lru
>      18.64 ±  3%      +2.5       21.16 ±  2%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.shrink_inactive_list.shrink_node_memcg.shrink_node
>      31.65 ±  3%      +2.7       34.31 ±  2%  perf-profile.calltrace.cycles-pp.__alloc_pages_slowpath.__alloc_pages_nodemask.filemap_fault.ext4_filemap_fault.__do_fault
>      17.21 ±  3%      +2.7       19.93 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.shrink_inactive_list.shrink_node_memcg.shrink_node.do_try_to_free_pages

It looks like there is more LRU lock contention. That would be caused by this
patch placing the CMA memory in the movable zone. In this case, reclaim for
normal memory has to skip the LRU pages that sit in the movable zone, so it
needs more time to find enough pages to reclaim. That would increase the LRU
lock hold time and then cause the contention.
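
For what it's worth, one way to eyeball the extra contention by hand is to
profile the machine while fio is running and compare, between the two kernels,
how much time goes to the queued spinlock slow path under the reclaim/LRU
paths. Something like:

  # profile the whole machine for 30s while the fio job is running
  perf record -a -g -- sleep 30
  # then look for native_queued_spin_lock_slowpath under shrink_inactive_list
  # and pagevec_lru_move_fn in the reported call chains
  perf report --stdio --no-children

The perf-profile numbers above already point in that direction; this is only a
way to reproduce the observation directly.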

Could you collect one more stat for me, 'pgskip_XXX' in /proc/vmstat, to
confirm my theory?
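
Something like the following would be enough (assuming the kernel exposes the
per-zone pgskip_* counters in /proc/vmstat):

  # snapshot the per-zone skip counters around the fio run
  grep '^pgskip' /proc/vmstat > pgskip.before
  bin/lkp run job.yaml              # or re-run the fio workload directly
  grep '^pgskip' /proc/vmstat > pgskip.after
  diff pgskip.before pgskip.after

If the theory is right, the pgskip counters (pgskip_movable in particular)
should be much higher with the patch applied.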

BTW, the other memory management stats look better than before: less
nr_free_cma, fewer pgmajfault, and less reclaim scan/steal.

Thanks.
