Date:   Wed, 12 Aug 2020 17:15:32 +0800
From:   kernel test robot <rong.a.chen@...el.com>
To:     Johannes Weiner <hannes@...xchg.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Joonsoo Kim <js1304@...il.com>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Rik van Riel <riel@...riel.com>,
        Minchan Kim <minchan.kim@...il.com>,
        Michal Hocko <mhocko@...e.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org
Subject: [mm] 31d8fcac00: fio.read_iops 18.0% improvement

Greetings,

FYI, we noticed an 18.0% improvement of fio.read_iops due to commit:


commit: 31d8fcac00fcf4007f3921edc69ab4dcb3abcd4d ("mm: workingset: age nonresident information alongside anonymous pages")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: fio-basic
on test machine: 96 threads Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
with the following parameters:

	disk: 2pmem
	fs: xfs
	runtime: 200s
	nr_task: 50%
	time_based: tb
	rw: read
	bs: 4k
	ioengine: sync
	test_size: 200G
	cpufreq_governor: performance
	ucode: 0x5002f01

test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio





Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/time_based/ucode:
  4k/gcc-9/performance/2pmem/xfs/sync/x86_64-rhel-8.3/50%/debian-10.4-x86_64-20200603.cgz/200s/read/lkp-csl-2sp6/200G/fio-basic/tb/0x5002f01

commit: 
  2a8bef3217 ("doc: THP CoW fault no longer allocate THP")
  31d8fcac00 ("mm: workingset: age nonresident information alongside anonymous pages")

2a8bef321749219a 31d8fcac00fcf4007f3921edc69 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.04 ±  2%      -0.0        0.01        fio.latency_1000us%
      0.02 ±  4%      -0.0        0.01 ±  4%  fio.latency_100us%
      0.55 ±  5%      -0.2        0.39 ± 10%  fio.latency_10us%
      0.02 ±  9%      +0.0        0.03        fio.latency_20ms%
     48.67            +1.1       49.76        fio.latency_2us%
      0.69            +0.1        0.76        fio.latency_500us%
     15586           +18.0%      18390        fio.read_bw_MBps
      2440 ±  2%      -4.9%       2320        fio.read_clat_95%_us
     11139 ±  2%     -16.1%       9351        fio.read_clat_mean_us
    314765 ±  3%     -31.2%     216687        fio.read_clat_stddev
   3990106           +18.0%    4707899        fio.read_iops
 6.387e+09           +18.0%  7.537e+09        fio.time.file_system_inputs
    337062            +8.2%     364554        fio.time.involuntary_context_switches
    288.56 ±  3%     +12.5%     324.76        fio.time.user_time
     23290            +4.3%      24298 ±  4%  fio.time.voluntary_context_switches
 7.984e+08           +18.0%  9.421e+08        fio.workload
      2238            -1.4%       2207        boot-time.idle
      1.50 ±  3%     +12.3%       1.68        iostat.cpu.user
      0.08 ±  2%      +0.0        0.09 ±  2%  mpstat.cpu.all.soft%
      1.51 ±  3%      +0.2        1.70        mpstat.cpu.all.usr%
  15692892           +17.9%   18495671        vmstat.io.bi
      5439            +6.5%       5793        vmstat.system.cs
   3091110 ± 16%     -45.6%    1682800 ± 51%  sched_debug.cpu.nr_switches.max
    362965 ±  8%     -41.0%     214260 ± 46%  sched_debug.cpu.nr_switches.stddev
     52.21 ± 11%    +103.8%     106.42 ± 30%  sched_debug.cpu.nr_uninterruptible.max
 3.523e+08 ±  4%     +20.0%  4.228e+08        numa-numastat.node0.local_node
 3.523e+08 ±  4%     +20.0%  4.228e+08        numa-numastat.node0.numa_hit
 3.762e+08           +17.6%  4.423e+08        numa-numastat.node1.local_node
 3.762e+08           +17.6%  4.424e+08        numa-numastat.node1.numa_hit
    216.00 ± 11%    +137.5%     513.00 ± 52%  slabinfo.xfs_btree_cur.active_objs
    216.00 ± 11%    +137.5%     513.00 ± 52%  slabinfo.xfs_btree_cur.num_objs
    325.50 ± 10%     +41.9%     462.00 ± 22%  slabinfo.xfs_buf.active_objs
    325.50 ± 10%     +41.9%     462.00 ± 22%  slabinfo.xfs_buf.num_objs
    186.50 ± 14%     -25.2%     139.50 ±  6%  numa-vmstat.node0.nr_isolated_file
   1202194 ± 47%   +2113.1%   26605889 ± 93%  numa-vmstat.node1.nr_dirtied
   1202190 ± 47%   +2113.1%   26605886 ± 93%  numa-vmstat.node1.nr_written
 1.832e+08           +32.4%  2.425e+08 ± 10%  numa-vmstat.node1.numa_hit
 1.832e+08           +32.3%  2.424e+08 ± 10%  numa-vmstat.node1.numa_local
     16408 ± 17%     -39.8%       9876 ± 12%  softirqs.CPU16.SCHED
      8006 ±  9%     +25.7%      10061 ± 16%  softirqs.CPU22.RCU
     98861 ± 10%     -19.5%      79540 ±  4%  softirqs.CPU31.TIMER
     10175 ± 13%     +48.0%      15056 ± 19%  softirqs.CPU39.SCHED
      7196 ±  5%     +24.1%       8932 ± 12%  softirqs.CPU46.RCU
     11185 ± 23%     +26.3%      14131 ± 11%  softirqs.CPU5.SCHED
     14741 ± 30%     +59.5%      23505 ± 10%  softirqs.CPU64.SCHED
      7005 ±  3%     +50.6%      10548 ± 29%  softirqs.CPU73.RCU
      7179 ±  9%     +47.9%      10619 ± 24%  softirqs.CPU74.RCU
     15477 ± 18%     -20.3%      12335 ±  3%  softirqs.CPU85.SCHED
     18412 ± 18%     -18.3%      15050 ± 11%  softirqs.CPU86.SCHED
     17781 ± 11%     -34.3%      11677 ± 24%  softirqs.CPU87.SCHED
    212024 ±  2%     +17.7%     249450        proc-vmstat.allocstall_movable
      4332 ± 14%     +33.2%       5773 ± 16%  proc-vmstat.compact_daemon_wake
     44.00 ± 32%     +98.9%      87.50 ± 23%  proc-vmstat.compact_fail
     48.75 ± 34%     +84.6%      90.00 ± 22%  proc-vmstat.compact_stall
     10203 ±  2%     +18.7%      12107 ±  2%  proc-vmstat.kswapd_low_wmark_hit_quickly
    402.75 ±  6%     -26.3%     296.75 ±  5%  proc-vmstat.nr_isolated_file
  70317824 ±  4%     +10.6%   77758339        proc-vmstat.numa_foreign
 7.281e+08 ±  2%     +18.7%  8.645e+08        proc-vmstat.numa_hit
 7.281e+08 ±  2%     +18.7%  8.645e+08        proc-vmstat.numa_local
  70317824 ±  4%     +10.6%   77758339        proc-vmstat.numa_miss
  70349397 ±  4%     +10.6%   77789919        proc-vmstat.numa_other
     10212 ±  2%     +18.8%      12129 ±  2%  proc-vmstat.pageoutrun
  28389101           +17.9%   33470811        proc-vmstat.pgalloc_dma32
 7.718e+08           +18.0%  9.105e+08        proc-vmstat.pgalloc_normal
 7.902e+08           +18.2%  9.341e+08        proc-vmstat.pgfree
 3.193e+09           +18.0%  3.767e+09        proc-vmstat.pgpgin
      6685 ±  6%     +39.3%       9312 ±  4%  proc-vmstat.pgrotated
 5.487e+08 ±  2%     +17.6%  6.452e+08        proc-vmstat.pgscan_direct
 6.059e+08           +18.6%  7.189e+08        proc-vmstat.pgscan_file
  57228405 ±  4%     +28.8%   73738107 ±  5%  proc-vmstat.pgscan_kswapd
 5.487e+08 ±  2%     +17.6%  6.451e+08        proc-vmstat.pgsteal_direct
 6.059e+08           +18.6%  7.189e+08        proc-vmstat.pgsteal_file
  57226925 ±  4%     +28.8%   73735659 ±  5%  proc-vmstat.pgsteal_kswapd
     76.25 ±163%     -98.0%       1.50 ±173%  interrupts.93:PCI-MSI.31981626-edge.i40e-eth0-TxRx-57
    277.75 ± 26%     +32.5%     368.00 ±  4%  interrupts.CPU1.RES:Rescheduling_interrupts
      6257 ± 10%     +23.6%       7734        interrupts.CPU11.NMI:Non-maskable_interrupts
      6257 ± 10%     +23.6%       7734        interrupts.CPU11.PMI:Performance_monitoring_interrupts
    286.00 ± 11%     +35.0%     386.00 ±  3%  interrupts.CPU11.RES:Rescheduling_interrupts
    262.25 ± 22%     +47.7%     387.25 ± 15%  interrupts.CPU13.RES:Rescheduling_interrupts
    602.75 ±  8%     +15.1%     693.50 ±  6%  interrupts.CPU14.CAL:Function_call_interrupts
    264.50 ± 26%     +36.5%     361.00 ± 14%  interrupts.CPU14.RES:Rescheduling_interrupts
    650.25 ±  6%     +21.7%     791.50 ± 13%  interrupts.CPU15.CAL:Function_call_interrupts
    599.50 ±  8%     +17.0%     701.50 ±  3%  interrupts.CPU16.CAL:Function_call_interrupts
    230.75 ± 23%     +65.8%     382.50 ±  3%  interrupts.CPU16.RES:Rescheduling_interrupts
    644.25           +11.6%     719.00 ±  3%  interrupts.CPU17.CAL:Function_call_interrupts
    303.25 ±  8%     +26.3%     383.00 ± 11%  interrupts.CPU19.RES:Rescheduling_interrupts
    320.25 ±  9%     +21.5%     389.25 ± 12%  interrupts.CPU20.RES:Rescheduling_interrupts
    636.25 ±  5%     +13.1%     719.50 ±  2%  interrupts.CPU22.CAL:Function_call_interrupts
    297.25 ± 10%     +30.4%     387.75 ±  5%  interrupts.CPU22.RES:Rescheduling_interrupts
    657.25 ±  6%     +12.2%     737.50 ± 10%  interrupts.CPU23.CAL:Function_call_interrupts
      2619 ± 74%    +165.1%       6944 ± 10%  interrupts.CPU23.NMI:Non-maskable_interrupts
      2619 ± 74%    +165.1%       6944 ± 10%  interrupts.CPU23.PMI:Performance_monitoring_interrupts
    330.50 ±  9%     +27.5%     421.25 ±  6%  interrupts.CPU23.RES:Rescheduling_interrupts
    346.75 ±  5%     -22.9%     267.25 ± 18%  interrupts.CPU39.RES:Rescheduling_interrupts
    269.00 ± 23%     +26.3%     339.75 ±  6%  interrupts.CPU4.RES:Rescheduling_interrupts
     42.00 ± 23%     +90.5%      80.00 ± 25%  interrupts.CPU5.TLB:TLB_shootdowns
     75.75 ±164%     -98.0%       1.50 ±173%  interrupts.CPU57.93:PCI-MSI.31981626-edge.i40e-eth0-TxRx-57
    523.75 ±  8%     +16.0%     607.75 ±  7%  interrupts.CPU57.CAL:Function_call_interrupts
    131.75 ± 27%     +57.7%     207.75 ±  7%  interrupts.CPU57.RES:Rescheduling_interrupts
    240.25 ± 42%     -43.5%     135.75 ± 35%  interrupts.CPU64.RES:Rescheduling_interrupts
    611.75 ±  5%     +33.7%     818.00 ± 17%  interrupts.CPU73.CAL:Function_call_interrupts
    249.50 ± 17%     +38.6%     345.75 ±  8%  interrupts.CPU73.RES:Rescheduling_interrupts
    263.50 ± 24%     +26.0%     332.00 ±  4%  interrupts.CPU74.RES:Rescheduling_interrupts
    234.50 ± 20%     +41.4%     331.50 ± 11%  interrupts.CPU76.RES:Rescheduling_interrupts
    259.00 ± 18%     +46.4%     379.25 ±  4%  interrupts.CPU8.RES:Rescheduling_interrupts
    175.25 ± 27%     +69.5%     297.00 ± 14%  interrupts.CPU86.RES:Rescheduling_interrupts
    592.50 ±  7%     +24.6%     738.25 ± 14%  interrupts.CPU87.CAL:Function_call_interrupts
      4215 ± 28%     +75.7%       7406 ±  3%  interrupts.CPU87.NMI:Non-maskable_interrupts
      4215 ± 28%     +75.7%       7406 ±  3%  interrupts.CPU87.PMI:Performance_monitoring_interrupts
    179.00 ± 26%     +88.4%     337.25 ± 18%  interrupts.CPU87.RES:Rescheduling_interrupts
    601.25 ±  9%     +18.6%     713.25 ± 12%  interrupts.CPU90.CAL:Function_call_interrupts
      4802 ± 30%     +53.8%       7387 ±  7%  interrupts.CPU94.NMI:Non-maskable_interrupts
      4802 ± 30%     +53.8%       7387 ±  7%  interrupts.CPU94.PMI:Performance_monitoring_interrupts
     24537           +11.9%      27446        interrupts.RES:Rescheduling_interrupts
     21.09            -5.3%      19.97        perf-stat.i.MPKI
 1.064e+10           +13.8%  1.211e+10        perf-stat.i.branch-instructions
  29423674           +13.8%   33475999        perf-stat.i.branch-misses
     63.23            +4.3       67.56        perf-stat.i.cache-miss-rate%
 7.081e+08           +16.0%  8.216e+08        perf-stat.i.cache-misses
 1.115e+09            +8.7%  1.212e+09        perf-stat.i.cache-references
      5428            +6.7%       5792        perf-stat.i.context-switches
      2.61           -13.3%       2.26        perf-stat.i.cpi
    124.06            +3.3%     128.15        perf-stat.i.cpu-migrations
    201.52           -13.8%     173.78        perf-stat.i.cycles-between-cache-misses
   1639762 ± 17%     +31.5%    2156578 ± 11%  perf-stat.i.dTLB-load-misses
 1.287e+10           +14.3%  1.471e+10        perf-stat.i.dTLB-loads
   1005363 ±  4%     +22.4%    1230907 ±  5%  perf-stat.i.dTLB-store-misses
 7.217e+09           +17.7%  8.494e+09        perf-stat.i.dTLB-stores
     78.13            -1.5       76.66        perf-stat.i.iTLB-load-miss-rate%
  11916696 ±  2%      +4.8%   12488907 ±  2%  perf-stat.i.iTLB-load-misses
   3329500 ±  2%     +14.1%    3800235 ±  2%  perf-stat.i.iTLB-loads
 5.285e+10           +14.6%  6.059e+10        perf-stat.i.instructions
      4426            +9.3%       4839 ±  2%  perf-stat.i.instructions-per-iTLB-miss
      0.39           +14.8%       0.44        perf-stat.i.ipc
    334.33           +14.7%     383.33        perf-stat.i.metric.M/sec
     32.99 ±  3%      -4.4       28.56 ±  2%  perf-stat.i.node-load-miss-rate%
  41465839 ±  2%      -9.2%   37638337        perf-stat.i.node-load-misses
  85424362 ±  3%     +11.7%   95403803        perf-stat.i.node-loads
  25306603 ±  3%      +6.5%   26954862        perf-stat.i.node-store-misses
  93535434 ±  3%     +11.4%  1.042e+08        perf-stat.i.node-stores
     21.11            -5.2%      20.00        perf-stat.overall.MPKI
     63.48            +4.3       67.79        perf-stat.overall.cache-miss-rate%
      2.59           -13.0%       2.25        perf-stat.overall.cpi
    193.30           -14.0%     166.22        perf-stat.overall.cycles-between-cache-misses
     78.16            -1.5       76.67        perf-stat.overall.iTLB-load-miss-rate%
      4435            +9.4%       4854 ±  2%  perf-stat.overall.instructions-per-iTLB-miss
      0.39           +14.9%       0.44        perf-stat.overall.ipc
     32.70 ±  3%      -4.4       28.29 ±  2%  perf-stat.overall.node-load-miss-rate%
     13283            -2.8%      12912        perf-stat.overall.path-length
 1.059e+10           +13.8%  1.205e+10        perf-stat.ps.branch-instructions
  29275920           +13.8%   33306390        perf-stat.ps.branch-misses
 7.046e+08           +16.0%  8.176e+08        perf-stat.ps.cache-misses
  1.11e+09            +8.7%  1.206e+09        perf-stat.ps.cache-references
      5399            +6.7%       5762        perf-stat.ps.context-switches
    123.68            +3.2%     127.60        perf-stat.ps.cpu-migrations
   1632115 ± 17%     +31.5%    2146796 ± 11%  perf-stat.ps.dTLB-load-misses
 1.281e+10           +14.3%  1.464e+10        perf-stat.ps.dTLB-loads
   1000774 ±  4%     +22.4%    1225428 ±  5%  perf-stat.ps.dTLB-store-misses
 7.182e+09           +17.7%  8.452e+09        perf-stat.ps.dTLB-stores
  11857454 ±  2%      +4.8%   12426582 ±  2%  perf-stat.ps.iTLB-load-misses
   3312649 ±  2%     +14.1%    3780904 ±  2%  perf-stat.ps.iTLB-loads
 5.259e+10           +14.6%  6.029e+10        perf-stat.ps.instructions
  41262577 ±  2%      -9.2%   37449492        perf-stat.ps.node-load-misses
  85001993 ±  3%     +11.7%   94931612        perf-stat.ps.node-loads
  25183856 ±  3%      +6.5%   26811052        perf-stat.ps.node-store-misses
  93070707 ±  3%     +11.4%  1.037e+08        perf-stat.ps.node-stores
 1.061e+13           +14.7%  1.216e+13        perf-stat.total.instructions
     28.88 ± 12%      -7.2       21.69 ± 13%  perf-profile.calltrace.cycles-pp.shrink_lruvec.shrink_node.do_try_to_free_pages.try_to_free_pages.__alloc_pages_slowpath
     28.84 ± 12%      -7.2       21.66 ± 13%  perf-profile.calltrace.cycles-pp.shrink_inactive_list.shrink_lruvec.shrink_node.do_try_to_free_pages.try_to_free_pages
     29.18 ± 12%      -7.2       22.02 ± 13%  perf-profile.calltrace.cycles-pp.do_try_to_free_pages.try_to_free_pages.__alloc_pages_slowpath.__alloc_pages_nodemask.page_cache_readahead_unbounded
     29.18 ± 12%      -7.2       22.02 ± 13%  perf-profile.calltrace.cycles-pp.shrink_node.do_try_to_free_pages.try_to_free_pages.__alloc_pages_slowpath.__alloc_pages_nodemask
     29.20 ± 12%      -7.2       22.04 ± 13%  perf-profile.calltrace.cycles-pp.try_to_free_pages.__alloc_pages_slowpath.__alloc_pages_nodemask.page_cache_readahead_unbounded.generic_file_buffered_read
     34.34 ± 12%      -7.1       27.23 ± 13%  perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.page_cache_readahead_unbounded.generic_file_buffered_read.xfs_file_buffered_aio_read.xfs_file_read_iter
     29.56 ± 12%      -7.1       22.45 ± 13%  perf-profile.calltrace.cycles-pp.__alloc_pages_slowpath.__alloc_pages_nodemask.page_cache_readahead_unbounded.generic_file_buffered_read.xfs_file_buffered_aio_read
      9.88 ± 14%      -4.5        5.36 ± 13%  perf-profile.calltrace.cycles-pp.__remove_mapping.shrink_page_list.shrink_inactive_list.shrink_lruvec.shrink_node
     10.80 ± 12%      -4.1        6.70 ± 13%  perf-profile.calltrace.cycles-pp.shrink_page_list.shrink_inactive_list.shrink_lruvec.shrink_node.do_try_to_free_pages
     17.53 ± 11%      -3.2       14.38 ± 15%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.shrink_inactive_list.shrink_lruvec.shrink_node
     17.09 ± 11%      -3.0       14.13 ± 13%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.shrink_inactive_list.shrink_lruvec.shrink_node.do_try_to_free_pages
      5.26 ± 16%      -2.2        3.08 ± 15%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__remove_mapping.shrink_page_list.shrink_inactive_list.shrink_lruvec
      5.01 ± 16%      -2.2        2.86 ± 15%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.__remove_mapping.shrink_page_list.shrink_inactive_list
      2.76 ± 13%      -1.7        1.03 ± 11%  perf-profile.calltrace.cycles-pp.workingset_eviction.__remove_mapping.shrink_page_list.shrink_inactive_list.shrink_lruvec
      1.14 ± 11%      -0.3        0.89 ± 12%  perf-profile.calltrace.cycles-pp.__delete_from_page_cache.__remove_mapping.shrink_page_list.shrink_inactive_list.shrink_lruvec
      0.85 ± 11%      +0.2        1.05 ±  3%  perf-profile.calltrace.cycles-pp.mem_cgroup_charge.__add_to_page_cache_locked.add_to_page_cache_lru.page_cache_readahead_unbounded.generic_file_buffered_read
      0.26 ±100%      +0.3        0.59 ± 13%  perf-profile.calltrace.cycles-pp.free_unref_page_list.shrink_page_list.shrink_inactive_list.shrink_lruvec.shrink_node
     30.25 ± 11%      -7.4       22.81 ± 13%  perf-profile.children.cycles-pp.shrink_inactive_list
     30.27 ± 11%      -7.4       22.84 ± 13%  perf-profile.children.cycles-pp.shrink_lruvec
     30.60 ± 11%      -7.4       23.19 ± 13%  perf-profile.children.cycles-pp.shrink_node
     29.34 ± 11%      -7.2       22.12 ± 13%  perf-profile.children.cycles-pp.try_to_free_pages
     29.33 ± 11%      -7.2       22.11 ± 13%  perf-profile.children.cycles-pp.do_try_to_free_pages
     29.71 ± 11%      -7.2       22.54 ± 13%  perf-profile.children.cycles-pp.__alloc_pages_slowpath
     34.55 ± 12%      -7.2       27.39 ± 13%  perf-profile.children.cycles-pp.__alloc_pages_nodemask
     37.91 ±  8%      -5.9       32.02 ± 12%  perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     10.41 ± 11%      -4.4        6.00 ± 13%  perf-profile.children.cycles-pp.__remove_mapping
     11.50 ± 11%      -4.3        7.25 ± 13%  perf-profile.children.cycles-pp.shrink_page_list
     18.06 ± 11%      -3.1       14.99 ± 13%  perf-profile.children.cycles-pp._raw_spin_lock_irq
     16.47 ±  5%      -2.8       13.69 ±  9%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
      2.91 ± 13%      -1.8        1.09 ± 11%  perf-profile.children.cycles-pp.workingset_eviction
      0.82 ± 11%      -0.3        0.51 ± 11%  perf-profile.children.cycles-pp.unaccount_page_cache_page
      1.50 ± 10%      -0.2        1.25 ± 12%  perf-profile.children.cycles-pp.__delete_from_page_cache
      0.33 ± 14%      -0.1        0.20 ±  9%  perf-profile.children.cycles-pp.__isolate_lru_page
      0.17 ±  8%      +0.0        0.20 ±  6%  perf-profile.children.cycles-pp.shrink_slab
      0.18 ±  6%      +0.0        0.21 ±  6%  perf-profile.children.cycles-pp.page_counter_try_charge
      0.01 ±173%      +0.0        0.06 ±  9%  perf-profile.children.cycles-pp.kmem_cache_free
      0.16 ± 13%      +0.0        0.21 ±  3%  perf-profile.children.cycles-pp.get_mem_cgroup_from_mm
      0.22 ± 12%      +0.1        0.28 ±  4%  perf-profile.children.cycles-pp.mem_cgroup_charge_statistics
      0.33 ±  8%      +0.1        0.39 ±  6%  perf-profile.children.cycles-pp.try_charge
      0.27 ± 14%      +0.1        0.33 ±  3%  perf-profile.children.cycles-pp.__count_memcg_events
      0.23 ±  8%      +0.1        0.31 ±  7%  perf-profile.children.cycles-pp.xa_load
      0.44 ± 17%      +0.1        0.55 ±  3%  perf-profile.children.cycles-pp.__mod_memcg_state
      0.60 ±  7%      +0.1        0.73 ±  9%  perf-profile.children.cycles-pp.xas_load
      0.85 ± 11%      +0.2        1.06 ±  3%  perf-profile.children.cycles-pp.mem_cgroup_charge
      0.00            +0.5        0.45 ± 11%  perf-profile.children.cycles-pp.workingset_age_nonresident
     37.91 ±  8%      -5.9       32.02 ± 12%  perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      0.53 ± 11%      -0.3        0.21 ± 13%  perf-profile.self.cycles-pp.unaccount_page_cache_page
      0.33 ± 14%      -0.1        0.20 ±  9%  perf-profile.self.cycles-pp.__isolate_lru_page
      0.39 ± 11%      -0.1        0.32 ± 13%  perf-profile.self.cycles-pp.__remove_mapping
      0.14 ± 14%      -0.0        0.10 ±  8%  perf-profile.self.cycles-pp.isolate_lru_pages
      0.16 ±  6%      +0.0        0.18 ±  5%  perf-profile.self.cycles-pp.page_counter_try_charge
      0.12 ± 12%      +0.0        0.15 ±  2%  perf-profile.self.cycles-pp.mem_cgroup_charge
      0.16 ± 13%      +0.0        0.21 ±  3%  perf-profile.self.cycles-pp.get_mem_cgroup_from_mm
      0.27 ± 14%      +0.1        0.33 ±  3%  perf-profile.self.cycles-pp.__count_memcg_events
      0.44 ± 17%      +0.1        0.54 ±  4%  perf-profile.self.cycles-pp.__mod_memcg_state
      0.51 ±  8%      +0.1        0.62 ±  9%  perf-profile.self.cycles-pp.xas_load
      0.45 ± 12%      +0.2        0.64 ± 12%  perf-profile.self.cycles-pp.workingset_eviction
      0.00            +0.5        0.45 ± 11%  perf-profile.self.cycles-pp.workingset_age_nonresident
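
For readers checking the comparison table by hand, the middle column can be
reproduced from the two means on each row. The sketch below assumes the
convention the table appears to follow: a relative change (with a trailing %)
for ordinary metrics, and a plain difference in percentage points for metrics
whose name already ends in "%". The function name is hypothetical, and since
the table shows rounded means, a recomputed value may differ in the last digit.

```python
def change_column(base: float, new: float, is_percent_metric: bool) -> str:
    """Format the middle column of one table row (assumed convention)."""
    if is_percent_metric:
        # Metric is already a percentage: report the difference in points.
        return f"{new - base:+.1f}"
    # Ordinary metric: report the change relative to the base commit.
    return f"{(new - base) / base * 100:+.1f}%"


# Values copied from the table above.
print(change_column(3990106, 4707899, False))  # fio.read_iops
print(change_column(11139, 9351, False))       # fio.read_clat_mean_us
print(change_column(48.67, 49.76, True))       # fio.latency_2us%
```
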


                                                                                
                                     fio.read_iops                              
                                                                                
    5e+06 +-----------------------------------------------------------------+   
          |                                                                 |   
  4.8e+06 |-O            O             O    O    O    O O       O           |   
          |      O    O    O    O    O    O    O    O      O O    O         |   
  4.6e+06 |-+  O    O         O   O                                  O      |   
          |        .+                                                       |   
  4.4e+06 |.+..+.+.  +                                                      |   
          |           +                                                     |   
  4.2e+06 |-+          +                                                    |   
          |             +            +.                          .+..      .|   
    4e+06 |-+            +.+..     ..  +..+.+..  +..           .+    +.  .+ |   
          |                   +.+.+             +    .+.+..+.+.        +.   |   
  3.8e+06 |-+                                  +    +                       |   
          |                                                                 |   
  3.6e+06 +-----------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample
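
The fio.read_iops values plotted above can be cross-checked against
fio.read_bw_MBps in the table, because every read in this job uses a 4k block
size. The sketch below assumes that "4k" means 4096 bytes and that "MBps" is
counted in units of 2^20 bytes; neither is stated in the report, but the table
values are consistent with both. The helper name is hypothetical.

```python
def iops_to_mibps(iops: float, block_size_bytes: int = 4096) -> float:
    """Convert an I/O rate to bandwidth, assuming fixed-size requests."""
    return iops * block_size_bytes / (1024 * 1024)


# fio.read_iops for the base commit and for 31d8fcac00, from the table.
base_bw = iops_to_mibps(3990106)
new_bw = iops_to_mibps(4707899)

# Compare with fio.read_bw_MBps: 15586 and 18390.
print(int(base_bw), int(new_bw))
```

Because bandwidth is IOPS scaled by a constant here, both metrics show the same
+18.0% change.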



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Rong Chen


View attachment "config-5.8.0-rc2-00128-g31d8fcac00fcf" of type "text/plain" (158303 bytes)

View attachment "job-script" of type "text/plain" (8097 bytes)

View attachment "job.yaml" of type "text/plain" (5642 bytes)

View attachment "reproduce" of type "text/plain" (932 bytes)
