Message-ID: <20210302062521.GB23892@xsang-OptiPlex-9020>
Date:   Tue, 2 Mar 2021 14:25:21 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Tim Chen <tim.c.chen@...ux.intel.com>
Cc:     0day robot <lkp@...el.com>, Ying Huang <ying.huang@...el.com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        feng.tang@...el.com, zhengjun.xing@...el.com,
        Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Michal Hocko <mhocko@...e.cz>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Dave Hansen <dave.hansen@...el.com>, linux-mm@...ck.org,
        cgroups@...r.kernel.org
Subject: [mm]  4f09feb8bf:  vm-scalability.throughput -4.3% regression


Greetings,

FYI, we noticed a -4.3% regression of vm-scalability.throughput due to commit:


commit: 4f09feb8bf083be3834080ddf3782aee12a7c3f7 ("mm: Force update of mem cgroup soft limit tree on usage excess")
url: https://github.com/0day-ci/linux/commits/Tim-Chen/Soft-limit-memory-management-bug-fixes/20210218-054228


in testcase: vm-scalability
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with the following parameters:

	runtime: 300s
	test: lru-file-readonce
	cpufreq_governor: performance
	ucode: 0x5003006

test-description: The motivation behind this suite is to exercise functions and regions of the mm/ subsystem of the Linux kernel which are of interest to us.
test-url: https://git.kernel.org/cgit/linux/kernel/git/wfg/vm-scalability.git/
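
As a rough illustration of why a change like this can add cost on the page-charge
path, the snippet below is a minimal userspace model of the behavior the commit
title describes: besides the usual event-ratelimited update, the soft limit tree
is also refreshed whenever a cgroup's usage already exceeds its soft limit. This
is NOT the actual patch; every identifier in the model is invented for
illustration, and the real logic lives in mm/memcontrol.c. The extra per-charge
work it models is consistent with the memcg_check_events / mem_cgroup_charge
increases visible in the perf-profile section further down.

/*
 * Minimal userspace model of the change described by the commit title,
 * NOT the actual kernel patch.  All identifiers below are invented for
 * illustration; the real code lives in mm/memcontrol.c.
 */
#include <stdbool.h>
#include <stdio.h>

#define SOFTLIMIT_EVENTS_TARGET 1024	/* modeled ratelimit threshold */

struct model_memcg {
	unsigned long usage;		/* pages currently charged */
	unsigned long soft_limit;	/* soft limit, in pages */
	unsigned long events;		/* charges since the last tree update */
	unsigned long tree_updates;	/* soft limit tree updates performed */
};

/* Old behavior (modeled): update the tree only every Nth charge event. */
static bool should_update_tree_old(struct model_memcg *m)
{
	return ++m->events % SOFTLIMIT_EVENTS_TARGET == 0;
}

/*
 * New behavior (modeled): additionally force an update whenever usage
 * already exceeds the soft limit, so an over-limit memcg cannot stay
 * out of the tree until the next event threshold.
 */
static bool should_update_tree_new(struct model_memcg *m)
{
	if (m->usage > m->soft_limit)
		return true;
	return should_update_tree_old(m);
}

static void charge_page(struct model_memcg *m,
			bool (*check)(struct model_memcg *))
{
	m->usage++;
	if (check(m))	/* stands in for the memcg_check_events() work */
		m->tree_updates++;
}

int main(void)
{
	struct model_memcg base    = { .soft_limit = 1000 };
	struct model_memcg patched = { .soft_limit = 1000 };

	for (int i = 0; i < 1000000; i++) {
		charge_page(&base, should_update_tree_old);
		charge_page(&patched, should_update_tree_new);
	}

	/* Once usage exceeds the soft limit, the patched variant does the
	 * update work on every single charge, which is the kind of extra
	 * per-charge cost the profile data below points at. */
	printf("base: %lu tree updates, patched: %lu tree updates\n",
	       base.tree_updates, patched.tree_updates);
	return 0;
}

In this toy model the base variant performs roughly 976 tree updates over a
million charges while the patched variant performs about 999,000 once usage
stays above the soft limit; each update is cheap, but doing it on every charge
is one plausible mechanism for the throughput drop reported above.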



If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>


Details are as follows:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install                job.yaml  # job file is attached in this email
        bin/lkp split-job --compatible job.yaml
        bin/lkp run                    compatible-job.yaml
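
(Roughly: the install step sets up the packages the attached job file needs,
split-job derives the compatible-job.yaml consumed by the final step, and run
executes that job on the local machine; see the lkp-tests documentation for
details.)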

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/runtime/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/debian-10.4-x86_64-20200603.cgz/300s/lkp-csl-2ap4/lru-file-readonce/vm-scalability/0x5003006

commit: 
  f0812bba8b ("mm: Fix dropped memcg from mem cgroup soft limit tree")
  4f09feb8bf ("mm: Force update of mem cgroup soft limit tree on usage excess")

f0812bba8bbd02bf 4f09feb8bf083be3834080ddf37 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    131114            -4.2%     125626        vm-scalability.median
  25345052            -4.3%   24265993        vm-scalability.throughput
    224.30            +3.6%     232.43        vm-scalability.time.elapsed_time
    224.30            +3.6%     232.43        vm-scalability.time.elapsed_time.max
     31611            +4.5%      33046        vm-scalability.time.system_time
      5225            -2.5%       5095        vmstat.system.cs
     95143 ±  4%     +20.7%     114842 ± 15%  meminfo.Active
     94109 ±  4%     +20.9%     113806 ± 15%  meminfo.Active(anon)
      7347 ±  8%     +17.9%       8665 ± 11%  softirqs.CPU73.RCU
      6628 ±  7%     +21.3%       8040 ± 11%  softirqs.CPU77.RCU
      7378 ±  7%     +35.9%      10027 ± 20%  softirqs.CPU90.RCU
      6976 ±  4%     +31.5%       9173 ± 25%  softirqs.CPU95.RCU
      8983 ± 24%     +19.5%      10737 ±  4%  softirqs.CPU96.SCHED
    220617 ± 33%     -68.4%      69618 ±106%  numa-meminfo.node0.Inactive(anon)
      4442 ± 24%     -31.2%       3055 ± 18%  numa-meminfo.node0.PageTables
      2964 ± 45%    +153.1%       7504 ± 23%  numa-meminfo.node1.Active
      2620 ± 39%    +179.8%       7332 ± 25%  numa-meminfo.node1.Active(anon)
     37702 ±130%    +248.6%     131446 ± 60%  numa-meminfo.node1.AnonPages
     39649 ±131%    +368.8%     185879 ± 43%  numa-meminfo.node1.Inactive(anon)
      3078 ± 11%     +50.9%       4647 ± 21%  numa-meminfo.node1.PageTables
      4610 ± 59%   +1241.2%      61840 ± 78%  numa-meminfo.node1.Shmem
     23809 ±  4%     +20.6%      28704 ± 15%  proc-vmstat.nr_active_anon
    939.33 ±  4%      -8.1%     863.33 ±  3%  proc-vmstat.nr_isolated_file
     56262            +9.7%      61735 ±  8%  proc-vmstat.nr_shmem
     23811 ±  4%     +20.6%      28705 ± 15%  proc-vmstat.nr_zone_active_anon
     75883 ±  2%     +23.1%      93446 ± 19%  proc-vmstat.pgactivate
   1038398            +5.0%    1089871 ±  2%  proc-vmstat.pgfault
     65866            +3.5%      68179        proc-vmstat.pgreuse
  18338900 ±  6%      -8.5%   16783260 ±  4%  proc-vmstat.slabs_scanned
      1216 ±  9%      -8.2%       1116 ±  3%  interrupts.CPU145.CAL:Function_call_interrupts
     27.33 ± 16%    +195.1%      80.67 ± 72%  interrupts.CPU172.RES:Rescheduling_interrupts
    162.00 ±  4%     +21.0%     196.00 ± 17%  interrupts.CPU33.RES:Rescheduling_interrupts
    163.83 ±  3%     +15.8%     189.67 ± 14%  interrupts.CPU34.RES:Rescheduling_interrupts
    129.83 ±  2%     +19.8%     155.50 ±  9%  interrupts.CPU56.RES:Rescheduling_interrupts
     97.50 ± 10%     +40.0%     136.50 ± 20%  interrupts.CPU65.RES:Rescheduling_interrupts
    261.17 ± 44%     -44.2%     145.67 ± 18%  interrupts.CPU73.RES:Rescheduling_interrupts
     49.17 ± 48%    +133.9%     115.00 ± 36%  interrupts.CPU85.RES:Rescheduling_interrupts
     41.83 ± 24%    +144.2%     102.17 ± 52%  interrupts.CPU88.RES:Rescheduling_interrupts
     48.17 ± 38%    +215.2%     151.83 ± 54%  interrupts.CPU89.RES:Rescheduling_interrupts
     38.17 ± 15%    +106.1%      78.67 ± 39%  interrupts.CPU90.RES:Rescheduling_interrupts
     55160 ± 33%     -68.5%      17396 ±106%  numa-vmstat.node0.nr_inactive_anon
      3614 ±  8%     -17.7%       2974 ± 11%  numa-vmstat.node0.nr_mapped
      1105 ± 24%     -31.0%     762.67 ± 17%  numa-vmstat.node0.nr_page_table_pages
     55168 ± 33%     -68.5%      17402 ±106%  numa-vmstat.node0.nr_zone_inactive_anon
    663.00 ± 39%    +179.7%       1854 ± 25%  numa-vmstat.node1.nr_active_anon
      9426 ±130%    +248.2%      32821 ± 60%  numa-vmstat.node1.nr_anon_pages
      9914 ±131%    +368.5%      46447 ± 43%  numa-vmstat.node1.nr_inactive_anon
    764.00 ± 11%     +51.7%       1159 ± 20%  numa-vmstat.node1.nr_page_table_pages
      1162 ± 58%   +1233.7%      15500 ± 77%  numa-vmstat.node1.nr_shmem
    663.17 ± 39%    +179.6%       1854 ± 25%  numa-vmstat.node1.nr_zone_active_anon
      9920 ±131%    +368.3%      46454 ± 43%  numa-vmstat.node1.nr_zone_inactive_anon
      9.08 ±  2%     +22.4%      11.12 ± 12%  perf-stat.i.MPKI
 1.303e+10            -4.7%  1.242e+10        perf-stat.i.branch-instructions
      0.39            +0.1        0.50 ± 28%  perf-stat.i.branch-miss-rate%
 1.873e+08            +8.2%  2.027e+08        perf-stat.i.cache-misses
 5.924e+08            +7.4%  6.365e+08        perf-stat.i.cache-references
      5087            -2.6%       4957        perf-stat.i.context-switches
      6.02            +5.2%       6.33        perf-stat.i.cpi
      2001            -6.8%       1865        perf-stat.i.cycles-between-cache-misses
      0.05 ± 19%      +0.0        0.07 ± 24%  perf-stat.i.dTLB-load-miss-rate%
 1.544e+10            -4.2%  1.479e+10        perf-stat.i.dTLB-loads
 5.401e+09            -2.9%  5.247e+09        perf-stat.i.dTLB-stores
  24692490            +6.3%   26255880        perf-stat.i.iTLB-load-misses
 5.933e+10            -4.4%   5.67e+10        perf-stat.i.instructions
      1.89 ±  3%      -9.9%       1.70 ±  5%  perf-stat.i.major-faults
    179.72            -4.0%     172.60        perf-stat.i.metric.M/sec
  21954317            +7.3%   23554489        perf-stat.i.node-load-misses
  12269278            -3.4%   11855032        perf-stat.i.node-store-misses
  14458012            -3.3%   13976704        perf-stat.i.node-stores
     10.01           +12.0%      11.21        perf-stat.overall.MPKI
      0.32            +0.0        0.34 ±  4%  perf-stat.overall.branch-miss-rate%
      7.28            +5.3%       7.67        perf-stat.overall.cpi
      2307            -7.0%       2146        perf-stat.overall.cycles-between-cache-misses
      2408           -10.0%       2167        perf-stat.overall.instructions-per-iTLB-miss
      0.14            -5.0%       0.13        perf-stat.overall.ipc
     36.46            +2.1       38.51        perf-stat.overall.node-load-miss-rate%
 1.346e+10            -4.6%  1.285e+10        perf-stat.ps.branch-instructions
 1.936e+08            +8.2%  2.095e+08        perf-stat.ps.cache-misses
  6.14e+08            +7.2%  6.579e+08        perf-stat.ps.cache-references
      5172            -2.6%       5038        perf-stat.ps.context-switches
 1.597e+10            -4.1%  1.531e+10        perf-stat.ps.dTLB-loads
 5.575e+09            -2.9%  5.411e+09        perf-stat.ps.dTLB-stores
  25461461            +6.3%   27074171        perf-stat.ps.iTLB-load-misses
 6.131e+10            -4.3%  5.867e+10        perf-stat.ps.instructions
      1.76 ±  2%      -5.0%       1.68        perf-stat.ps.major-faults
  22688997            +7.3%   24343963        perf-stat.ps.node-load-misses
  12711574            -3.5%   12262917        perf-stat.ps.node-store-misses
  14988174            -3.4%   14480039        perf-stat.ps.node-stores
      0.01 ± 11%     +87.2%       0.01 ± 18%  perf-sched.sch_delay.avg.ms.__sched_text_start.__sched_text_start.smpboot_thread_fn.kthread.ret_from_fork
      0.00 ±114%    +353.6%       0.02 ± 47%  perf-sched.sch_delay.max.ms.__sched_text_start.__sched_text_start.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe
      0.02 ± 17%    +338.9%       0.08 ± 62%  perf-sched.sch_delay.max.ms.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop
      0.02 ± 49%  +3.1e+05%      76.43 ±185%  perf-sched.sch_delay.max.ms.__sched_text_start.__sched_text_start.schedule_timeout.kcompactd.kthread
      0.68 ±212%    +983.9%       7.35 ± 37%  perf-sched.sch_delay.max.ms.__sched_text_start.__sched_text_start.smpboot_thread_fn.kthread.ret_from_fork
      3.16 ± 75%   +8842.7%     282.23 ±218%  perf-sched.sch_delay.max.ms.__sched_text_start.__sched_text_start.worker_thread.kthread.ret_from_fork
    197.84 ±  7%     -23.7%     150.97 ± 18%  perf-sched.total_wait_and_delay.average.ms
     13780 ±  6%     +75.6%      24203 ± 27%  perf-sched.total_wait_and_delay.count.ms
    197.81 ±  7%     -23.7%     150.88 ± 18%  perf-sched.total_wait_time.average.ms
      2.31 ±100%    +286.1%       8.90 ± 38%  perf-sched.wait_and_delay.avg.ms.__sched_text_start.__sched_text_start.devkmsg_read.vfs_read.ksys_read
      2.32 ±100%    +284.4%       8.91 ± 38%  perf-sched.wait_and_delay.avg.ms.__sched_text_start.__sched_text_start.do_syslog.part.0
    215.21 ±  4%     -62.0%      81.72 ± 21%  perf-sched.wait_and_delay.avg.ms.__sched_text_start.__sched_text_start.do_task_dead.do_exit.do_group_exit
      0.91 ±  4%    +145.7%       2.25 ±  5%  perf-sched.wait_and_delay.avg.ms.__sched_text_start.__sched_text_start.do_wait.kernel_wait4.__do_sys_wait4
    344.42           -92.3%      26.39 ± 33%  perf-sched.wait_and_delay.avg.ms.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop
    478.60           -36.5%     304.11 ± 31%  perf-sched.wait_and_delay.avg.ms.__sched_text_start.__sched_text_start.schedule_timeout.kcompactd.kthread
    655.59 ±  3%      +9.7%     719.00 ±  2%  perf-sched.wait_and_delay.avg.ms.__sched_text_start.__sched_text_start.smpboot_thread_fn.kthread.ret_from_fork
    167.67 ±124%    +426.7%     883.17 ± 12%  perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.devkmsg_read.vfs_read.ksys_read
     11.17 ±  3%     +11.9%      12.50 ±  4%  perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.do_nanosleep.hrtimer_nanosleep.__x64_sys_nanosleep
    167.50 ±124%    +427.1%     882.83 ± 12%  perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.do_syslog.part.0
    303.50          +169.0%     816.50 ± 14%  perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.do_task_dead.do_exit.do_group_exit
    302.17          +128.4%     690.17        perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.do_wait.kernel_wait4.__do_sys_wait4
    323.33 ± 28%     -87.0%      42.00 ±141%  perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.truncate_inode_pages_range
    185.67 ±115%    +385.9%     902.17 ± 12%  perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait
     26.17         +2201.9%     602.33        perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop
     79.33           +71.2%     135.83 ± 32%  perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.schedule_timeout.kcompactd.kthread
    974.50 ± 19%    +390.9%       4783 ± 70%  perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.worker_thread.kthread.ret_from_fork
      5.23 ±104%  +63694.8%       3339 ± 66%  perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.devkmsg_read.vfs_read.ksys_read
      5.23 ±104%  +63704.2%       3339 ± 66%  perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.do_syslog.part.0
     21.62 ±  3%   +4530.4%       1001        perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.do_wait.kernel_wait4.__do_sys_wait4
      1019          +227.9%       3342 ± 66%  perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.pipe_read.new_sync_read.vfs_read
    333.34 ± 70%    +963.4%       3544 ± 61%  perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait
      1000          +365.0%       4654 ± 71%  perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop
      5.03         +4104.3%     211.64 ± 41%  perf-sched.wait_and_delay.max.ms.__sched_text_start.__sched_text_start.schedule_timeout.rcu_gp_kthread.kthread
      3.46 ± 44%    +157.3%       8.90 ± 38%  perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.devkmsg_read.vfs_read.ksys_read
      3.47 ± 44%    +156.2%       8.90 ± 38%  perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.do_syslog.part.0
    215.20 ±  4%     -62.0%      81.71 ± 21%  perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.do_task_dead.do_exit.do_group_exit
      0.91 ±  4%    +146.2%       2.24 ±  5%  perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.do_wait.kernel_wait4.__do_sys_wait4
      0.01 ±149%   +1185.7%       0.09 ± 53%  perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe
      0.51 ± 20%    +111.1%       1.08 ± 21%  perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.truncate_inode_pages_range
      0.01 ±146%  +9.6e+05%      86.17 ± 70%  perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.ww_mutex_lock
      1.73 ± 16%     +95.0%       3.37 ± 29%  perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.rcu_gp_kthread.kthread.ret_from_fork
    344.42           -92.3%      26.38 ± 33%  perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop
    478.59           -36.6%     303.45 ± 31%  perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.schedule_timeout.kcompactd.kthread
    655.58 ±  3%      +9.7%     718.99 ±  2%  perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.smpboot_thread_fn.kthread.ret_from_fork
      7.49 ± 52%  +44482.5%       3339 ± 66%  perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.devkmsg_read.vfs_read.ksys_read
      7.49 ± 52%  +44481.2%       3339 ± 66%  perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.do_syslog.part.0
     21.62 ±  3%   +4531.3%       1001        perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.do_wait.kernel_wait4.__do_sys_wait4
      0.02 ±145%  +12809.8%       1.98 ±134%  perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.exit_to_user_mode_prepare.syscall_exit_to_user_mode.entry_SYSCALL_64_after_hwframe
      1019          +227.9%       3342 ± 66%  perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.pipe_read.new_sync_read.vfs_read
      0.02 ±142%  +2.9e+06%     436.52 ± 67%  perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.ww_mutex_lock
    413.83 ± 43%    +756.5%       3544 ± 61%  perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.ep_poll.do_epoll_wait
      1000          +365.1%       4654 ± 71%  perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.schedule_hrtimeout_range_clock.poll_schedule_timeout.constprop
      5.02         +4118.6%     211.60 ± 41%  perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.schedule_timeout.rcu_gp_kthread.kthread
     30.68 ±  3%      -2.6       28.08 ±  2%  perf-profile.calltrace.cycles-pp.lock_page_lruvec_irqsave.__pagevec_lru_add.lru_cache_add.add_to_page_cache_lru.page_cache_ra_unbounded
     30.67 ±  3%      -2.6       28.07 ±  2%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.lock_page_lruvec_irqsave.__pagevec_lru_add.lru_cache_add.add_to_page_cache_lru
     30.63 ±  3%      -2.6       28.03 ±  2%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.lock_page_lruvec_irqsave.__pagevec_lru_add.lru_cache_add
     31.63 ±  3%      -2.6       29.07 ±  2%  perf-profile.calltrace.cycles-pp.__pagevec_lru_add.lru_cache_add.add_to_page_cache_lru.page_cache_ra_unbounded.generic_file_buffered_read_get_pages
     31.68 ±  3%      -2.6       29.11 ±  2%  perf-profile.calltrace.cycles-pp.lru_cache_add.add_to_page_cache_lru.page_cache_ra_unbounded.generic_file_buffered_read_get_pages.generic_file_buffered_read
     49.97 ±  2%      -2.6       47.42        perf-profile.calltrace.cycles-pp.__alloc_pages_nodemask.page_cache_ra_unbounded.generic_file_buffered_read_get_pages.generic_file_buffered_read.xfs_file_buffered_aio_read
      5.62 ±  5%      -0.5        5.10 ±  3%  perf-profile.calltrace.cycles-pp.read_pages.page_cache_ra_unbounded.generic_file_buffered_read_get_pages.generic_file_buffered_read.xfs_file_buffered_aio_read
      5.57 ±  5%      -0.5        5.05 ±  3%  perf-profile.calltrace.cycles-pp.iomap_readahead_actor.iomap_apply.iomap_readahead.read_pages.page_cache_ra_unbounded
      5.61 ±  5%      -0.5        5.09 ±  3%  perf-profile.calltrace.cycles-pp.iomap_readahead.read_pages.page_cache_ra_unbounded.generic_file_buffered_read_get_pages.generic_file_buffered_read
      5.60 ±  5%      -0.5        5.09 ±  3%  perf-profile.calltrace.cycles-pp.iomap_apply.iomap_readahead.read_pages.page_cache_ra_unbounded.generic_file_buffered_read_get_pages
      5.34 ±  5%      -0.5        4.83 ±  3%  perf-profile.calltrace.cycles-pp.iomap_readpage_actor.iomap_readahead_actor.iomap_apply.iomap_readahead.read_pages
      0.70 ± 14%      +0.3        0.95 ±  9%  perf-profile.calltrace.cycles-pp.try_charge.mem_cgroup_charge.__add_to_page_cache_locked.add_to_page_cache_lru.page_cache_ra_unbounded
      0.38 ± 71%      +0.3        0.70 ±  9%  perf-profile.calltrace.cycles-pp.page_counter_try_charge.try_charge.mem_cgroup_charge.__add_to_page_cache_locked.add_to_page_cache_lru
      0.00            +3.5        3.52 ±  9%  perf-profile.calltrace.cycles-pp.memcg_check_events.mem_cgroup_charge.__add_to_page_cache_locked.add_to_page_cache_lru.page_cache_ra_unbounded
     38.12 ±  2%      +3.8       41.90 ±  2%  perf-profile.calltrace.cycles-pp.add_to_page_cache_lru.page_cache_ra_unbounded.generic_file_buffered_read_get_pages.generic_file_buffered_read.xfs_file_buffered_aio_read
      5.32 ± 19%      +6.3       11.60 ±  6%  perf-profile.calltrace.cycles-pp.mem_cgroup_charge.__add_to_page_cache_locked.add_to_page_cache_lru.page_cache_ra_unbounded.generic_file_buffered_read_get_pages
      6.41 ± 17%      +6.3       12.76 ±  5%  perf-profile.calltrace.cycles-pp.__add_to_page_cache_locked.add_to_page_cache_lru.page_cache_ra_unbounded.generic_file_buffered_read_get_pages.generic_file_buffered_read
     76.16 ±  2%      -5.1       71.05        perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     38.16 ±  5%      -3.5       34.66 ±  3%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
     50.11 ±  2%      -2.6       47.51        perf-profile.children.cycles-pp.__alloc_pages_nodemask
     30.76 ±  3%      -2.6       28.17 ±  2%  perf-profile.children.cycles-pp.lock_page_lruvec_irqsave
     31.70 ±  3%      -2.6       29.14 ±  2%  perf-profile.children.cycles-pp.__pagevec_lru_add
     31.72 ±  3%      -2.6       29.16 ±  2%  perf-profile.children.cycles-pp.lru_cache_add
      4.13 ± 16%      -1.4        2.69 ± 19%  perf-profile.children.cycles-pp.get_page_from_freelist
      3.98 ± 17%      -1.4        2.55 ± 20%  perf-profile.children.cycles-pp.rmqueue
      3.73 ± 18%      -1.4        2.32 ± 22%  perf-profile.children.cycles-pp.rmqueue_bulk
      3.60 ± 18%      -1.4        2.19 ± 22%  perf-profile.children.cycles-pp._raw_spin_lock
      5.62 ±  5%      -0.5        5.09 ±  2%  perf-profile.children.cycles-pp.read_pages
      5.57 ±  5%      -0.5        5.05 ±  3%  perf-profile.children.cycles-pp.iomap_readahead_actor
      5.61 ±  5%      -0.5        5.09 ±  3%  perf-profile.children.cycles-pp.iomap_readahead
      5.60 ±  5%      -0.5        5.08 ±  3%  perf-profile.children.cycles-pp.iomap_apply
      5.36 ±  5%      -0.5        4.85 ±  3%  perf-profile.children.cycles-pp.iomap_readpage_actor
      2.93 ±  5%      -0.3        2.65 ±  3%  perf-profile.children.cycles-pp.memset_erms
      2.23 ±  5%      -0.2        2.01 ±  3%  perf-profile.children.cycles-pp.iomap_set_range_uptodate
      0.73 ±  2%      -0.1        0.67        perf-profile.children.cycles-pp.__list_del_entry_valid
      0.06 ± 11%      +0.0        0.08 ±  7%  perf-profile.children.cycles-pp.uncharge_page
      0.31 ±  7%      +0.0        0.35 ±  4%  perf-profile.children.cycles-pp.__count_memcg_events
      0.18 ± 10%      +0.0        0.23 ±  4%  perf-profile.children.cycles-pp.mem_cgroup_charge_statistics
      0.41 ±  7%      +0.0        0.46 ±  3%  perf-profile.children.cycles-pp.uncharge_batch
      0.16 ± 14%      +0.1        0.24 ±  7%  perf-profile.children.cycles-pp.propagate_protected_usage
      0.51 ±  7%      +0.1        0.60 ±  4%  perf-profile.children.cycles-pp.__mod_memcg_state
      0.97 ±  8%      +0.1        1.10 ±  3%  perf-profile.children.cycles-pp.__mod_memcg_lruvec_state
      0.68 ± 14%      +0.2        0.93 ±  6%  perf-profile.children.cycles-pp.page_counter_try_charge
      0.91 ± 14%      +0.3        1.25 ±  6%  perf-profile.children.cycles-pp.try_charge
      0.05 ±  8%      +3.5        3.55 ±  9%  perf-profile.children.cycles-pp.memcg_check_events
     38.12 ±  2%      +3.8       41.91 ±  2%  perf-profile.children.cycles-pp.add_to_page_cache_lru
      5.34 ± 19%      +6.3       11.62 ±  6%  perf-profile.children.cycles-pp.mem_cgroup_charge
      6.42 ± 17%      +6.4       12.77 ±  5%  perf-profile.children.cycles-pp.__add_to_page_cache_locked
     76.16 ±  2%      -5.1       71.05        perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      2.91 ±  5%      -0.3        2.63 ±  3%  perf-profile.self.cycles-pp.memset_erms
      2.20 ±  5%      -0.2        1.98 ±  3%  perf-profile.self.cycles-pp.iomap_set_range_uptodate
      0.20 ±  8%      -0.0        0.17 ±  4%  perf-profile.self.cycles-pp.rmqueue_bulk
      0.13 ±  4%      -0.0        0.11 ±  6%  perf-profile.self.cycles-pp.__remove_mapping
      0.06 ± 11%      +0.0        0.08 ±  7%  perf-profile.self.cycles-pp.uncharge_page
      0.31 ±  7%      +0.0        0.35 ±  4%  perf-profile.self.cycles-pp.__count_memcg_events
      0.16 ± 14%      +0.1        0.24 ±  9%  perf-profile.self.cycles-pp.propagate_protected_usage
      0.23 ± 13%      +0.1        0.32 ±  6%  perf-profile.self.cycles-pp.try_charge
      0.51 ±  7%      +0.1        0.60 ±  3%  perf-profile.self.cycles-pp.__mod_memcg_state
      0.58 ± 13%      +0.2        0.75 ±  6%  perf-profile.self.cycles-pp.page_counter_try_charge
      2.25 ± 25%      +1.8        4.05 ±  6%  perf-profile.self.cycles-pp.mem_cgroup_charge
      0.01 ±223%      +3.5        3.54 ±  9%  perf-profile.self.cycles-pp.memcg_check_events


                                                                                
                                vm-scalability.median                           
                                                                                
  134000 +------------------------------------------------------------------+   
         |                                         +.                       |   
  132000 |-+ .+          .+          +       .+   +  +  +                   |   
         |  +  :   +.++.+ :    +    + :     +  : +    :+ :                  |   
  130000 |-+:  :  +        :  + :  +  :     :  : :    +  :                  |   
         |.+    ++    O O  :.+  +.+ O  +.+.+O   +  O      +  O              |   
  128000 |-+     O         +   O                                            |   
         |  O      O O    O     O             O                      O    O |   
  126000 |-+  O O                    O               O         O    O       |   
         | O               O O                          O         O         |   
  124000 |-+                             O       O    O    O    O        O  |   
         |                        O                                    O    |   
  122000 |-+                                                                |   
         |                                      O         O                 |   
  120000 +------------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Oliver Sang


View attachment "config-5.11.0-00002-g4f09feb8bf08" of type "text/plain" (172445 bytes)

View attachment "job-script" of type "text/plain" (7952 bytes)

View attachment "job.yaml" of type "text/plain" (5151 bytes)

View attachment "reproduce" of type "text/plain" (18482 bytes)
