Message-ID: <20210908054503.GB839@xsang-OptiPlex-9020>
Date:   Wed, 8 Sep 2021 13:45:03 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Alex Shi <alex.shi@...ux.alibaba.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Hugh Dickins <hughd@...gle.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Alexander Duyck <alexander.duyck@...il.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Andrey Ryabinin <aryabinin@...tuozzo.com>,
        "Chen, Rong A" <rong.a.chen@...el.com>,
        Daniel Jordan <daniel.m.jordan@...cle.com>,
        "Huang, Ying" <ying.huang@...el.com>, Jann Horn <jannh@...gle.com>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Konstantin Khlebnikov <khlebnikov@...dex-team.ru>,
        Matthew Wilcox <willy@...radead.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Michal Hocko <mhocko@...nel.org>,
        Michal Hocko <mhocko@...e.com>,
        Mika Penttilä <mika.penttila@...tfour.com>,
        Minchan Kim <minchan@...nel.org>,
        Shakeel Butt <shakeelb@...gle.com>, Tejun Heo <tj@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Wei Yang <richard.weiyang@...il.com>,
        Yang Shi <yang.shi@...ux.alibaba.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com, feng.tang@...el.com, zhengjun.xing@...ux.intel.com
Subject: [mm/lru]  75cc3c9161:  fio.read_iops -4.7% regression



Greetings,

FYI, we noticed a -4.7% regression in fio.read_iops due to the following commit:


commit: 75cc3c9161cd95f43ebf6c6a938d4d98ab195bbd ("mm/lru: move lock into lru_note_cost")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: fio-basic
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 256G memory
with the following parameters:

	disk: 2pmem
	fs: ext4
	runtime: 200s
	nr_task: 50%
	time_based: tb
	rw: randread
	bs: 4k
	ioengine: mmap
	test_size: 200G
	cpufreq_governor: performance
	ucode: 0x5003006

test-description: Fio is a tool that will spawn a number of threads or processes doing a particular type of I/O action as specified by the user.
test-url: https://github.com/axboe/fio



If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>


Details are as below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install                job.yaml  # job file is attached in this email
        bin/lkp split-job --compatible job.yaml  # generate the yaml file for lkp run
        bin/lkp run                    generated-yaml-file

=========================================================================================
bs/compiler/cpufreq_governor/disk/fs/ioengine/kconfig/nr_task/rootfs/runtime/rw/tbox_group/test_size/testcase/time_based/ucode:
  4k/gcc-9/performance/2pmem/ext4/mmap/x86_64-rhel-8.3/50%/debian-10.4-x86_64-20200603.cgz/200s/randread/lkp-csl-2sp6/200G/fio-basic/tb/0x5003006

commit: 
  c7c7b80c39 ("mm/swap.c: fold vm event PGROTATED into pagevec_move_tail_fn")
  75cc3c9161 ("mm/lru: move lock into lru_note_cost")

c7c7b80c39a18d99 75cc3c9161cd95f43ebf6c6a938 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      0.06            +0.0        0.06        fio.latency_20ms%
      0.17 ±  6%      -0.1        0.10 ± 12%  fio.latency_250us%
      2.42 ±  5%      +0.3        2.73 ±  4%  fio.latency_50us%
     10762            -4.7%      10251        fio.read_bw_MBps
     15928            +5.4%      16792        fio.read_clat_mean_us
    620449 ±  4%     +15.2%     714702 ±  5%  fio.read_clat_stddev
   2755207            -4.7%    2624496        fio.read_iops
 4.356e+09            -4.7%   4.15e+09        fio.time.file_system_inputs
    548995           -13.6%     474105        fio.time.involuntary_context_switches
 5.445e+08            -4.7%  5.188e+08        fio.time.major_page_faults
 5.512e+08            -4.7%  5.252e+08        fio.workload
      2.60            -4.2%       2.50        iostat.cpu.user
    993.70 ±  5%      -9.6%     898.57 ±  5%  sched_debug.cfs_rq:/.util_est_enqueued.max
    148.41            -1.2%     146.66        turbostat.RAMWatt
    213.43 ±  3%     -34.7%     139.43 ±  5%  numa-vmstat.node0.nr_isolated_file
    210.57 ±  5%     -33.0%     141.14 ±  5%  numa-vmstat.node1.nr_isolated_file
  10692349            -4.9%   10171517        vmstat.io.bi
      7671            -9.8%       6917        vmstat.system.cs
     42.20 ±  3%      +7.6%      45.42 ±  3%  perf-sched.total_wait_and_delay.average.ms
     42.18 ±  3%      +7.6%      45.40 ±  3%  perf-sched.total_wait_time.average.ms
     10233 ±  2%     -17.2%       8477 ±  3%  perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.__alloc_pages_slowpath
    459.86 ±  7%     +36.0%     625.57 ± 14%  perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.down_read
     20707 ±  4%      -9.2%      18791 ±  5%  perf-sched.wait_and_delay.count.__sched_text_start.__sched_text_start.worker_thread.kthread.ret_from_fork
      0.01 ±  8%  +11171.4%       1.24 ±178%  perf-sched.wait_time.avg.ms.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.__alloc_pages_nodemask
      0.02 ± 71%  +1.7e+05%      36.36 ±177%  perf-sched.wait_time.max.ms.__sched_text_start.__sched_text_start.preempt_schedule_common._cond_resched.__alloc_pages_nodemask
  46297815            -1.3%   45718611        interrupts.CAL:Function_call_interrupts
    536.29 ± 11%     -18.9%     434.71 ± 12%  interrupts.CPU13.RES:Rescheduling_interrupts
    562.71 ± 13%     -18.5%     458.43 ±  7%  interrupts.CPU16.RES:Rescheduling_interrupts
    757634 ±  9%     -13.6%     654501 ±  6%  interrupts.CPU16.TLB:TLB_shootdowns
    536.00 ± 13%     -18.3%     438.00 ±  5%  interrupts.CPU17.RES:Rescheduling_interrupts
    550.57 ±  9%     -21.3%     433.57 ±  8%  interrupts.CPU18.RES:Rescheduling_interrupts
      4251 ± 18%     +60.7%       6833 ±  9%  interrupts.CPU25.NMI:Non-maskable_interrupts
      4251 ± 18%     +60.7%       6833 ±  9%  interrupts.CPU25.PMI:Performance_monitoring_interrupts
    506.29 ± 11%     -21.9%     395.57 ± 12%  interrupts.CPU35.RES:Rescheduling_interrupts
    772187 ± 11%     -17.0%     640700 ±  9%  interrupts.CPU35.TLB:TLB_shootdowns
    752779 ± 10%     -23.8%     573337 ± 19%  interrupts.CPU37.TLB:TLB_shootdowns
    374466            -4.8%     356349        proc-vmstat.allocstall_movable
      8293 ±  2%      -6.9%       7723 ±  2%  proc-vmstat.kswapd_low_wmark_hit_quickly
    426.29           -34.5%     279.14 ±  4%  proc-vmstat.nr_isolated_file
 4.288e+08            -5.8%  4.039e+08        proc-vmstat.numa_hit
 4.287e+08            -5.8%  4.038e+08        proc-vmstat.numa_local
      8297 ±  2%      -6.9%       7727 ±  2%  proc-vmstat.pageoutrun
  20856484            -4.5%   19927281        proc-vmstat.pgalloc_dma32
  5.25e+08            -4.7%      5e+08        proc-vmstat.pgalloc_normal
  1.09e+09            -4.7%  1.038e+09        proc-vmstat.pgfault
 5.355e+08            -4.8%  5.097e+08        proc-vmstat.pgfree
 5.445e+08            -4.7%  5.187e+08        proc-vmstat.pgmajfault
 2.178e+09            -4.7%  2.075e+09        proc-vmstat.pgpgin
 9.606e+08            -5.0%  9.122e+08        proc-vmstat.pgscan_direct
 1.079e+09            -4.7%  1.028e+09        proc-vmstat.pgscan_file
 4.938e+08            -4.8%  4.698e+08        proc-vmstat.pgsteal_direct
 5.345e+08            -4.8%  5.087e+08        proc-vmstat.pgsteal_file
  40747069 ±  2%      -4.6%   38881007        proc-vmstat.pgsteal_kswapd
  33706519            -4.6%   32144491        proc-vmstat.workingset_refault_file
     22.69 ±  9%     -11.5       11.23 ±  9%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.shrink_inactive_list.shrink_lruvec.shrink_node
     21.89 ±  9%     -10.7       11.20 ±  9%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.shrink_inactive_list.shrink_lruvec.shrink_node.do_try_to_free_pages
     11.69 ± 10%      -2.6        9.05 ±  9%  perf-profile.calltrace.cycles-pp.shrink_page_list.shrink_inactive_list.shrink_lruvec.shrink_node.do_try_to_free_pages
      4.06 ± 12%      -1.8        2.27 ±  9%  perf-profile.calltrace.cycles-pp.try_to_unmap_flush.shrink_page_list.shrink_inactive_list.shrink_lruvec.shrink_node
      4.06 ± 12%      -1.8        2.27 ±  9%  perf-profile.calltrace.cycles-pp.arch_tlbbatch_flush.try_to_unmap_flush.shrink_page_list.shrink_inactive_list.shrink_lruvec
      4.06 ± 12%      -1.8        2.27 ±  9%  perf-profile.calltrace.cycles-pp.on_each_cpu_cond_mask.arch_tlbbatch_flush.try_to_unmap_flush.shrink_page_list.shrink_inactive_list
      3.96 ± 12%      -1.7        2.22 ± 10%  perf-profile.calltrace.cycles-pp.smp_call_function_many_cond.on_each_cpu_cond_mask.arch_tlbbatch_flush.try_to_unmap_flush.shrink_page_list
      0.00           +12.5       12.46 ±  8%  perf-profile.calltrace.cycles-pp.lru_note_cost.shrink_inactive_list.shrink_lruvec.shrink_node.do_try_to_free_pages
      0.00           +12.5       12.50 ± 10%  perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irq.lru_note_cost.shrink_inactive_list.shrink_lruvec
      0.00           +12.6       12.56 ± 10%  perf-profile.calltrace.cycles-pp._raw_spin_lock_irq.lru_note_cost.shrink_inactive_list.shrink_lruvec.shrink_node
     12.95 ± 10%      -2.9       10.09 ±  9%  perf-profile.children.cycles-pp.shrink_page_list
      4.34 ± 12%      -1.9        2.46 ±  9%  perf-profile.children.cycles-pp.try_to_unmap_flush
      4.34 ± 12%      -1.9        2.46 ±  9%  perf-profile.children.cycles-pp.arch_tlbbatch_flush
      4.34 ± 12%      -1.9        2.46 ±  9%  perf-profile.children.cycles-pp.on_each_cpu_cond_mask
      4.24 ± 12%      -1.8        2.41 ± 10%  perf-profile.children.cycles-pp.smp_call_function_many_cond
      2.77 ± 10%      -0.3        2.44 ± 10%  perf-profile.children.cycles-pp.page_referenced
      2.09 ± 10%      -0.2        1.84 ± 10%  perf-profile.children.cycles-pp.page_referenced_one
      1.98 ± 10%      -0.2        1.75 ± 10%  perf-profile.children.cycles-pp.page_vma_mapped_walk
      0.98 ± 10%      -0.1        0.85 ± 10%  perf-profile.children.cycles-pp.isolate_lru_pages
      0.37 ± 11%      -0.0        0.32 ±  9%  perf-profile.children.cycles-pp.sync_regs
      0.09 ± 15%      -0.0        0.05 ±  6%  perf-profile.children.cycles-pp.smp_call_function_single
      0.06 ± 10%     +13.0       13.02 ±  8%  perf-profile.children.cycles-pp.lru_note_cost
      4.09 ± 12%      -1.8        2.28 ± 10%  perf-profile.self.cycles-pp.smp_call_function_many_cond
      3.96 ± 11%      -0.5        3.42 ± 10%  perf-profile.self.cycles-pp.filemap_map_pages
      1.58 ± 10%      -0.2        1.40 ± 10%  perf-profile.self.cycles-pp.page_vma_mapped_walk
      0.30 ± 11%      -0.1        0.23 ± 16%  perf-profile.self.cycles-pp.__remove_mapping
      0.09 ± 18%      -0.0        0.04 ± 40%  perf-profile.self.cycles-pp.smp_call_function_single
      0.36 ± 10%      -0.0        0.32 ±  9%  perf-profile.self.cycles-pp.sync_regs
      0.16 ± 10%      -0.0        0.13 ± 14%  perf-profile.self.cycles-pp.move_pages_to_lru
      0.06 ± 10%      +0.0        0.08 ± 10%  perf-profile.self.cycles-pp.lru_note_cost
      0.12 ± 12%      +0.1        0.17 ± 14%  perf-profile.self.cycles-pp._raw_spin_lock_irq
 1.311e+10            -3.7%  1.262e+10        perf-stat.i.branch-instructions
 1.148e+08            -4.1%  1.101e+08        perf-stat.i.branch-misses
 4.887e+08            -5.1%  4.637e+08        perf-stat.i.cache-misses
 6.388e+08            -4.3%  6.115e+08        perf-stat.i.cache-references
      7634           -10.1%       6867        perf-stat.i.context-switches
      2.24            +4.1%       2.33        perf-stat.i.cpi
    329.58            +5.5%     347.60        perf-stat.i.cycles-between-cache-misses
 1.598e+10            -3.8%  1.536e+10        perf-stat.i.dTLB-loads
 8.453e+09            -4.6%  8.062e+09        perf-stat.i.dTLB-stores
   2862085            -3.2%    2770116        perf-stat.i.iTLB-loads
 6.452e+10            -3.9%  6.201e+10        perf-stat.i.instructions
      0.46            -3.9%       0.45        perf-stat.i.ipc
   2707617            -4.7%    2579975        perf-stat.i.major-faults
    398.00            -4.0%     381.98        perf-stat.i.metric.M/sec
  70003070 ±  2%      -5.1%   66402385 ±  2%  perf-stat.i.node-stores
   2711171            -4.7%    2583543        perf-stat.i.page-faults
      2.15            +4.1%       2.24        perf-stat.overall.cpi
    283.80            +5.4%     299.24        perf-stat.overall.cycles-between-cache-misses
      0.47            -3.9%       0.45        perf-stat.overall.ipc
     23424            +1.1%      23674        perf-stat.overall.path-length
 1.304e+10            -3.7%  1.256e+10        perf-stat.ps.branch-instructions
 1.142e+08            -4.1%  1.095e+08        perf-stat.ps.branch-misses
 4.864e+08            -5.2%  4.613e+08        perf-stat.ps.cache-misses
 6.357e+08            -4.3%  6.083e+08        perf-stat.ps.cache-references
      7594           -10.0%       6832        perf-stat.ps.context-switches
  1.59e+10            -3.9%  1.528e+10        perf-stat.ps.dTLB-loads
 8.412e+09            -4.7%   8.02e+09        perf-stat.ps.dTLB-stores
   2847228            -3.2%    2755506        perf-stat.ps.iTLB-loads
  6.42e+10            -3.9%  6.169e+10        perf-stat.ps.instructions
   2694573            -4.8%    2566453        perf-stat.ps.major-faults
  69668649 ±  2%      -5.2%   66048410 ±  2%  perf-stat.ps.node-stores
   2698105            -4.7%    2570000        perf-stat.ps.page-faults
 1.291e+13            -3.7%  1.243e+13        perf-stat.total.instructions
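
The headline figures in the table above can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not part of the LKP tooling. The helper names are hypothetical, and it assumes that fio.read_bw_MBps is reported in MiB/s and that each I/O moves exactly one 4k block. It recomputes the %change column for two metrics, derives bandwidth from IOPS, and sums the spin-lock slowpath share across the old and new call paths under shrink_inactive_list.

```python
# Values copied from the comparison table (parent c7c7b80c39 vs. 75cc3c9161).
BLOCK_SIZE_BYTES = 4 * 1024  # "bs: 4k" from the job parameters


def pct_change(base: float, new: float) -> float:
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


def iops_to_mib_per_s(iops: float, block_size: int = BLOCK_SIZE_BYTES) -> float:
    """Bandwidth implied by an IOPS figure (assumption: MBps means MiB/s)."""
    return iops * block_size / (1024 * 1024)


read_iops = (2755207, 2624496)       # fio.read_iops
read_clat_mean = (15928, 16792)      # fio.read_clat_mean_us

# perf-profile.calltrace.cycles-pp for native_queued_spin_lock_slowpath:
spin_before = 22.69         # reached via shrink_inactive_list only
spin_after = 11.23 + 12.50  # same path plus the new path via lru_note_cost

if __name__ == "__main__":
    print(f"read_iops change:      {pct_change(*read_iops):+.1f}%")
    print(f"read_clat_mean change: {pct_change(*read_clat_mean):+.1f}%")
    for label, iops in zip(("parent", "75cc3c9161"), read_iops):
        print(f"{label:>10}: {iops_to_mib_per_s(iops):.1f} MiB/s implied by IOPS")
    print(f"spin-lock slowpath share: {spin_before:.2f} -> {spin_after:.2f}")
```

Under these assumptions the implied bandwidth agrees with the fio.read_bw_MBps row. The combined spin-lock share under shrink_inactive_list stays in the same range (22.69 before, 23.73 after), with roughly half of it now attributed to lru_note_cost. The perf-profile rows carry a stddev of about 9 to 10%, so the small difference between the two totals should not be read as significant.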


                                                                                
                                  fio.read_bw_MBps                              
                                                                                
  11000 +-------------------------------------------------------------------+   
        |      + : :   +                   +       +  ::   +.+.             |   
  10800 |-+   : :: :  : :                 ::      : +: : .+    ++.   +. .+ .|   
        |.++. : +   + : : +. .++.+.+ .+.+ : :.+.+ :  +  +         +.+  +  + |   
  10600 |-+  +       +   +  +       +    +  +    +                          |   
        |                                                                   |   
  10400 |-+                                                                 |   
        |      O               O                   O OO   OO   OO           |   
  10200 |-+O      O  O    O         O   O  O    O       O    O              |   
        |           O  O                    O    O                          |   
  10000 |-O     O        O         O                                        |   
        |    O                   O    O  O    O                             |   
   9800 |-+                 O O                                             |   
        |                                                                   |   
   9600 +-------------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                     fio.read_iops                              
                                                                                
  2.85e+06 +----------------------------------------------------------------+   
           |         +                                 +                    |   
   2.8e+06 |-+      ::   +                             ::   +.+             |   
           |     +. : :  ::                 +       +.: :   :  :     .+ .+  |   
  2.75e+06 |-++. : +  : : :         +.   +. :+  .+. : +  :.+   +.++.+  +  +.|   
           |.+  +     +.+  ++.+.++.+  +.+  +  ++   +     +                  |   
   2.7e+06 |-+                                                              |   
           |                                                                |   
  2.65e+06 |-+   O                                  O OO   O                |   
           |  O      O           O       O  O    O          O  O O          |   
   2.6e+06 |-+          OO  O         O       O    O     O    O             |   
           |       O  O             O                                       |   
  2.55e+06 |-O  O          O               O   O                            |   
           |                    O  O    O                                   |   
   2.5e+06 +----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                                     fio.workload                               
                                                                                
  5.7e+08 +-----------------------------------------------------------------+   
          |         +                                  +                    |   
  5.6e+08 |-+      ::   +                             : :   +.+             |   
          |     +. : :  ::                  +       + : :  +   :     .+ .+  |   
  5.5e+08 |-++. : +  : : :         .+   .+ + :  .+ + +   :+    +.++.+  +  +.|   
          |.+  +     +.+  +.++.+.++  +.+  +  +.+  +      +                  |   
  5.4e+08 |-+                                                               |   
          |                                                                 |   
  5.3e+08 |-+   O                                   OO O  O                 |   
          |  O      O            O       O  O    O          O  O O          |   
  5.2e+08 |-+          OO   O        O       O    O      O    O             |   
          |       O  O              O                                       |   
  5.1e+08 |-O  O          O               O    O                            |   
          |                    O  O    O                                    |   
    5e+08 +-----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                              fio.time.major_page_faults                        
                                                                                
  5.6e+08 +-----------------------------------------------------------------+   
          |         +                                  +                    |   
  5.5e+08 |-+   +  ::   +                   +       + : :  .+.+             |   
          |  +. :+ : :  ::              .+ +:   .+ + :: : +    +.++.+.+ .++.|   
  5.4e+08 |.+  +  +  +.+ : .+ .+.++.+ .+  +  :.+  +  +   +             +    |   
          |               +  +       +       +                              |   
  5.3e+08 |-+                                                               |   
          |     O                                   OO O  O                 |   
  5.2e+08 |-+O      O            O       O  O    O          O  O O          |   
          |            OO   O        O                   O    O             |   
  5.1e+08 |-+        O              O        O    O                         |   
          | O  O  O       O               O    O                            |   
    5e+08 |-+                  O  O    O                                    |   
          |                  O                                              |   
  4.9e+08 +-----------------------------------------------------------------+   
                                                                                
                                                                                                                                                                
                              fio.time.file_system_inputs                       
                                                                                
  4.45e+09 +----------------------------------------------------------------+   
   4.4e+09 |-+      ::   +                             ::   +.+             |   
           |     +. : :  ::                 +       +.: :  +   :.+   .+ .+  |   
  4.35e+09 |-++. : +  : : :         +.   +. :+  .+. : +  :+    +  +.+  +  +.|   
   4.3e+09 |.+  +     +.+  ++.+.++.+  +.+  +  ++   +     +                  |   
           |                                                                |   
  4.25e+09 |-+                                                              |   
   4.2e+09 |-+                                         O   O                |   
  4.15e+09 |-+O  O               O                  O O     O  O            |   
           |         O  O   O         O  O  O    O       O    O  O          |   
   4.1e+09 |-+        O  O                    O    O                        |   
  4.05e+09 |-O     O       O        O                                       |   
           |    O                  O    O  O   O                            |   
     4e+09 |-+                O O                                           |   
  3.95e+09 +----------------------------------------------------------------+   
                                                                                
                                                                                
[+] bisect-good sample
[O] bisect-bad  sample



Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang


View attachment "config-5.10.0-03419-g75cc3c9161cd" of type "text/plain" (171273 bytes)

View attachment "job-script" of type "text/plain" (8573 bytes)

View attachment "job.yaml" of type "text/plain" (5858 bytes)

View attachment "reproduce" of type "text/plain" (915 bytes)
