Date:   Wed, 15 Jun 2022 16:36:29 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Yu Kuai <yukuai3@...wei.com>
Cc:     0day robot <lkp@...el.com>, LKML <linux-kernel@...r.kernel.org>,
        linux-fsdevel@...r.kernel.org, lkp@...ts.01.org,
        ying.huang@...el.com, feng.tang@...el.com,
        zhengjun.xing@...ux.intel.com, fengwei.yin@...el.com,
        guobing.chen@...el.com, ming.a.chen@...el.com, frank.du@...el.com,
        Shuhua.Fan@...el.com, wangyang.guo@...el.com,
        Wenhuan.Huang@...el.com, jessica.ji@...el.com, shan.kang@...el.com,
        guangli.li@...el.com, tiejun.li@...el.com, yu.ma@...el.com,
        dapeng1.mi@...el.com, jiebin.sun@...el.com, gengxin.xie@...el.com,
        fan.zhao@...el.com, willy@...radead.org, akpm@...ux-foundation.org,
        kent.overstreet@...il.com, axboe@...nel.dk, linux-mm@...ck.org,
        yukuai3@...wei.com, yi.zhang@...wei.com
Subject: [mm/filemap]  8b157c14b5:
 phoronix-test-suite.fio.SequentialRead.LinuxAIO.Yes.Yes.4KB.DefaultTestDirectory.mb_s
 -8.1% regression



Greetings,

FYI, we noticed a -8.1% regression of phoronix-test-suite.fio.SequentialRead.LinuxAIO.Yes.Yes.4KB.DefaultTestDirectory.mb_s due to commit:


commit: 8b157c14b505f861cf8da783ff89f679a0e50abe ("[PATCH -next] mm/filemap: fix that first page is not mark accessed in filemap_read()")
url: https://github.com/intel-lab-lkp/linux/commits/Yu-Kuai/mm-filemap-fix-that-first-page-is-not-mark-accessed-in-filemap_read/20220602-161035
base: https://git.kernel.org/cgit/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/linux-fsdevel/20220602082129.2805890-1-yukuai3@huawei.com

in testcase: phoronix-test-suite
on test machine: 96 threads 2 sockets Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz with 512G memory
with the following parameters:

	test: fio-1.14.1
	option_a: Sequential Read
	option_b: Linux AIO
	option_c: Yes
	option_d: Yes
	option_e: 4KB
	option_f: Default Test Directory
	cpufreq_governor: performance
	ucode: 0x500320a

test-description: The Phoronix Test Suite is the most comprehensive testing and benchmarking platform available that provides an extensible framework for which new tests can be easily added.
test-url: http://www.phoronix-test-suite.com/



If you fix the issue, please add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>


Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp directories to run from a clean state.
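
When reproducing, it can help to watch the page-activation counters directly, since the proc-vmstat.* rows in the comparison below appear to correspond to fields of /proc/vmstat. The following is a minimal sketch, not part of lkp-tests: the helper name vmstat_counters is hypothetical, and the choice of counters (pgactivate, nr_active_file, nr_inactive_file) is an assumption based on the rows that changed most in this report.

```shell
# Hypothetical helper (not part of lkp-tests): print selected counters
# from a vmstat-style file. Defaults to /proc/vmstat when no file is given.
vmstat_counters() {
    awk '$1 == "pgactivate" || $1 == "nr_active_file" || $1 == "nr_inactive_file" { print $1, $2 }' \
        "${1:-/proc/vmstat}"
}

# Possible usage around a benchmark run (file names are illustrative):
#   vmstat_counters > before.txt
#   sudo bin/lkp run generated-yaml-file
#   vmstat_counters > after.txt
#   diff before.txt after.txt
```

A large jump in pgactivate between the two snapshots on the patched kernel would be consistent with the extra folio_mark_accessed() work shown in the profile.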

=========================================================================================
compiler/cpufreq_governor/kconfig/option_a/option_b/option_c/option_d/option_e/option_f/rootfs/tbox_group/test/testcase/ucode:
  gcc-11/performance/x86_64-rhel-8.3/Sequential Read/Linux AIO/Yes/Yes/4KB/Default Test Directory/debian-x86_64-phoronix/lkp-csl-2sp7/fio-1.14.1/phoronix-test-suite/0x500320a

commit: 
  2408f14000 ("Merge branch 'mm-nonmm-unstable' into mm-everything")
  8b157c14b5 ("mm/filemap: fix that first page is not mark accessed in filemap_read()")

2408f140000f9597 8b157c14b505f861cf8da783ff8 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
    481388            -8.1%     442333        phoronix-test-suite.fio.SequentialRead.LinuxAIO.Yes.Yes.4KB.DefaultTestDirectory.iops
      1880            -8.1%       1727        phoronix-test-suite.fio.SequentialRead.LinuxAIO.Yes.Yes.4KB.DefaultTestDirectory.mb_s
 2.894e+08            -8.1%  2.659e+08        phoronix-test-suite.time.file_system_inputs
      0.11 ± 22%      -0.0        0.08        mpstat.cpu.all.soft%
    292.39 ± 35%     -35.3%     189.30 ±  8%  sched_debug.cpu.clock_task.stddev
    933030           +47.4%    1374932 ±  2%  numa-meminfo.node0.Active
     92985 ± 16%    +478.0%     537464 ±  6%  numa-meminfo.node0.Active(file)
     23246 ± 16%    +475.4%     133769 ±  6%  numa-vmstat.node0.nr_active_file
     23246 ± 16%    +475.4%     133769 ±  6%  numa-vmstat.node0.nr_zone_active_file
   1181131            -8.1%    1085364        vmstat.io.bi
     20529            -7.4%      19019        vmstat.system.cs
    954480           +45.1%    1384840 ±  3%  meminfo.Active
    112134          +386.0%     544959 ±  7%  meminfo.Active(file)
   2756213           -13.9%    2371792        meminfo.Inactive
   1492877           -25.8%    1108430        meminfo.Inactive(file)
     84.17 ± 10%     -11.7%      74.33        turbostat.Avg_MHz
      4.72 ± 18%      -0.9        3.84        turbostat.Busy%
    854421 ±133%     -82.0%     154039 ± 20%  turbostat.C1
      0.49 ±155%      -0.4        0.06 ± 11%  turbostat.C1%
     28033          +386.2%     136307 ±  7%  proc-vmstat.nr_active_file
    373247           -25.8%     277108        proc-vmstat.nr_inactive_file
     28033          +386.2%     136308 ±  7%  proc-vmstat.nr_zone_active_file
    373247           -25.8%     277108        proc-vmstat.nr_zone_inactive_file
  40703167 ±  2%      -8.5%   37255189        proc-vmstat.numa_hit
  40122593            -7.5%   37096628        proc-vmstat.numa_local
    316253        +10501.8%   33528470        proc-vmstat.pgactivate
  40072448            -7.3%   37140540        proc-vmstat.pgalloc_normal
  39689252            -7.5%   36696525        proc-vmstat.pgfree
 1.447e+08            -8.1%   1.33e+08        proc-vmstat.pgpgin
     22.95 ± 52%     -53.5%      10.67        perf-stat.i.MPKI
 1.088e+09            -3.3%  1.052e+09        perf-stat.i.branch-instructions
  14531811 ± 29%     -30.9%   10047658        perf-stat.i.branch-misses
  31350962            -9.2%   28459348        perf-stat.i.cache-misses
  86567058 ± 24%     -29.3%   61243543        perf-stat.i.cache-references
     21004            -7.6%      19398        perf-stat.i.context-switches
 7.243e+09 ± 11%     -13.5%  6.262e+09        perf-stat.i.cpu-cycles
      0.14 ± 95%      -0.1        0.01 ± 10%  perf-stat.i.dTLB-load-miss-rate%
   1307140 ± 15%     +17.6%    1537276        perf-stat.i.iTLB-loads
 5.234e+09            -2.9%  5.084e+09        perf-stat.i.instructions
      2655 ±  5%     -10.9%       2366 ±  3%  perf-stat.i.instructions-per-iTLB-miss
     75383 ± 11%     -13.5%      65208        perf-stat.i.metric.GHz
   6029414            -6.2%    5655914        perf-stat.i.node-loads
     20.94 ± 15%      +3.7       24.66 ±  3%  perf-stat.i.node-store-miss-rate%
     82166 ± 23%     +29.0%     106019 ±  2%  perf-stat.i.node-store-misses
   6382540            -9.0%    5805257        perf-stat.i.node-stores
     16.54 ± 24%     -27.2%      12.04        perf-stat.overall.MPKI
      2862 ±  5%     -11.1%       2544 ±  3%  perf-stat.overall.instructions-per-iTLB-miss
      5.63 ± 15%      +1.0        6.67        perf-stat.overall.node-load-miss-rate%
      1.27 ± 23%      +0.5        1.79        perf-stat.overall.node-store-miss-rate%
 1.078e+09            -3.3%  1.043e+09        perf-stat.ps.branch-instructions
  14418791 ± 29%     -30.9%    9965662        perf-stat.ps.branch-misses
  31056696            -9.2%   28199667        perf-stat.ps.cache-misses
  85785810 ± 24%     -29.3%   60689278        perf-stat.ps.cache-references
     20807            -7.6%      19221        perf-stat.ps.context-switches
 7.181e+09 ± 11%     -13.5%  6.209e+09        perf-stat.ps.cpu-cycles
   1296058 ± 15%     +17.6%    1524338        perf-stat.ps.iTLB-loads
 5.189e+09            -2.9%   5.04e+09        perf-stat.ps.instructions
   5972497            -6.2%    5604175        perf-stat.ps.node-loads
     81503 ± 23%     +29.0%     105130 ±  2%  perf-stat.ps.node-store-misses
   6322173            -9.0%    5752078        perf-stat.ps.node-stores
 6.205e+11            -2.6%  6.041e+11        perf-stat.total.instructions
      7.61 ± 14%      -1.6        6.00 ± 13%  perf-profile.calltrace.cycles-pp.filemap_get_pages.filemap_read.aio_read.io_submit_one.__x64_sys_io_submit
      4.09 ± 14%      -0.8        3.27 ± 11%  perf-profile.calltrace.cycles-pp.invalidate_mapping_pagevec.generic_fadvise.ksys_fadvise64_64.__x64_sys_fadvise64.do_syscall_64
      4.10 ± 14%      -0.8        3.28 ± 11%  perf-profile.calltrace.cycles-pp.__x64_sys_fadvise64.do_syscall_64.entry_SYSCALL_64_after_hwframe
      4.10 ± 14%      -0.8        3.28 ± 11%  perf-profile.calltrace.cycles-pp.ksys_fadvise64_64.__x64_sys_fadvise64.do_syscall_64.entry_SYSCALL_64_after_hwframe
      4.10 ± 14%      -0.8        3.28 ± 11%  perf-profile.calltrace.cycles-pp.generic_fadvise.ksys_fadvise64_64.__x64_sys_fadvise64.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.46 ± 15%      -0.5        2.00 ± 11%  perf-profile.calltrace.cycles-pp.__x64_sys_io_getevents.do_syscall_64.entry_SYSCALL_64_after_hwframe
      2.35 ± 16%      -0.4        1.90 ± 11%  perf-profile.calltrace.cycles-pp.do_io_getevents.__x64_sys_io_getevents.do_syscall_64.entry_SYSCALL_64_after_hwframe
      1.68 ± 16%      -0.4        1.30 ± 15%  perf-profile.calltrace.cycles-pp.read_pages.page_cache_ra_unbounded.filemap_get_pages.filemap_read.aio_read
      1.75 ± 14%      -0.4        1.38 ± 10%  perf-profile.calltrace.cycles-pp.release_pages.__pagevec_release.invalidate_mapping_pagevec.generic_fadvise.ksys_fadvise64_64
      1.77 ± 14%      -0.4        1.40 ± 10%  perf-profile.calltrace.cycles-pp.__pagevec_release.invalidate_mapping_pagevec.generic_fadvise.ksys_fadvise64_64.__x64_sys_fadvise64
      0.89 ± 18%      -0.3        0.59 ± 46%  perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.filemap_read.aio_read.io_submit_one
      0.85 ± 18%      -0.3        0.57 ± 46%  perf-profile.calltrace.cycles-pp.ext4_mpage_readpages.read_pages.page_cache_ra_unbounded.filemap_get_pages.filemap_read
      1.49 ± 14%      -0.3        1.22 ± 12%  perf-profile.calltrace.cycles-pp.folio_alloc.page_cache_ra_unbounded.filemap_get_pages.filemap_read.aio_read
      1.32 ± 13%      -0.2        1.08 ± 12%  perf-profile.calltrace.cycles-pp.__alloc_pages.folio_alloc.page_cache_ra_unbounded.filemap_get_pages.filemap_read
      0.98 ± 13%      -0.2        0.76 ± 11%  perf-profile.calltrace.cycles-pp.free_unref_page_list.release_pages.__pagevec_release.invalidate_mapping_pagevec.generic_fadvise
      1.05 ± 12%      -0.2        0.84 ± 14%  perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.folio_alloc.page_cache_ra_unbounded.filemap_get_pages
      0.90 ± 10%      -0.2        0.72 ± 14%  perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.folio_alloc.page_cache_ra_unbounded
      0.75 ± 15%      -0.2        0.59 ± 10%  perf-profile.calltrace.cycles-pp.free_unref_page_commit.free_unref_page_list.release_pages.__pagevec_release.invalidate_mapping_pagevec
      1.53 ± 17%      +0.4        1.95 ±  9%  perf-profile.calltrace.cycles-pp.schedule.worker_thread.kthread.ret_from_fork
      1.53 ± 17%      +0.4        1.95 ±  9%  perf-profile.calltrace.cycles-pp.__schedule.schedule.worker_thread.kthread.ret_from_fork
      0.00            +1.2        1.17 ± 18%  perf-profile.calltrace.cycles-pp.pagevec_lru_move_fn.folio_mark_accessed.filemap_read.aio_read.io_submit_one
      0.31 ±101%      +2.2        2.47 ± 17%  perf-profile.calltrace.cycles-pp.folio_mark_accessed.filemap_read.aio_read.io_submit_one.__x64_sys_io_submit
      7.61 ± 14%      -1.6        6.00 ± 13%  perf-profile.children.cycles-pp.filemap_get_pages
      4.10 ± 14%      -0.8        3.28 ± 11%  perf-profile.children.cycles-pp.__x64_sys_fadvise64
      4.10 ± 14%      -0.8        3.28 ± 11%  perf-profile.children.cycles-pp.ksys_fadvise64_64
      4.10 ± 14%      -0.8        3.28 ± 11%  perf-profile.children.cycles-pp.generic_fadvise
      4.10 ± 14%      -0.8        3.28 ± 11%  perf-profile.children.cycles-pp.invalidate_mapping_pagevec
      2.47 ± 15%      -0.5        2.00 ± 11%  perf-profile.children.cycles-pp.__x64_sys_io_getevents
      2.36 ± 16%      -0.5        1.90 ± 11%  perf-profile.children.cycles-pp.do_io_getevents
      1.68 ± 16%      -0.4        1.30 ± 15%  perf-profile.children.cycles-pp.read_pages
      1.77 ± 14%      -0.4        1.40 ± 10%  perf-profile.children.cycles-pp.__pagevec_release
      1.49 ± 14%      -0.3        1.22 ± 12%  perf-profile.children.cycles-pp.folio_alloc
      1.40 ± 12%      -0.3        1.14 ± 12%  perf-profile.children.cycles-pp.__alloc_pages
      1.16 ± 15%      -0.3        0.90 ± 13%  perf-profile.children.cycles-pp.lookup_ioctx
      0.90 ± 18%      -0.2        0.67 ± 17%  perf-profile.children.cycles-pp.filemap_get_read_batch
      1.00 ± 12%      -0.2        0.78 ± 12%  perf-profile.children.cycles-pp.free_unref_page_list
      1.08 ± 11%      -0.2        0.86 ± 14%  perf-profile.children.cycles-pp.get_page_from_freelist
      0.85 ± 18%      -0.2        0.65 ± 15%  perf-profile.children.cycles-pp.ext4_mpage_readpages
      0.88 ± 16%      -0.2        0.70 ± 14%  perf-profile.children.cycles-pp.__might_resched
      0.93 ± 10%      -0.2        0.75 ± 14%  perf-profile.children.cycles-pp.rmqueue
      0.78 ± 15%      -0.2        0.61 ± 10%  perf-profile.children.cycles-pp.free_unref_page_commit
      0.61 ± 16%      -0.1        0.48 ± 12%  perf-profile.children.cycles-pp.free_pcppages_bulk
      0.27 ± 15%      -0.1        0.20 ± 11%  perf-profile.children.cycles-pp.hrtimer_next_event_without
      0.16 ± 22%      -0.1        0.11 ± 19%  perf-profile.children.cycles-pp.hrtimer_update_next_event
      0.08 ± 20%      -0.0        0.04 ± 47%  perf-profile.children.cycles-pp.mem_cgroup_charge_statistics
      0.08 ±  9%      -0.0        0.05 ± 47%  perf-profile.children.cycles-pp.tick_program_event
      1.46 ± 13%      +0.3        1.76 ±  8%  perf-profile.children.cycles-pp.load_balance
      0.00            +0.4        0.43 ± 16%  perf-profile.children.cycles-pp.workingset_age_nonresident
      0.00            +0.7        0.65 ± 17%  perf-profile.children.cycles-pp.workingset_activation
      0.00            +0.7        0.67 ± 17%  perf-profile.children.cycles-pp.__folio_activate
      0.00            +1.2        1.18 ± 18%  perf-profile.children.cycles-pp.pagevec_lru_move_fn
      0.57 ± 17%      +1.9        2.51 ± 17%  perf-profile.children.cycles-pp.folio_mark_accessed
      4.33 ± 17%      -0.9        3.45 ± 13%  perf-profile.self.cycles-pp.copy_user_enhanced_fast_string
      1.36 ± 12%      -0.5        0.84 ± 36%  perf-profile.self.cycles-pp.menu_select
      0.64 ± 17%      -0.2        0.45 ± 18%  perf-profile.self.cycles-pp.filemap_get_read_batch
      0.86 ± 16%      -0.2        0.67 ± 13%  perf-profile.self.cycles-pp.__might_resched
      0.46 ± 19%      -0.1        0.32 ± 18%  perf-profile.self.cycles-pp.__get_user_4
      0.34 ± 10%      -0.1        0.24 ±  3%  perf-profile.self.cycles-pp.copy_page_to_iter
      0.14 ± 17%      -0.0        0.09 ± 32%  perf-profile.self.cycles-pp.aio_prep_rw
      0.11 ± 14%      -0.0        0.07 ± 23%  perf-profile.self.cycles-pp._raw_spin_unlock_irqrestore
      0.08 ± 12%      -0.0        0.04 ± 73%  perf-profile.self.cycles-pp.tick_program_event
      0.14 ±  9%      -0.0        0.10 ± 11%  perf-profile.self.cycles-pp.atime_needs_update
      0.00            +0.2        0.22 ± 26%  perf-profile.self.cycles-pp.workingset_activation
      0.00            +0.3        0.29 ± 19%  perf-profile.self.cycles-pp.pagevec_lru_move_fn
      0.00            +0.4        0.35 ± 16%  perf-profile.self.cycles-pp.__folio_activate
      0.00            +0.4        0.43 ± 16%  perf-profile.self.cycles-pp.workingset_age_nonresident
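
For rows reported as a percentage, the %change column in the table above looks like the relative difference between the two commit columns, (new - base) / base * 100. A minimal sketch for re-deriving it from the printed values, assuming that formula (the helper name pct_change is hypothetical; rows shown without a % sign, such as the perf-profile entries, appear to be absolute differences and are not covered here):

```shell
# Hypothetical helper: relative change from base to new, in percent,
# rounded to one decimal place and always printed with a sign.
pct_change() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%+.1f\n", (new - base) / base * 100 }'
}

pct_change 481388 442333     # iops row
pct_change 316253 33528470   # proc-vmstat.pgactivate row
```

With the values from the table, the first call yields -8.1 and the second +10501.8, matching the reported iops regression and the pgactivate increase.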




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://01.org/lkp



View attachment "config-5.18.0-12100-g8b157c14b505" of type "text/plain" (163901 bytes)

View attachment "job-script" of type "text/plain" (8131 bytes)

View attachment "job.yaml" of type "text/plain" (5219 bytes)

View attachment "reproduce" of type "text/plain" (296 bytes)
