Message-ID: <20201226111948.GB11697@xsang-OptiPlex-9020>
Date:   Sat, 26 Dec 2020 19:19:48 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Alex Kogan <alex.kogan@...cle.com>
Cc:     0day robot <lkp@...el.com>,
        Steve Sistare <steven.sistare@...cle.com>,
        Waiman Long <longman@...hat.com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        ying.huang@...el.com, feng.tang@...el.com, zhengjun.xing@...el.com,
        linux@...linux.org.uk, peterz@...radead.org, mingo@...hat.com,
        will.deacon@....com, arnd@...db.de, linux-arch@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org, tglx@...utronix.de,
        bp@...en8.de, hpa@...or.com, x86@...nel.org, guohanjun@...wei.com,
        jglauber@...vell.com, daniel.m.jordan@...cle.com,
        alex.kogan@...cle.com, dave.dice@...cle.com
Subject: [locking/qspinlock]  0e8d8f4f12:  fsmark.files_per_sec 213.9%
 improvement


Greetings,

FYI, we noticed a 213.9% improvement in fsmark.files_per_sec due to commit:


commit: 0e8d8f4f1214cfbac219d6917b5f6460f818bb7c ("[PATCH v13 3/6] locking/qspinlock: Introduce CNA into the slow path of qspinlock")
url: https://github.com/0day-ci/linux/commits/Alex-Kogan/Add-NUMA-awareness-to-qspinlock/20201223-135025
base: https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git cb262935a166bdef0ccfe6e2adffa00c0f2d038a

in testcase: fsmark
on test machine: 192 threads Intel(R) Xeon(R) Platinum 9242 CPU @ 2.30GHz with 192G memory
with following parameters:

	iterations: 1x
	nr_threads: 64t
	disk: 1BRD_48G
	fs: btrfs
	filesize: 4M
	test_size: 24G
	sync_method: NoSync
	cpufreq_governor: performance
	ucode: 0x5003003
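
As a rough sanity check on the workload size, the parameters above imply about 6144 files of 4M each. This is only a sketch: it assumes that test_size and filesize use binary units and that the total is split evenly across the 64 threads, neither of which is stated in this report. The helper name parse_size is hypothetical.

```python
# Hypothetical helper: estimate how many files the job writes.
# Assumes binary units (1G = 1024M) and an even split across threads.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}

def parse_size(text):
    """Convert a size such as '4M' or '24G' to bytes."""
    return int(text[:-1]) * UNITS[text[-1]]

test_size = parse_size("24G")   # test_size parameter
filesize = parse_size("4M")     # filesize parameter
nr_threads = 64                 # nr_threads parameter ("64t")

total_files = test_size // filesize
files_per_thread = total_files // nr_threads
print(total_files, files_per_thread)
```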

test-description: fsmark is a file system benchmark for testing synchronous write workloads, such as mail server workloads.
test-url: https://sourceforge.net/projects/fsmark/

In addition, the commit has a significant impact on the following tests:

+------------------+------------------------------------------------------------------------+
| testcase: change | reaim: reaim.jobs_per_min 96.1% improvement                            |
| test machine     | 144 threads Intel(R) Xeon(R) Gold 5318H CPU @ 2.50GHz with 128G memory |
| test parameters  | cpufreq_governor=performance                                           |
|                  | nr_task=100%                                                           |
|                  | runtime=300s                                                           |
|                  | test=new_fserver                                                       |
|                  | ucode=0x5003003                                                        |
+------------------+------------------------------------------------------------------------+




Details are below:
-------------------------------------------------------------------------------------------------->


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp install job.yaml  # job file is attached in this email
        bin/lkp run     job.yaml

=========================================================================================
compiler/cpufreq_governor/disk/filesize/fs/iterations/kconfig/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase/ucode:
  gcc-9/performance/1BRD_48G/4M/btrfs/1x/x86_64-rhel-8.3/64t/debian-10.4-x86_64-20200603.cgz/NoSync/lkp-csl-2ap2/24G/fsmark/0x5003003

commit: 
  cb45bab007 ("locking/qspinlock: Refactor the qspinlock slow path")
  0e8d8f4f12 ("locking/qspinlock: Introduce CNA into the slow path of qspinlock")

cb45bab007ff0cfc 0e8d8f4f1214cfbac219d6917b5 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
      1180 ±  2%    +213.9%       3706 ± 11%  fsmark.files_per_sec
    773.00 ±  4%     -30.2%     539.75 ±  7%  fsmark.time.percent_of_cpu_this_job_got
    137.19 ±  3%     -28.5%      98.13 ±  8%  fsmark.time.system_time
     48917 ±  4%      -9.0%      44537 ±  8%  meminfo.AnonHugePages
      4.83 ±  4%      -1.0        3.83 ±  5%  mpstat.cpu.all.sys%
     93.48            +1.1%      94.51        iostat.cpu.idle
      5.76 ±  2%     -16.9%       4.79 ±  6%  iostat.cpu.system
     93.00            +1.1%      94.00        vmstat.cpu.id
     20606 ±  5%     +97.4%      40682 ± 34%  vmstat.system.cs
   4237881 ± 29%    +321.3%   17854499 ±106%  cpuidle.C1.time
     44169 ± 32%    +396.5%     219318 ±113%  cpuidle.C1.usage
     72208 ±  6%    +191.8%     210719 ± 39%  cpuidle.POLL.time
     36708 ±  9%    +306.0%     149034 ± 48%  cpuidle.POLL.usage
      2157 ±  2%     +13.8%       2454 ±  6%  slabinfo.biovec-max.active_objs
      2271 ±  2%     +12.8%       2561 ±  5%  slabinfo.biovec-max.num_objs
      6624           +13.4%       7512 ±  5%  slabinfo.btrfs_delayed_node.active_objs
      6768           +12.1%       7589 ±  5%  slabinfo.btrfs_delayed_node.num_objs
     47815 ± 86%     -89.2%       5149 ±  6%  sched_debug.cfs_rq:/.load.stddev
    142.69 ± 15%     +66.8%     237.99 ± 13%  sched_debug.cfs_rq:/.load_avg.avg
      1598 ± 30%     +92.4%       3075 ± 20%  sched_debug.cfs_rq:/.load_avg.max
    383.38 ± 21%     +90.4%     730.10 ± 13%  sched_debug.cfs_rq:/.load_avg.stddev
     37712 ± 16%     -27.1%      27508 ± 16%  sched_debug.cfs_rq:/.min_vruntime.avg
     58.89 ± 49%     +57.8%      92.96 ±  5%  sched_debug.cfs_rq:/.util_est_enqueued.stddev
      1901 ±  4%      -8.3%       1743 ±  3%  proc-vmstat.nr_active_anon
   5055884            +3.2%    5219972        proc-vmstat.nr_file_pages
   4788961 ±  2%      +3.4%    4953360        proc-vmstat.nr_inactive_file
     36576 ±  2%      -4.1%      35069        proc-vmstat.nr_kernel_stack
      9756            -1.6%       9601        proc-vmstat.nr_mapped
      5205 ±  2%      -5.1%       4939        proc-vmstat.nr_shmem
      1901 ±  4%      -8.3%       1743 ±  3%  proc-vmstat.nr_zone_active_anon
   4788961 ±  2%      +3.4%    4953360        proc-vmstat.nr_zone_inactive_file
      4864 ±  5%      -8.2%       4465        proc-vmstat.pgactivate
 2.507e+09            -8.7%   2.29e+09        perf-stat.i.branch-instructions
  71942409            -5.3%   68125925 ±  2%  perf-stat.i.cache-misses
 4.372e+10           -18.5%  3.565e+10 ±  3%  perf-stat.i.cpu-cycles
 1.231e+10            -7.5%  1.138e+10        perf-stat.i.instructions
      2515 ±  8%     -11.4%       2228 ±  5%  perf-stat.i.instructions-per-iTLB-miss
      0.23           -17.3%       0.19 ±  3%  perf-stat.i.metric.GHz
      0.71 ± 19%     +37.6%       0.97 ± 17%  perf-stat.i.metric.K/sec
     35.43            -6.2%      33.24 ±  2%  perf-stat.i.metric.M/sec
      3.58            -8.5%       3.28        perf-stat.overall.cpi
    612.62            -9.0%     557.36 ±  3%  perf-stat.overall.cycles-between-cache-misses
      2610 ±  6%     -11.9%       2299 ±  4%  perf-stat.overall.instructions-per-iTLB-miss
      0.28            +9.3%       0.31        perf-stat.overall.ipc
  2.39e+09 ±  2%      -6.3%  2.239e+09        perf-stat.ps.branch-instructions
  68534728 ±  2%      -4.6%   65359103 ±  2%  perf-stat.ps.cache-misses
 4.199e+10 ±  2%     -13.3%  3.641e+10        perf-stat.ps.cpu-cycles
 1.173e+10 ±  2%      -5.3%  1.112e+10        perf-stat.ps.instructions
    226946 ± 18%     +44.8%     328685 ± 19%  numa-meminfo.node0.AnonPages
    249710 ± 18%     +41.9%     354296 ± 20%  numa-meminfo.node0.AnonPages.max
    233745 ± 17%     +42.0%     331984 ± 20%  numa-meminfo.node0.Inactive(anon)
      8680 ± 73%    +135.9%      20474 ± 28%  numa-meminfo.node0.PageTables
    250653 ±  2%      +6.8%     267784 ±  5%  numa-meminfo.node0.Unevictable
   2890241 ±  9%     -58.3%    1206212 ± 10%  numa-meminfo.node0.Writeback
    338312 ± 83%    +111.9%     716979 ± 64%  numa-meminfo.node1.Dirty
      8444 ± 82%    +169.8%      22785 ± 37%  numa-meminfo.node2.AnonHugePages
    236489 ± 16%    +142.5%     573382 ± 36%  numa-meminfo.node2.Dirty
   2484617 ± 13%    +117.6%    5406277 ± 50%  numa-meminfo.node2.FilePages
   2459002 ± 19%    +115.1%    5290461 ± 51%  numa-meminfo.node2.Inactive
   2216704 ± 16%    +131.7%    5135596 ± 53%  numa-meminfo.node2.Inactive(file)
    871702 ± 16%     +52.8%    1331684 ±  6%  numa-meminfo.node2.Writeback
      3888 ± 16%     -24.5%       2937 ±  6%  numa-meminfo.node3.Active(anon)
     57796 ± 19%     -49.7%      29078 ± 34%  numa-meminfo.node3.KReclaimable
     57796 ± 19%     -49.7%      29078 ± 34%  numa-meminfo.node3.SReclaimable
    130240 ±  7%     -30.1%      91037 ± 15%  numa-meminfo.node3.Slab
     56616 ± 18%     +45.2%      82199 ± 19%  numa-vmstat.node0.nr_anon_pages
     58314 ± 17%     +42.4%      83024 ± 20%  numa-vmstat.node0.nr_inactive_anon
      2172 ± 73%    +135.7%       5122 ± 28%  numa-vmstat.node0.nr_page_table_pages
     62662 ±  2%      +6.8%      66945 ±  5%  numa-vmstat.node0.nr_unevictable
    738434 ±  9%     -59.3%     300391 ± 10%  numa-vmstat.node0.nr_writeback
     58312 ± 17%     +42.4%      83023 ± 20%  numa-vmstat.node0.nr_zone_inactive_anon
     62662 ±  2%      +6.8%      66945 ±  5%  numa-vmstat.node0.nr_zone_unevictable
    926442 ±  6%     -53.1%     434282 ± 12%  numa-vmstat.node0.nr_zone_write_pending
     84324 ± 81%    +112.0%     178764 ± 64%  numa-vmstat.node1.nr_dirty
    427046 ± 84%     +74.8%     746464 ± 46%  numa-vmstat.node1.nr_zone_write_pending
    627042 ± 18%    +129.0%    1435752 ± 55%  numa-vmstat.node2.nr_dirtied
     59963 ± 18%    +137.8%     142585 ± 36%  numa-vmstat.node2.nr_dirty
    634106 ± 15%    +112.1%    1345185 ± 50%  numa-vmstat.node2.nr_file_pages
    567086 ± 17%    +125.3%    1277507 ± 53%  numa-vmstat.node2.nr_inactive_file
    223407 ± 17%     +48.3%     331299 ±  7%  numa-vmstat.node2.nr_writeback
    567087 ± 17%    +125.3%    1277515 ± 53%  numa-vmstat.node2.nr_zone_inactive_file
    283372 ± 17%     +67.2%     473906 ±  8%  numa-vmstat.node2.nr_zone_write_pending
    922.00 ± 19%     -20.0%     737.25 ±  6%  numa-vmstat.node3.nr_active_anon
     14591 ± 20%     -50.1%       7276 ± 35%  numa-vmstat.node3.nr_slab_reclaimable
    922.00 ± 19%     -20.0%     737.25 ±  6%  numa-vmstat.node3.nr_zone_active_anon
    141466 ± 17%    +523.4%     881964 ± 85%  numa-vmstat.node3.numa_other
    282981 ± 14%     +46.6%     414731 ± 12%  interrupts.CAL:Function_call_interrupts
    770.00 ± 51%     -56.5%     335.00 ± 59%  interrupts.CPU0.NMI:Non-maskable_interrupts
    770.00 ± 51%     -56.5%     335.00 ± 59%  interrupts.CPU0.PMI:Performance_monitoring_interrupts
    346.25 ± 15%     -46.7%     184.50 ± 23%  interrupts.CPU110.NMI:Non-maskable_interrupts
    346.25 ± 15%     -46.7%     184.50 ± 23%  interrupts.CPU110.PMI:Performance_monitoring_interrupts
      1359 ± 14%     +31.7%       1789 ± 17%  interrupts.CPU128.CAL:Function_call_interrupts
      1378 ± 14%   +2294.9%      33020 ±164%  interrupts.CPU153.CAL:Function_call_interrupts
    214.50 ± 34%     +91.7%     411.25 ± 39%  interrupts.CPU16.NMI:Non-maskable_interrupts
    214.50 ± 34%     +91.7%     411.25 ± 39%  interrupts.CPU16.PMI:Performance_monitoring_interrupts
    403.50 ± 35%     -40.3%     241.00 ±  8%  interrupts.CPU169.NMI:Non-maskable_interrupts
    403.50 ± 35%     -40.3%     241.00 ±  8%  interrupts.CPU169.PMI:Performance_monitoring_interrupts
    439.00 ± 50%     -51.5%     212.75 ± 22%  interrupts.CPU170.NMI:Non-maskable_interrupts
    439.00 ± 50%     -51.5%     212.75 ± 22%  interrupts.CPU170.PMI:Performance_monitoring_interrupts
      1062 ±116%     -77.3%     240.75 ± 23%  interrupts.CPU174.NMI:Non-maskable_interrupts
      1062 ±116%     -77.3%     240.75 ± 23%  interrupts.CPU174.PMI:Performance_monitoring_interrupts
    291.25 ± 31%     -34.2%     191.75 ± 30%  interrupts.CPU2.NMI:Non-maskable_interrupts
    291.25 ± 31%     -34.2%     191.75 ± 30%  interrupts.CPU2.PMI:Performance_monitoring_interrupts
    342.00 ± 43%     -58.3%     142.75 ± 16%  interrupts.CPU22.NMI:Non-maskable_interrupts
    342.00 ± 43%     -58.3%     142.75 ± 16%  interrupts.CPU22.PMI:Performance_monitoring_interrupts
    241.25 ± 22%     -42.6%     138.50 ± 22%  interrupts.CPU23.NMI:Non-maskable_interrupts
    241.25 ± 22%     -42.6%     138.50 ± 22%  interrupts.CPU23.PMI:Performance_monitoring_interrupts
      1365 ± 14%     +49.5%       2041 ± 24%  interrupts.CPU35.CAL:Function_call_interrupts
    222.00 ± 94%    +175.1%     610.75 ± 51%  interrupts.CPU63.NMI:Non-maskable_interrupts
    222.00 ± 94%    +175.1%     610.75 ± 51%  interrupts.CPU63.PMI:Performance_monitoring_interrupts
    146.75 ± 39%    +104.6%     300.25 ± 32%  interrupts.CPU64.NMI:Non-maskable_interrupts
    146.75 ± 39%    +104.6%     300.25 ± 32%  interrupts.CPU64.PMI:Performance_monitoring_interrupts
    557.00 ± 88%     -61.3%     215.75 ± 21%  interrupts.CPU74.NMI:Non-maskable_interrupts
    557.00 ± 88%     -61.3%     215.75 ± 21%  interrupts.CPU74.PMI:Performance_monitoring_interrupts
    319.25 ± 22%     -43.9%     179.25 ± 36%  interrupts.CPU75.NMI:Non-maskable_interrupts
    319.25 ± 22%     -43.9%     179.25 ± 36%  interrupts.CPU75.PMI:Performance_monitoring_interrupts
      1343 ± 15%     +23.0%       1652 ±  4%  interrupts.CPU80.CAL:Function_call_interrupts
      1095 ± 66%     -76.5%     257.00 ± 17%  interrupts.CPU96.NMI:Non-maskable_interrupts
      1095 ± 66%     -76.5%     257.00 ± 17%  interrupts.CPU96.PMI:Performance_monitoring_interrupts
    817.25 ± 12%     +62.6%       1329 ± 17%  interrupts.RES:Rescheduling_interrupts
     12.39 ± 23%      -8.2        4.22 ±  6%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe
     12.35 ± 24%      -8.1        4.21 ±  6%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe
     12.12 ± 24%      -8.0        4.08 ±  6%  perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
     12.11 ± 24%      -8.0        4.07 ±  6%  perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
     12.10 ± 24%      -8.0        4.06 ±  6%  perf-profile.calltrace.cycles-pp.btrfs_file_write_iter.new_sync_write.vfs_write.ksys_write.do_syscall_64
     12.07 ± 24%      -8.0        4.04 ±  6%  perf-profile.calltrace.cycles-pp.btrfs_buffered_write.btrfs_file_write_iter.new_sync_write.vfs_write.ksys_write
     12.28 ± 24%      -7.5        4.83 ±  5%  perf-profile.calltrace.cycles-pp.new_sync_write.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe
      6.07 ± 31%      -6.1        0.00        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock.__reserve_bytes.btrfs_reserve_metadata_bytes.btrfs_delalloc_reserve_metadata
      6.36 ± 30%      -4.8        1.55 ± 11%  perf-profile.calltrace.cycles-pp.btrfs_delalloc_reserve_metadata.btrfs_buffered_write.btrfs_file_write_iter.new_sync_write.vfs_write
      6.31 ± 30%      -4.8        1.51 ± 12%  perf-profile.calltrace.cycles-pp.__reserve_bytes.btrfs_reserve_metadata_bytes.btrfs_delalloc_reserve_metadata.btrfs_buffered_write.btrfs_file_write_iter
      6.31 ± 30%      -4.8        1.51 ± 12%  perf-profile.calltrace.cycles-pp.btrfs_reserve_metadata_bytes.btrfs_delalloc_reserve_metadata.btrfs_buffered_write.btrfs_file_write_iter.new_sync_write
      6.13 ± 31%      -4.8        1.38 ± 14%  perf-profile.calltrace.cycles-pp._raw_spin_lock.__reserve_bytes.btrfs_reserve_metadata_bytes.btrfs_delalloc_reserve_metadata.btrfs_buffered_write
      4.39 ± 18%      -3.2        1.21 ± 15%  perf-profile.calltrace.cycles-pp.btrfs_inode_rsv_release.btrfs_buffered_write.btrfs_file_write_iter.new_sync_write.vfs_write
      4.38 ± 18%      -3.2        1.20 ± 15%  perf-profile.calltrace.cycles-pp.btrfs_block_rsv_release.btrfs_inode_rsv_release.btrfs_buffered_write.btrfs_file_write_iter.new_sync_write
      4.21 ± 19%      -3.1        1.09 ± 20%  perf-profile.calltrace.cycles-pp._raw_spin_lock.btrfs_block_rsv_release.btrfs_inode_rsv_release.btrfs_buffered_write.btrfs_file_write_iter
      0.13 ±173%      +0.5        0.63 ±  6%  perf-profile.calltrace.cycles-pp.tick_nohz_irq_exit.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
      4.51 ±  6%      +0.5        5.04 ±  3%  perf-profile.calltrace.cycles-pp.asm_call_sysvec_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
      4.48 ±  5%      +0.5        5.02 ±  3%  perf-profile.calltrace.cycles-pp.__sysvec_apic_timer_interrupt.asm_call_sysvec_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
      4.37 ±  6%      +0.6        4.93 ±  3%  perf-profile.calltrace.cycles-pp.hrtimer_interrupt.__sysvec_apic_timer_interrupt.asm_call_sysvec_on_stack.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
      0.18 ±173%      +0.6        0.75 ± 27%  perf-profile.calltrace.cycles-pp.console_unlock.vprintk_emit.devkmsg_emit.devkmsg_write.cold.new_sync_write
      0.18 ±173%      +0.6        0.76 ± 27%  perf-profile.calltrace.cycles-pp.vprintk_emit.devkmsg_emit.devkmsg_write.cold.new_sync_write.vfs_write
      0.18 ±173%      +0.6        0.76 ± 27%  perf-profile.calltrace.cycles-pp.devkmsg_write.cold.new_sync_write.vfs_write.ksys_write.do_syscall_64
      0.18 ±173%      +0.6        0.76 ± 27%  perf-profile.calltrace.cycles-pp.devkmsg_emit.devkmsg_write.cold.new_sync_write.vfs_write.ksys_write
      0.18 ±173%      +0.6        0.77 ± 27%  perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.write
      0.18 ±173%      +0.6        0.77 ± 27%  perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
      0.18 ±173%      +0.6        0.77 ± 27%  perf-profile.calltrace.cycles-pp.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
      0.18 ±173%      +0.6        0.77 ± 27%  perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
      0.18 ±173%      +0.6        0.77 ± 27%  perf-profile.calltrace.cycles-pp.write
      0.00            +0.6        0.62 ±  7%  perf-profile.calltrace.cycles-pp.ktime_get.tick_nohz_irq_exit.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
      0.00            +1.1        1.06 ± 21%  perf-profile.calltrace.cycles-pp.__cna_queued_spin_lock_slowpath._raw_spin_lock.btrfs_block_rsv_release.btrfs_inode_rsv_release.btrfs_buffered_write
      7.48 ±  5%      +1.2        8.68 ±  2%  perf-profile.calltrace.cycles-pp.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.do_idle
      0.00            +1.3        1.33 ± 15%  perf-profile.calltrace.cycles-pp.__cna_queued_spin_lock_slowpath._raw_spin_lock.__reserve_bytes.btrfs_reserve_metadata_bytes.btrfs_delalloc_reserve_metadata
     11.79 ±  8%      +2.8       14.63 ±  8%  perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter.do_idle.cpu_startup_entry
      9.11 ± 14%      +4.4       13.51 ± 18%  perf-profile.calltrace.cycles-pp.menu_select.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
     10.58 ± 26%     -10.6        0.00        perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
     12.10 ± 24%      -8.0        4.06 ±  6%  perf-profile.children.cycles-pp.btrfs_file_write_iter
     12.07 ± 24%      -8.0        4.04 ±  6%  perf-profile.children.cycles-pp.btrfs_buffered_write
     14.00 ± 18%      -7.9        6.09 ±  7%  perf-profile.children.cycles-pp._raw_spin_lock
     12.64 ± 23%      -7.8        4.85 ±  5%  perf-profile.children.cycles-pp.ksys_write
     12.62 ± 24%      -7.8        4.83 ±  5%  perf-profile.children.cycles-pp.new_sync_write
     12.63 ± 23%      -7.8        4.85 ±  5%  perf-profile.children.cycles-pp.vfs_write
     13.17 ± 23%      -7.6        5.60 ±  6%  perf-profile.children.cycles-pp.do_syscall_64
     13.24 ± 23%      -7.3        5.92 ±  6%  perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
      6.38 ± 30%      -4.8        1.55 ± 12%  perf-profile.children.cycles-pp.__reserve_bytes
      6.33 ± 30%      -4.8        1.52 ± 12%  perf-profile.children.cycles-pp.btrfs_reserve_metadata_bytes
      6.36 ± 30%      -4.8        1.55 ± 11%  perf-profile.children.cycles-pp.btrfs_delalloc_reserve_metadata
      4.49 ± 18%      -3.3        1.23 ± 15%  perf-profile.children.cycles-pp.btrfs_block_rsv_release
      4.39 ± 18%      -3.2        1.21 ± 15%  perf-profile.children.cycles-pp.btrfs_inode_rsv_release
      0.16 ± 77%      -0.1        0.06 ±  7%  perf-profile.children.cycles-pp.do_filp_open
      0.16 ± 77%      -0.1        0.06 ±  7%  perf-profile.children.cycles-pp.path_openat
      0.16 ± 75%      -0.1        0.06 ±  6%  perf-profile.children.cycles-pp.do_sys_open
      0.16 ± 75%      -0.1        0.06 ±  6%  perf-profile.children.cycles-pp.do_sys_openat2
      0.21 ± 32%      -0.1        0.13 ± 29%  perf-profile.children.cycles-pp.rcu_sched_clock_irq
      0.22 ±  9%      -0.1        0.16 ±  6%  perf-profile.children.cycles-pp.update_blocked_averages
      0.10 ± 29%      -0.0        0.05 ± 70%  perf-profile.children.cycles-pp.update_ts_time_stats
      0.07 ± 12%      -0.0        0.04 ± 58%  perf-profile.children.cycles-pp.__pagevec_lru_add_fn
      0.16 ±  9%      +0.0        0.20 ± 12%  perf-profile.children.cycles-pp.brd_lookup_page
      0.13 ± 11%      +0.0        0.17 ± 18%  perf-profile.children.cycles-pp.__radix_tree_lookup
      0.14 ±  7%      +0.1        0.20 ±  9%  perf-profile.children.cycles-pp.__intel_pmu_enable_all
      0.48 ±  8%      +0.1        0.62 ±  8%  perf-profile.children.cycles-pp._raw_spin_lock_irqsave
      0.51 ± 29%      +0.3        0.77 ± 27%  perf-profile.children.cycles-pp.write
      0.39 ± 32%      +0.3        0.65 ±  6%  perf-profile.children.cycles-pp.tick_nohz_irq_exit
      0.50 ± 30%      +0.3        0.76 ± 27%  perf-profile.children.cycles-pp.devkmsg_write.cold
      0.50 ± 30%      +0.3        0.76 ± 27%  perf-profile.children.cycles-pp.devkmsg_emit
      0.03 ±173%      +0.3        0.29 ±108%  perf-profile.children.cycles-pp.osq_lock
      0.03 ±173%      +0.3        0.29 ±107%  perf-profile.children.cycles-pp.__mutex_lock
      0.11 ±130%      +0.3        0.45 ± 68%  perf-profile.children.cycles-pp.__do_sys_finit_module
      0.11 ±130%      +0.3        0.45 ± 68%  perf-profile.children.cycles-pp.load_module
      0.11 ±130%      +0.3        0.45 ± 67%  perf-profile.children.cycles-pp.syscall
      0.14 ± 88%      +0.4        0.54 ± 53%  perf-profile.children.cycles-pp.wb_workfn
      0.14 ± 88%      +0.4        0.54 ± 53%  perf-profile.children.cycles-pp.wb_writeback
      0.14 ± 88%      +0.4        0.54 ± 53%  perf-profile.children.cycles-pp.writeback_sb_inodes
      0.12 ±105%      +0.4        0.53 ± 54%  perf-profile.children.cycles-pp.__writeback_single_inode
      4.76 ±  5%      +0.5        5.21 ±  3%  perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      4.65 ±  5%      +0.5        5.13 ±  2%  perf-profile.children.cycles-pp.hrtimer_interrupt
      6.15 ±  5%      +0.5        6.65 ±  2%  perf-profile.children.cycles-pp.asm_call_sysvec_on_stack
      2.06 ± 28%      +1.0        3.05 ±  9%  perf-profile.children.cycles-pp.ktime_get
      7.80 ±  4%      +1.1        8.94 ±  2%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
     10.10 ±  6%      +2.0       12.05 ±  6%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.00            +2.6        2.56 ± 19%  perf-profile.children.cycles-pp.__cna_queued_spin_lock_slowpath
      9.16 ± 14%      +4.4       13.57 ± 18%  perf-profile.children.cycles-pp.menu_select
     10.50 ± 26%     -10.5        0.00        perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
      0.18 ± 30%      -0.1        0.11 ± 27%  perf-profile.self.cycles-pp.rcu_sched_clock_irq
      0.10 ± 11%      +0.0        0.13 ± 10%  perf-profile.self.cycles-pp.__extent_writepage
      0.14 ±  7%      +0.1        0.20 ±  9%  perf-profile.self.cycles-pp.__intel_pmu_enable_all
      0.08 ± 10%      +0.1        0.16 ± 22%  perf-profile.self.cycles-pp.end_page_writeback
      0.03 ±173%      +0.2        0.28 ±105%  perf-profile.self.cycles-pp.osq_lock
      1.77 ± 32%      +1.0        2.75 ± 10%  perf-profile.self.cycles-pp.ktime_get
      0.00            +2.5        2.51 ± 19%  perf-profile.self.cycles-pp.__cna_queued_spin_lock_slowpath
      7.68 ± 15%      +4.1       11.79 ± 21%  perf-profile.self.cycles-pp.menu_select
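
The rows above are consistent with two kinds of entries in the %change column: values ending in "%" are relative changes against the base commit, while values without "%" (for example mpstat.cpu.all.sys% and the perf-profile rows) are plain differences between two percentages. A minimal sketch for parsing a row and recomputing its change follows. The regular expression and the function names are illustrative assumptions, not part of lkp-tests.

```python
import re

# One comparison row: base value, optional "± N%" spread, change
# (relative if it ends in "%"), new value, optional spread, metric name.
ROW = re.compile(
    r"^\s*(?P<base>[\d.e+]+)(?:\s*±\s*(?P<base_sd>\d+)%)?"
    r"\s+(?P<change>[+-][\d.]+)(?P<pct>%?)"
    r"\s+(?P<new>[\d.e+]+)(?:\s*±\s*(?P<new_sd>\d+)%)?"
    r"\s+(?P<metric>\S+)\s*$"
)

def parse_row(line):
    """Return the fields of one comparison row, or None if it does not match."""
    m = ROW.match(line)
    if m is None:
        return None
    return {
        "metric": m["metric"],
        "base": float(m["base"]),
        "new": float(m["new"]),
        "reported": float(m["change"]),
        "relative": m["pct"] == "%",
    }

def recompute_change(row):
    """Relative change in percent, or absolute difference for non-% rows."""
    delta = row["new"] - row["base"]
    if row["relative"]:
        return 100.0 * delta / row["base"]
    return delta

rows = [
    "      1180 ±  2%    +213.9%       3706 ± 11%  fsmark.files_per_sec",
    "      4.83 ±  4%      -1.0        3.83 ±  5%  mpstat.cpu.all.sys%",
]
for line in rows:
    row = parse_row(line)
    print(row["metric"], row["reported"], round(recompute_change(row), 1))
```

For the fsmark row, the value recomputed from the printed means is about 214.1%, slightly above the reported 213.9%. The gap is presumably because the report is computed from unrounded per-run values, but that is an assumption.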


                                                                                
                               fsmark.files_per_sec                             
                                                                                
  4500 +--------------------------------------------------------------------+   
       |    O  O  O    O       O  O                                         |   
  4000 |-+O         O                                                       |   
  3500 |-+                O                    O    O    O     O O          |   
       |                    O             O O    O     O                    |   
  3000 |-+                             O                                    |   
  2500 |-+                                                                  |   
       |                                                                    |   
  2000 |-+                                                                  |   
  1500 |-+                                                                  |   
       |.. .+..+..+.+..+..+.+..+..+.+..+..+.+..+.+..+..+.+..+..+.+..+..+.+..|   
  1000 |-++                                                                 |   
   500 |-+                                                                  |   
       |                                                                    |   
     0 +--------------------------------------------------------------------+   
                                                                                
                                                                                
[*] bisect-good sample
[O] bisect-bad  sample

***************************************************************************************************
lkp-cpl-4sp1: 144 threads Intel(R) Xeon(R) Gold 5318H CPU @ 2.50GHz with 128G memory




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Oliver Sang


View attachment "config-5.10.0-rc6-00085-g0e8d8f4f1214" of type "text/plain" (171080 bytes)

View attachment "job-script" of type "text/plain" (8165 bytes)

View attachment "job.yaml" of type "text/plain" (5639 bytes)

View attachment "reproduce" of type "text/plain" (1433 bytes)
