lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <202309182202.4bab967a-oliver.sang@intel.com>
Date:   Mon, 18 Sep 2023 23:07:45 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Tejun Heo <tj@...nel.org>
CC:     <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
        <linux-kernel@...r.kernel.org>,
        Lai Jiangshan <jiangshanlai@...il.com>, <ying.huang@...el.com>,
        <feng.tang@...el.com>, <fengwei.yin@...el.com>,
        <oliver.sang@...el.com>
Subject: [linus:master] [workqueue]  fe089f87cc:
 fxmark.ssd_xfs_dbench_client_72_bufferedio.works/sec -32.0% regression



Hello,

kernel test robot noticed a -32.0% regression of fxmark.ssd_xfs_dbench_client_72_bufferedio.works/sec on:


commit: fe089f87cccb066e8ad20f49ddf05e95adc1fa8d ("workqueue: Not all work insertion needs to wake up a worker")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

testcase: fxmark
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
parameters:

	disk: 1SSD
	media: ssd
	test: dbench_client
	fstype: xfs
	directio: bufferedio
	cpufreq_governor: performance




If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202309182202.4bab967a-oliver.sang@intel.com


Details are as below:
-------------------------------------------------------------------------------------------------->


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20230918/202309182202.4bab967a-oliver.sang@intel.com

=========================================================================================
compiler/cpufreq_governor/directio/disk/fstype/kconfig/media/rootfs/tbox_group/test/testcase:
  gcc-12/performance/bufferedio/1SSD/xfs/x86_64-rhel-8.3/ssd/debian-11.1-x86_64-20220510.cgz/lkp-icl-2sp5/dbench_client/fxmark

commit: 
  c0ab017d43 ("workqueue: Cleanups around process_scheduled_works()")
  fe089f87cc ("workqueue: Not all work insertion needs to wake up a worker")

c0ab017d43f4c414 fe089f87cccb066e8ad20f49ddf 
---------------- --------------------------- 
         %stddev     %change         %stddev
             \          |                \  
  39209971           -10.4%   35133502 ±  2%  cpuidle..usage
     52.91            +5.1%      55.59        iostat.cpu.idle
     22.75            -9.7%      20.56 ±  2%  iostat.cpu.iowait
  10466069 ±  2%     -29.5%    7376773 ± 11%  numa-numastat.node1.local_node
  10510900 ±  2%     -28.8%    7480808 ± 11%  numa-numastat.node1.numa_hit
      1179 ± 15%     -22.0%     919.83 ±  5%  perf-c2c.DRAM.remote
      1145 ± 14%     -29.3%     809.67 ±  9%  perf-c2c.HITM.local
     46807 ± 21%     -45.8%      25346 ± 50%  numa-meminfo.node0.Active(anon)
    504904 ± 25%     -45.3%     276367 ± 47%  numa-meminfo.node0.Shmem
     66660 ±  8%     -33.3%      44451 ± 18%  numa-meminfo.node1.Dirty
   -828.44           -61.2%    -321.18        sched_debug.cpu.nr_uninterruptible.avg
     -5154           -58.8%      -2121        sched_debug.cpu.nr_uninterruptible.min
    892.09 ±  6%     -49.8%     447.61 ± 21%  sched_debug.cpu.nr_uninterruptible.stddev
    142456 ±  2%      -7.6%     131656 ±  3%  vmstat.io.bo
    124966 ±  2%     -20.0%      99939 ±  4%  vmstat.system.cs
     51184            -8.8%      46683 ±  2%  vmstat.system.in
    124596 ±  2%     -19.9%      99792 ±  4%  perf-stat.i.context-switches
      3454 ±  2%     -31.2%       2377 ± 10%  perf-stat.i.cpu-migrations
    124479 ±  2%     -19.9%      99650 ±  4%  perf-stat.ps.context-switches
      3453 ±  2%     -31.2%       2374 ± 10%  perf-stat.ps.cpu-migrations
    131582 ± 15%     -23.7%     100437 ± 14%  meminfo.Active
     59138 ± 34%     -52.4%      28159 ± 49%  meminfo.Active(anon)
   1088512 ± 22%     -30.6%     755813 ± 19%  meminfo.Committed_AS
    380805           -11.5%     337183 ±  3%  meminfo.Dirty
   1235425 ± 18%     -24.4%     933365 ± 14%  meminfo.Inactive
    902090 ± 24%     -33.8%     596946 ± 21%  meminfo.Inactive(anon)
    642365 ± 37%     -52.0%     308283 ± 46%  meminfo.Shmem
     11701 ± 21%     -45.9%       6336 ± 50%  numa-vmstat.node0.nr_active_anon
    126226 ± 25%     -45.3%      69093 ± 47%  numa-vmstat.node0.nr_shmem
     11701 ± 21%     -45.9%       6336 ± 50%  numa-vmstat.node0.nr_zone_active_anon
  10617306 ±  2%     -29.5%    7484793 ± 12%  numa-vmstat.node1.nr_dirtied
     16646 ±  8%     -33.2%      11111 ± 18%  numa-vmstat.node1.nr_dirty
   2570325 ±  2%     -28.0%    1849961 ± 12%  numa-vmstat.node1.nr_written
     16619 ±  8%     -33.2%      11093 ± 18%  numa-vmstat.node1.nr_zone_write_pending
  10510971 ±  2%     -28.8%    7480700 ± 11%  numa-vmstat.node1.numa_hit
  10466140 ±  2%     -29.5%    7376665 ± 11%  numa-vmstat.node1.numa_local
     14789 ± 34%     -52.4%       7043 ± 48%  proc-vmstat.nr_active_anon
  73542893           -12.1%   64665529 ±  4%  proc-vmstat.nr_dirtied
     95176           -11.2%      84481 ±  3%  proc-vmstat.nr_dirty
    949397 ±  6%      -8.7%     866798 ±  4%  proc-vmstat.nr_file_pages
    225533 ± 24%     -33.8%     149239 ± 21%  proc-vmstat.nr_inactive_anon
    160600 ± 37%     -52.0%      77072 ± 46%  proc-vmstat.nr_shmem
  17820341           -10.9%   15872978 ±  3%  proc-vmstat.nr_written
     14789 ± 34%     -52.4%       7043 ± 48%  proc-vmstat.nr_zone_active_anon
    225533 ± 24%     -33.8%     149239 ± 21%  proc-vmstat.nr_zone_inactive_anon
     94869           -11.3%      84180 ±  3%  proc-vmstat.nr_zone_write_pending
  72272965           -11.9%   63654005 ±  4%  proc-vmstat.numa_hit
  72137253           -11.9%   63522521 ±  4%  proc-vmstat.numa_local
   1532444 ±  4%     -15.8%    1290917 ±  4%  proc-vmstat.pgactivate
  72444739           -11.9%   63805246 ±  4%  proc-vmstat.pgalloc_normal
    170396 ± 37%     -52.2%      81460 ± 46%  proc-vmstat.pgdeactivate
  72261159           -11.8%   63709587 ±  4%  proc-vmstat.pgfree
  84054518            -7.2%   77981052 ±  2%  proc-vmstat.pgpgout
      2064           +20.2%       2480 ±  7%  fxmark.ssd_xfs_dbench_client_54_bufferedio.idle_sec
     63.92           +19.5%      76.41 ±  7%  fxmark.ssd_xfs_dbench_client_54_bufferedio.idle_util
    743.56 ±  2%     -42.5%     427.65 ± 33%  fxmark.ssd_xfs_dbench_client_54_bufferedio.iowait_sec
     23.03 ±  2%     -42.8%      13.18 ± 33%  fxmark.ssd_xfs_dbench_client_54_bufferedio.iowait_util
     41.98           +32.2%      55.50 ± 12%  fxmark.ssd_xfs_dbench_client_54_bufferedio.irq_sec
      1.30           +31.5%       1.71 ± 12%  fxmark.ssd_xfs_dbench_client_54_bufferedio.irq_util
     10.09           -17.7%       8.31 ± 11%  fxmark.ssd_xfs_dbench_client_54_bufferedio.softirq_sec
      0.31           -18.1%       0.26 ± 11%  fxmark.ssd_xfs_dbench_client_54_bufferedio.softirq_util
    297.40           -26.7%     218.07 ± 19%  fxmark.ssd_xfs_dbench_client_54_bufferedio.sys_sec
      9.21           -27.0%       6.72 ± 19%  fxmark.ssd_xfs_dbench_client_54_bufferedio.sys_util
     71.92           -22.2%      55.94 ± 15%  fxmark.ssd_xfs_dbench_client_54_bufferedio.user_sec
      2.23           -22.6%       1.72 ± 15%  fxmark.ssd_xfs_dbench_client_54_bufferedio.user_util
      5264           -28.2%       3777 ± 19%  fxmark.ssd_xfs_dbench_client_54_bufferedio.works/sec
      2971           +19.2%       3543        fxmark.ssd_xfs_dbench_client_72_bufferedio.idle_sec
     68.77           +18.6%      81.58        fxmark.ssd_xfs_dbench_client_72_bufferedio.idle_util
    910.14 ±  6%     -51.0%     446.38 ± 11%  fxmark.ssd_xfs_dbench_client_72_bufferedio.iowait_sec
     21.06 ±  6%     -51.2%      10.28 ± 11%  fxmark.ssd_xfs_dbench_client_72_bufferedio.iowait_util
     76.95           +31.1%     100.90 ±  2%  fxmark.ssd_xfs_dbench_client_72_bufferedio.irq_sec
      1.78           +30.4%       2.32 ±  2%  fxmark.ssd_xfs_dbench_client_72_bufferedio.irq_util
     11.05           -15.6%       9.33 ±  2%  fxmark.ssd_xfs_dbench_client_72_bufferedio.softirq_sec
      0.26           -16.0%       0.21 ±  2%  fxmark.ssd_xfs_dbench_client_72_bufferedio.softirq_util
    281.46 ±  2%     -31.1%     193.89 ±  5%  fxmark.ssd_xfs_dbench_client_72_bufferedio.sys_sec
      6.51 ±  2%     -31.5%       4.46 ±  5%  fxmark.ssd_xfs_dbench_client_72_bufferedio.sys_util
     70.03 ±  3%     -29.2%      49.56 ±  5%  fxmark.ssd_xfs_dbench_client_72_bufferedio.user_sec
      1.62 ±  3%     -29.6%       1.14 ±  5%  fxmark.ssd_xfs_dbench_client_72_bufferedio.user_util
      4529 ±  2%     -32.0%       3080 ±  5%  fxmark.ssd_xfs_dbench_client_72_bufferedio.works/sec
 5.884e+08           -12.1%  5.173e+08 ±  4%  fxmark.time.file_system_outputs
   8226682           -26.2%    6068375 ±  7%  fxmark.time.involuntary_context_switches
    204845            -1.2%     202350        fxmark.time.minor_page_faults
    209.83           -16.0%     176.17 ±  5%  fxmark.time.percent_of_cpu_this_job_got
    943.47           -17.0%     783.39 ±  5%  fxmark.time.system_time
    299.18           -11.7%     264.25 ±  3%  fxmark.time.user_time
  18576211           -16.6%   15496778 ±  5%  fxmark.time.voluntary_context_switches
      0.01 ± 41%     -90.9%       0.00 ±223%  perf-sched.sch_delay.avg.ms.__cond_resched.__mutex_lock.constprop.0.__flush_workqueue
      0.01 ± 89%    +490.2%       0.05 ± 42%  perf-sched.sch_delay.avg.ms.__cond_resched.down.xlog_write_iclog.xlog_state_release_iclog.xlog_force_lsn
      0.03 ± 54%     +97.5%       0.05 ± 30%  perf-sched.sch_delay.avg.ms.__cond_resched.down_read.xfs_can_free_eofblocks.xfs_release.__fput
      0.01 ± 30%     -93.2%       0.00 ±223%  perf-sched.sch_delay.max.ms.__cond_resched.__mutex_lock.constprop.0.__flush_workqueue
      0.41 ± 18%     -49.0%       0.21 ± 56%  perf-sched.sch_delay.max.ms.__cond_resched.down_read.xlog_cil_commit.__xfs_trans_commit.xfs_vn_update_time
    770542 ± 44%     -58.4%     320650 ± 55%  perf-sched.total_wait_and_delay.count.ms
      0.46 ± 59%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.__cond_resched.dput.do_unlinkat.__x64_sys_unlink.do_syscall_64
    555.61 ± 19%     +28.0%     711.18 ±  8%  perf-sched.wait_and_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
    177.28 ±  8%     +18.7%     210.40 ±  7%  perf-sched.wait_and_delay.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
      0.67 ± 12%     +38.1%       0.93 ± 23%  perf-sched.wait_and_delay.avg.ms.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
     52.59 ± 71%     -98.7%       0.70 ±223%  perf-sched.wait_and_delay.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_killable.__kthread_create_on_node
      1.80 ± 28%    +194.2%       5.28 ± 31%  perf-sched.wait_and_delay.avg.ms.worker_thread.kthread.ret_from_fork
      0.67 ± 13%     -43.0%       0.38 ± 56%  perf-sched.wait_and_delay.avg.ms.xlog_force_lsn.xfs_log_force_seq.xfs_file_fsync.__x64_sys_fsync
      4396 ± 49%    -100.0%       0.00        perf-sched.wait_and_delay.count.__cond_resched.dput.do_unlinkat.__x64_sys_unlink.do_syscall_64
     18311 ± 52%     -68.1%       5841 ± 62%  perf-sched.wait_and_delay.count.__cond_resched.dput.path_put.user_statfs.__do_sys_statfs
      5076 ± 44%     -83.9%     817.33 ±162%  perf-sched.wait_and_delay.count.__cond_resched.iomap_write_iter.iomap_file_buffered_write.xfs_file_buffered_write.vfs_write
    120.67 ±  3%     -37.6%      75.33 ± 71%  perf-sched.wait_and_delay.count.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64
    262.17 ± 39%     -51.7%     126.50 ± 51%  perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
      8.42 ± 33%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.__cond_resched.dput.do_unlinkat.__x64_sys_unlink.do_syscall_64
     10.54 ± 10%     +96.1%      20.67 ± 83%  perf-sched.wait_and_delay.max.ms.__cond_resched.dput.path_put.user_statfs.__do_sys_statfs
    554.95 ± 87%     -92.9%      39.45 ±223%  perf-sched.wait_and_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_killable.__kthread_create_on_node
      0.43 ± 41%     -98.3%       0.01 ±223%  perf-sched.wait_time.avg.ms.__cond_resched.__mutex_lock.constprop.0.__flush_workqueue
    555.59 ± 19%     +28.0%     711.16 ±  8%  perf-sched.wait_time.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
    177.24 ±  8%     +18.7%     210.35 ±  7%  perf-sched.wait_time.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64
      0.33 ± 41%    +293.7%       1.28 ± 39%  perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_common_interrupt
      0.56 ± 46%    +195.7%       1.65 ± 57%  perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_call_function_single
      0.66 ± 13%     +37.6%       0.91 ± 22%  perf-sched.wait_time.avg.ms.io_schedule.folio_wait_bit_common.folio_wait_writeback.__filemap_fdatawait_range
     52.58 ± 71%     -95.0%       2.61 ± 91%  perf-sched.wait_time.avg.ms.schedule_timeout.__wait_for_common.wait_for_completion_killable.__kthread_create_on_node
      1.79 ± 28%    +194.6%       5.27 ± 31%  perf-sched.wait_time.avg.ms.worker_thread.kthread.ret_from_fork
      7.29 ± 25%     +39.4%      10.17 ± 12%  perf-sched.wait_time.max.ms.__cond_resched.__alloc_pages.folio_alloc.__filemap_get_folio.iomap_write_begin
      1.32 ± 87%     -81.7%       0.24 ± 80%  perf-sched.wait_time.max.ms.__cond_resched.__filemap_fdatawait_range.file_write_and_wait_range.xfs_file_fsync.__x64_sys_fsync
      5.99 ± 31%     +69.4%      10.14 ± 10%  perf-sched.wait_time.max.ms.__cond_resched.__kmem_cache_alloc_node.__kmalloc.kmem_alloc.xfs_dir_lookup
      0.73 ± 66%     -99.0%       0.01 ±223%  perf-sched.wait_time.max.ms.__cond_resched.__mutex_lock.constprop.0.__flush_workqueue
      8.41 ±  9%     +36.1%      11.44 ±  7%  perf-sched.wait_time.max.ms.__cond_resched.down.xfs_buf_lock.xfs_buf_find_lock.xfs_buf_lookup
      3.42 ± 46%     -72.2%       0.95 ± 80%  perf-sched.wait_time.max.ms.__cond_resched.down_read.xfs_file_fsync.__x64_sys_fsync.do_syscall_64
      6.45 ± 39%     +55.2%      10.00 ± 10%  perf-sched.wait_time.max.ms.__cond_resched.down_read.xfs_ilock.xfs_file_buffered_read.xfs_file_read_iter
      8.75 ± 20%     +37.4%      12.02 ±  8%  perf-sched.wait_time.max.ms.__cond_resched.down_read.xfs_ilock_data_map_shared.xfs_dir_open.do_dentry_open
      7.50 ± 11%     +39.9%      10.49 ± 11%  perf-sched.wait_time.max.ms.__cond_resched.down_read.xlog_cil_commit.__xfs_trans_commit.xfs_create
      3.28 ± 79%     -67.8%       1.06 ± 29%  perf-sched.wait_time.max.ms.__cond_resched.down_read.xlog_cil_commit.__xfs_trans_commit.xfs_iomap_write_unwritten
      6.70 ± 28%     +56.6%      10.49 ±  9%  perf-sched.wait_time.max.ms.__cond_resched.down_write.xfs_ilock.xfs_ilock_for_iomap.xfs_buffered_write_iomap_begin
      8.38 ± 34%     +56.7%      13.13 ± 19%  perf-sched.wait_time.max.ms.__cond_resched.dput.do_unlinkat.__x64_sys_unlink.do_syscall_64
      9.59 ± 14%     +33.7%      12.82 ±  4%  perf-sched.wait_time.max.ms.__cond_resched.dput.open_last_lookups.path_openat.do_filp_open
      9.16 ± 23%     +29.4%      11.86 ±  8%  perf-sched.wait_time.max.ms.__cond_resched.dput.path_put.vfs_statx.vfs_fstatat
      8.25 ± 25%     +41.5%      11.68 ±  7%  perf-sched.wait_time.max.ms.__cond_resched.dput.terminate_walk.path_openat.do_filp_open
      7.87 ± 28%     +45.3%      11.43 ±  9%  perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.alloc_empty_file.path_openat.do_filp_open
      8.36 ± 28%     +43.2%      11.97 ± 15%  perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.getname_flags.part.0
      8.65 ± 11%     +37.6%      11.90 ±  2%  perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc.security_file_alloc.init_file.alloc_empty_file
      7.41 ± 32%     +45.8%      10.81 ± 13%  perf-sched.wait_time.max.ms.__cond_resched.truncate_inode_pages_range.evict.do_unlinkat.__x64_sys_unlink




Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ