lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 8 Nov 2017 15:22:33 +0800
From:   Aaron Lu <aaron.lu@...el.com>
To:     Benjamin Coddington <bcodding@...hat.com>
Cc:     lkp@...org, Linus Torvalds <torvalds@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Jeff Layton <jlayton@...hat.com>, xiaolong.ye@...el.com
Subject: Re: [LKP] [lkp-robot] [fs/locks] 52306e882f:
 stress-ng.lockofd.ops_per_sec -11% regression

On Thu, Sep 28, 2017 at 04:02:23PM +0800, kernel test robot wrote:
> 
> Greeting,
> 
> FYI, we noticed a -11% regression of stress-ng.lockofd.ops_per_sec due to commit:
> 
> 
> commit: 52306e882f77d3fd73f91435c41373d634acc5d2 ("fs/locks: Use allocation rather than the stack in fcntl_getlk()")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

It's been a while, I wonder what do you think of this regression?

The test stresses byte-range locks AFAICS and since the commit uses
dynamic allocation instead of stack for the 'struct file_lock', it sounds
natural the performance regressed for this test.

Now the question is, do we care about the performance regression here?
Feel free to let me know if you need any other data.

Thanks for your time.

Regards,
Aaron

> in testcase: stress-ng
> on test machine: 88 threads Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz with 128G memory
> with following parameters:
> 
> 	testtime: 1s
> 	class: filesystem
> 	cpufreq_governor: performance
> 
> 
> 
> 
> Details are as below:
> -------------------------------------------------------------------------------------------------->
> 
> 
> To reproduce:
> 
>         git clone https://github.com/intel/lkp-tests.git
>         cd lkp-tests
>         bin/lkp install job.yaml  # job file is attached in this email
>         bin/lkp run     job.yaml
> 
> testcase/path_params/tbox_group/run: stress-ng/1s-filesystem-performance/lkp-bdw-ep6
> 
>        v4.13-rc1  52306e882f77d3fd73f91435c4  
> ----------------  --------------------------  
>          %stddev      change         %stddev
>              \          |                \  
>  1.219e+08             -11%   1.09e+08        stress-ng.lockofd.ops_per_sec
>  1.229e+08             -10%  1.103e+08        stress-ng.locka.ops_per_sec
>  1.233e+08             -10%  1.105e+08        stress-ng.locka.ops
>  1.223e+08             -11%  1.093e+08        stress-ng.lockofd.ops
>    1061237              10%    1168476        stress-ng.eventfd.ops
>    1061205              10%    1168414        stress-ng.eventfd.ops_per_sec
>    2913174               9%    3163165        stress-ng.time.voluntary_context_switches
>      89.90              -4%      86.58        stress-ng.time.user_time
>      26510              -6%      24822        stress-ng.io.ops
>      26489              -6%      24798        stress-ng.io.ops_per_sec
>     885499 ± 14%        18%    1042236        perf-stat.cpu-migrations
>  2.537e+08              10%  2.783e+08        perf-stat.node-store-misses
>    1067830 ±  4%         8%    1154877 ±  3%  perf-stat.page-faults
>    5384755 ±  4%         7%    5747689        perf-stat.context-switches
>      32.28               7%      34.42 ±  3%  perf-stat.node-store-miss-rate%
>      12245 ±110%     -7e+03       5367 ± 29%  latency_stats.avg.call_usermodehelper_exec.__request_module.get_fs_type.do_mount.SyS_mount.entry_SYSCALL_64_fastpath
>     311261 ±173%     -3e+05      11702 ±100%  latency_stats.avg.tty_release_struct.tty_release.__fput.____fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
>       1472 ± 60%      4e+03       5144 ± 97%  latency_stats.max.sync_inodes_sb.sync_inodes_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
>        225 ± 39%      3e+03       3698 ±132%  latency_stats.max.rpc_wait_bit_killable.__rpc_wait_for_completion_task.nfs4_do_close.[nfsv4].__nfs4_close.[nfsv4].nfs4_close_sync.[nfsv4].nfs4_close_context.[nfsv4].__put_nfs_open_context.nfs_file_clear_open_context.nfs_file_release.__fput.____fput.task_work_run
>        228 ± 34%      3e+03       3103 ±159%  latency_stats.max.rpc_wait_bit_killable.__rpc_wait_for_completion_task.nfs4_run_open_task.[nfsv4].nfs4_do_open.[nfsv4].nfs4_atomic_open.[nfsv4].nfs4_file_open.[nfsv4].do_dentry_open.vfs_open.path_openat.do_filp_open.do_sys_open.SyS_open
>        270 ± 24%      3e+03       3110 ±162%  latency_stats.max.io_schedule.wait_on_page_bit_common.__filemap_fdatawait_range.filemap_write_and_wait_range.nfs_file_fsync.vfs_fsync_range.vfs_fsync.nfs4_file_flush.[nfsv4].filp_close.do_dup2.SyS_dup2.entry_SYSCALL_64_fastpath
>      12245 ±110%     -7e+03       5367 ± 29%  latency_stats.max.call_usermodehelper_exec.__request_module.get_fs_type.do_mount.SyS_mount.entry_SYSCALL_64_fastpath
>     927506 ±173%     -9e+05      11702 ±100%  latency_stats.max.tty_release_struct.tty_release.__fput.____fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
>       7892 ± 54%      3e+04      33793 ±131%  latency_stats.sum.sync_inodes_sb.sync_inodes_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
>      12030 ±109%      2e+04      33536 ±136%  latency_stats.sum.autofs4_wait.autofs4_mount_wait.autofs4_d_manage.follow_managed.lookup_fast.path_openat.do_filp_open.do_sys_open.SyS_open.entry_SYSCALL_64_fastpath
>      14311 ± 15%      7e+03      21729 ±116%  latency_stats.sum.call_rwsem_down_read_failed.page_lock_anon_vma_read.rmap_walk_anon.rmap_walk.try_to_unmap.migrate_pages.migrate_misplaced_page.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
>       8095 ± 22%      7e+03      15421 ± 79%  latency_stats.sum.call_rwsem_down_read_failed.rmap_walk_anon.rmap_walk.remove_migration_ptes.migrate_pages.migrate_misplaced_page.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
>        495 ± 39%      7e+03       7470 ±131%  latency_stats.sum.rpc_wait_bit_killable.__rpc_wait_for_completion_task.nfs4_do_close.[nfsv4].__nfs4_close.[nfsv4].nfs4_close_sync.[nfsv4].nfs4_close_context.[nfsv4].__put_nfs_open_context.nfs_file_clear_open_context.nfs_file_release.__fput.____fput.task_work_run
>       3449 ± 18%      7e+03      10001 ± 68%  latency_stats.sum.io_schedule.wait_on_page_bit_common.__filemap_fdatawait_range.filemap_fdatawait_keep_errors.sync_inodes_sb.sync_inodes_one_sb.iterate_supers.sys_sync.entry_SYSCALL_64_fastpath
>        378 ±123%      6e+03       6438 ±109%  latency_stats.sum.io_schedule.__lock_page.find_lock_entry.shmem_getpage_gfp.shmem_fault.__do_fault.__handle_mm_fault.handle_mm_fault.__do_page_fault.do_page_fault.page_fault
>        386 ± 34%      3e+03       3242 ±153%  latency_stats.sum.rpc_wait_bit_killable.__rpc_wait_for_completion_task.nfs4_run_open_task.[nfsv4].nfs4_do_open.[nfsv4].nfs4_atomic_open.[nfsv4].nfs4_file_open.[nfsv4].do_dentry_open.vfs_open.path_openat.do_filp_open.do_sys_open.SyS_open
>        413 ± 19%      3e+03       3256 ±156%  latency_stats.sum.io_schedule.wait_on_page_bit_common.__filemap_fdatawait_range.filemap_write_and_wait_range.nfs_file_fsync.vfs_fsync_range.vfs_fsync.nfs4_file_flush.[nfsv4].filp_close.do_dup2.SyS_dup2.entry_SYSCALL_64_fastpath
>       2874 ±173%     -3e+03          0        latency_stats.sum.call_rwsem_down_write_failed.do_unlinkat.SyS_unlink.entry_SYSCALL_64_fastpath
>       3683 ±172%     -4e+03          5 ±100%  latency_stats.sum.call_rwsem_down_read_failed.lookup_slow.walk_component.path_lookupat.filename_lookup.user_path_at_empty.SyS_chmod.entry_SYSCALL_64_fastpath
>      12245 ±110%     -7e+03       5367 ± 29%  latency_stats.sum.call_usermodehelper_exec.__request_module.get_fs_type.do_mount.SyS_mount.entry_SYSCALL_64_fastpath
>       7320 ±136%     -7e+03          0        latency_stats.sum.call_rwsem_down_write_failed.do_truncate.do_sys_ftruncate.SyS_ftruncate.entry_SYSCALL_64_fastpath
>      52297 ±124%     -3e+04      26320 ± 16%  latency_stats.sum.call_rwsem_down_read_failed.lookup_slow.walk_component.link_path_walk.path_parentat.filename_parentat.do_unlinkat.SyS_unlink.entry_SYSCALL_64_fastpath
>      64452 ± 83%     -3e+04      31383 ± 30%  latency_stats.sum.call_rwsem_down_write_failed.do_rmdir.SyS_rmdir.entry_SYSCALL_64_fastpath
>    1867567 ±173%     -2e+06      11702 ±100%  latency_stats.sum.tty_release_struct.tty_release.__fput.____fput.task_work_run.exit_to_usermode_loop.syscall_return_slowpath.entry_SYSCALL_64_fastpath
> 
> 
> 
> 
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
> 
> 
> Thanks,
> Xiaolong

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ