Message-Id: <97F0BDAD-B6D8-4246-B790-E269025F4A7D@dilger.ca>
Date:	Mon, 30 Jan 2012 13:36:14 -0700
From:	Andreas Dilger <adilger@...ger.ca>
To:	Kazuya Mio <k-mio@...jp.nec.com>
Cc:	ext4 <linux-ext4@...r.kernel.org>, Jan Kara <jack@...e.cz>
Subject: Re: [PATCH 0/2] ext3: Reduce calling ext3_mark_inode_dirty() for speedup

On 2012-01-30, at 1:41 AM, Kazuya Mio wrote:
> ext3 has a performance problem: parallel writes are too slow.
> I looked into this and found that ext3 calls ext3_mark_inode_dirty()
> unnecessarily.
> 
> The following result is the time taken to write 16 files of 3GB each
> using 16 threads. The measurement was performed on Linux 3.3-rc1 on a
> 4-way server with 512GB of memory.
> 
>    filesystem        time (sec)   ext3_mark_inode_dirty() calls
>    ----------------------------------------------------------
>    ext3              220.5        50,338,104
>    ext3 (patched)    196.3        25,169,658
>    ext4 (*1)         190.3        28,465,799
> 
>    *1 with ext4-specific options (delalloc, extent, and so on) disabled

Can you please run this same measurement on ext4 formatted and running
with the default options?  I'd like to know if this is still a problem
in ext4 or not.


There is a better mechanism to handle the inode updates that could be
implemented if there is still a real performance concern.  There are
journal pre-commit callbacks on the buffer heads that could be run to
copy the modified data from the VFS inodes to the buffer blocks.

This would reduce ext4_mark_inode_dirty() to setting a single dirty
flag in the inode and updating the VFS inode ctime.  The VFS inode
would be copied into the buffer only once per journal commit, greatly
reducing the overhead of these operations.  This should also noticeably
reduce the overhead from metadata checksums, since the checksum would
be computed only once for each inode per journal commit.

> ext3 in RHEL5.5 clearly shows the difference in performance.
> Writing by the same method takes 533 seconds, whereas writing with one
> thread takes 191 seconds.
> 
> Every time we write one page, ext3 calls ext3_mark_inode_dirty() four times.
> Two of these calls are unnecessary in many cases, so I added conditions to
> call the function only when it is necessary.
> 
>      sys_write
>        ...
>          __generic_file_aio_write
>            file_update_time
>              mark_inode_dirty_sync
>            generic_file_buffered_write
>              ...
>                ext3_get_blocks_handle
>                  ext3_write_begin
>                    ...
>                      ext3_new_blocks
>                        vfs_dq_alloc_block
>    1)                    mark_inode_dirty
>                        vfs_dq_free_block
>    2)                    mark_inode_dirty      <-- patch 1/2
>                    ext3_splice_branch
>    3)                ext3_mark_inode_dirty     <-- patch 2/2
>                  ext3_ordered_write_end
>                    update_file_sizes
>    4)                mark_inode_dirty
> 
> Regards,
> Kazuya Mio


Cheers, Andreas