linux-kernel - Re: [PATCH 09/13] writeback: requeue

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Date:	Thu, 17 Jan 2008 12:22:38 +0800
From:	Fengguang Wu <wfg@...l.ustc.edu.cn>
To:	David Chinner <dgc@....com>
Cc:	Andrew Morton <akpm@...l.org>, Michael Rubin <mrubin@...gle.com>,
	Peter Zijlstra <peterz@...radead.org>,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 09/13] writeback: requeue_io() on redirtied inode

On Wed, Jan 16, 2008 at 07:13:07PM +1100, David Chinner wrote:
> On Tue, Jan 15, 2008 at 08:36:46PM +0800, Fengguang Wu wrote:
> > Redirtied inodes could be seen in really fast writes.
> > They should really be synced as soon as possible.
> > 
> > redirty_tail() could delay the inode for up to 30s.
> > Kill the delay by using requeue_io() instead.
> 
> That's actually bad for anything that does delayed allocation
> or updates state on data I/o completion.
> 
> e.g. XFS when writing past EOF doing delalloc dirties the inode
> during writeout (allocation) and then updates the file size on data
> I/o completion hence dirtying the inode again.
> 
> With this change, writing the last pages out would result
> in hitting this code and causing the inode to be flushed very
> soon after the data write. Then, after the inode write is issued,
> we get data I/o completion which dirties the inode again,
> resulting in needing to write the inode again to clean it.
> i.e. it introduces a potential new and useless inode write
> I/O.
> 
> Also, the immediate inode write may be useless for XFS because the
> inode may be pinned in memory due to async transactions
> still in flight (e.g. from delalloc) so we've got two
> situations where flushing the inode immediately is suboptimal.
> 
> Hence I don't think this is an optimisation that should be made
> in the generic writeback code.

Thanks for the explanation.
I can confirm that many requeue_io() happened for the same XFS inode:
[  158.794562] requeue_io 328: inode 5243009 size 34647 at 03:03(hda3)
[  158.794827] mm/page-writeback.c 668 wb_kupdate: pdflush(183) 14209 global 486 10 0 wc _M tw 1013 sk 0
[  158.795293] requeue_io 328: inode 5243009 size 34647 at 03:03(hda3)
[  158.795313] mm/page-writeback.c 668 wb_kupdate: pdflush(183) 14198 global 486 10 0 wc _M tw 1024 sk 0
...
[  170.713900] requeue_io 328: inode 5243009 size 34647 at 03:03(hda3)
[  170.713925] mm/page-writeback.c 668 wb_kupdate: pdflush(183) 14198 global 1875 0 0 wc _M tw 1024 sk 0
[  170.813584] mm/page-writeback.c 668 wb_kupdate: pdflush(183) 14198 global 2855 0 0 wc __ tw 1024 sk 0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/