Debug traces show that in per-bdi writeback, the inode under writeback almost always get redirtied by a busy dirtier. We used to call redirty_tail() in this case, which could delay inode for up to 30s. This is unacceptable because it now happens so frequently for plain cp/dd, that the accumulated delays could make writeback of big files very slow. So let's distinguish between data redirty and metadata only redirty. The first one is caused by a busy dirtier, while the latter one could happen in XFS, NFS, etc. when they are doing delalloc or updating isize. The inode being busy dirtied will now be requeued for next io, while the inode being redirtied by fs will continue to be delayed to avoid repeated IO. CC: Jan Kara CC: Theodore Ts'o CC: Dave Chinner CC: Jens Axboe CC: Chris Mason CC: Christoph Hellwig Signed-off-by: Wu Fengguang --- fs/fs-writeback.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-) --- linux.orig/fs/fs-writeback.c 2009-09-23 16:13:41.000000000 +0800 +++ linux/fs/fs-writeback.c 2009-09-23 16:21:24.000000000 +0800 @@ -491,10 +493,15 @@ writeback_single_inode(struct inode *ino spin_lock(&inode_lock); inode->i_state &= ~I_SYNC; if (!(inode->i_state & (I_FREEING | I_CLEAR))) { - if (inode->i_state & I_DIRTY) { + if (inode->i_state & I_DIRTY_PAGES) { /* - * Someone redirtied the inode while were writing back - * the pages. + * More pages get dirtied by a fast dirtier. + */ + goto select_queue; + } else if (inode->i_state & I_DIRTY) { + /* + * At least XFS will redirty the inode during the + * writeback (delalloc) and on io completion (isize). */ redirty_tail(inode); } else if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) { -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/