Message-ID: <20120126211526.GA28368@quack.suse.cz>
Date: Thu, 26 Jan 2012 22:15:26 +0100
From: Jan Kara <jack@...e.cz>
To: Jeff Moyer <jmoyer@...hat.com>
Cc: linux-ext4@...r.kernel.org, Jan Kara <jack@...e.cz>
Subject: Re: [patch] ext4: fix race between unwritten extent conversion and truncate
On Thu 26-01-12 15:59:11, Jeff Moyer wrote:
> Hi,
>
> The following comment in ext4_end_io_dio caught my attention:
>
> /* XXX: probably should move into the real I/O completion handler */
> inode_dio_done(inode);
>
> The truncate code takes i_mutex, then calls inode_dio_wait. Because the
> ext4 code path above will end up dropping the mutex before it is
> reacquired by the worker thread that does the extent conversion, it
> seems to me that the truncate can happen out of order. Jan Kara
> mentioned that this might result in extra journal I/O, which isn't nice,
Funny misunderstanding ;) I meant that we would complain to the system log
with error messages / WARN_ON.
> but that's probably the full extent of the "damage."
>
> The fix is pretty straightforward: don't call inode_dio_done until the
> extent conversion is complete.
Otherwise the patch looks good so:
Reviewed-by: Jan Kara <jack@...e.cz>
Honza
>
> Signed-off-by: Jeff Moyer <jmoyer@...hat.com>
> CC: Jan Kara <jack@...e.cz>
>
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 513004f..2d55d7c 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -184,6 +184,7 @@ struct mpage_da_data {
> #define EXT4_IO_END_UNWRITTEN 0x0001
> #define EXT4_IO_END_ERROR 0x0002
> #define EXT4_IO_END_QUEUED 0x0004
> +#define EXT4_IO_END_DIRECT 0x0008
>
> struct ext4_io_page {
> struct page *p_page;
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index feaa82f..f6dc02b 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2795,9 +2795,6 @@ out:
>
> /* queue the work to convert unwritten extents to written */
> queue_work(wq, &io_end->work);
> -
> - /* XXX: probably should move into the real I/O completion handler */
> - inode_dio_done(inode);
> }
>
> static void ext4_end_io_buffer_write(struct buffer_head *bh, int uptodate)
> @@ -2921,9 +2918,12 @@ static ssize_t ext4_ext_direct_IO(int rw, struct kiocb *iocb,
> iocb->private = NULL;
> EXT4_I(inode)->cur_aio_dio = NULL;
> if (!is_sync_kiocb(iocb)) {
> - iocb->private = ext4_init_io_end(inode, GFP_NOFS);
> - if (!iocb->private)
> + ext4_io_end_t *io_end =
> + ext4_init_io_end(inode, GFP_NOFS);
> + if (!io_end)
> return -ENOMEM;
> + io_end->flag |= EXT4_IO_END_DIRECT;
> + iocb->private = io_end;
> /*
> * we save the io structure for current async
> * direct IO, so that later ext4_map_blocks()
> diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
> index 4758518..9e1b8eb 100644
> --- a/fs/ext4/page-io.c
> +++ b/fs/ext4/page-io.c
> @@ -110,6 +110,8 @@ int ext4_end_io_nolock(ext4_io_end_t *io)
> if (io->iocb)
> aio_complete(io->iocb, io->result, 0);
>
> + if (io->flag & EXT4_IO_END_DIRECT)
> + inode_dio_done(inode);
> /* Wake up anyone waiting on unwritten extent conversion */
> if (atomic_dec_and_test(&EXT4_I(inode)->i_aiodio_unwritten))
> wake_up_all(ext4_ioend_wq(io->inode));
--
Jan Kara <jack@...e.cz>
SUSE Labs, CR
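
For readers following the ordering argument above, here is a minimal
userspace sketch of it. This is not kernel code: every model_* name and
truncate_can_overtake_conversion() are hypothetical stand-ins invented for
illustration, and the "waiter" is reduced to a simple check of a counter,
on the assumption that truncate may only proceed once the in-flight
direct I/O count reaches zero. It only shows that moving the decrement
from the I/O completion handler to the conversion worker closes the
window in which truncate could run before the extents are converted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_inode {
	int dio_count;          /* stands in for the in-flight direct I/O count */
	bool extents_converted; /* set once the worker has converted the extents */
};

enum { MODEL_IO_END_DIRECT = 0x0008 }; /* mirrors the flag added by the patch */

struct model_io_end {
	struct model_inode *inode;
	unsigned int flag;
};

/* Models the I/O completion handler, which only queues the conversion work. */
static void model_end_io_dio(struct model_io_end *io, bool patched)
{
	if (!patched)
		io->inode->dio_count--; /* old behaviour: signal completion early */
}

/* Models the worker that converts unwritten extents to written. */
static void model_end_io_worker(struct model_io_end *io, bool patched)
{
	io->inode->extents_converted = true;
	if (patched && (io->flag & MODEL_IO_END_DIRECT))
		io->inode->dio_count--; /* patched: signal only after conversion */
}

/* Models the truncate side: it may proceed once no direct I/O is in flight. */
static bool model_truncate_may_proceed(const struct model_inode *inode)
{
	return inode->dio_count == 0;
}

/*
 * Returns true if truncate could run in the window between I/O completion
 * and extent conversion.
 */
static bool truncate_can_overtake_conversion(bool patched)
{
	struct model_inode inode = { .dio_count = 1, .extents_converted = false };
	struct model_io_end io = { .inode = &inode, .flag = MODEL_IO_END_DIRECT };
	bool raced;

	model_end_io_dio(&io, patched);
	raced = model_truncate_may_proceed(&inode) && !inode.extents_converted;

	model_end_io_worker(&io, patched);
	/* In both variants the count is balanced once the worker has run. */
	assert(inode.dio_count == 0);
	assert(model_truncate_may_proceed(&inode));

	return raced;
}
```

Calling truncate_can_overtake_conversion(false) models the code before the
patch, and truncate_can_overtake_conversion(true) models it with the patch
applied. A real reproduction would need concurrent truncate and AIO DIO
writes into unwritten extents, which this sketch does not attempt.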