[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20111029205740.GB16825@thunk.org>
Date: Sat, 29 Oct 2011 16:57:40 -0400
From: Ted Ts'o <tytso@....edu>
To: Tao Ma <tm@....ma>
Cc: linux-ext4@...r.kernel.org
Subject: Re: [PATCH] ext4: Check io list state and avoid an unnecessary
mutex_lock in ext4_end_io_work.
On Wed, Sep 14, 2011 at 04:33:02PM +0800, Tao Ma wrote:
> From: Tao Ma <boyu.mt@...bao.com>
>
> When we finish the end io work in ext4_flush_completed_IO, we take
> the io work away from the list, but don't free it. Then in the workqueue,
> we can check the list state and then avoid the extra work if it is also
> done. It is good, but we check list state in ext4_end_io_nolock with i_mutex held
> instead of the spin_lock in other places. This is wrong.
Hi Tao,
Thanks for finding and pointing out this bug in ext4_end_io_nolock().
Unfortunately your proposed fix doesn't take into account that there
are other callers of ext4_end_io_nolock() besides ext4_end_io_work(),
and so it's not sufficient to move the test from former function to
the latter.
Attached please find the patch which I am planning to check into the
ext4 tree to address this bug which you have pointed out.
Regards,
- Ted
commit c0e36d8410bfad4db4edefeb4175f85a5d216c8d
Author: Theodore Ts'o <tytso@....edu>
Date: Sat Oct 29 16:57:19 2011 -0400
ext4: use spinlock before checking io->list in ext4_io_end_nolock()
In ext4_end_io_nolock(), io->list is checked to see if it is empty
without taking the ei->i_completed_io_lock spinlock. This violates
the locking protocol, and can cause very hard to debug failures.
Also optimize ext4_end_io_work() so that if ext4_end_io_nolock() is
not going to do any work, don't try getting the i_mutex and possibly
requeuing the end_io request if the trylock doesn't succeed in grabbing
the mutex.
Thanks to Tao Ma <boyu.mt@...bao.com> for spotting the error and
providing an initial fix to address the problem.
Signed-off-by: "Theodore Ts'o" <tytso@....edu>
Signed-off-by: Tao Ma <boyu.mt@...bao.com>
diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index 92f38ee..0af5607 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -90,21 +90,27 @@ void ext4_free_io_end(ext4_io_end_t *io)
*/
int ext4_end_io_nolock(ext4_io_end_t *io)
{
- struct inode *inode = io->inode;
- loff_t offset = io->offset;
- ssize_t size = io->size;
- wait_queue_head_t *wq;
- int ret = 0;
+ unsigned long flags;
+ struct inode *inode = io->inode;
+ loff_t offset = io->offset;
+ ssize_t size = io->size;
+ struct ext4_inode_info *ei = EXT4_I(inode);
+ wait_queue_head_t *wq;
+ int ret;
ext4_debug("ext4_end_io_nolock: io 0x%p from inode %lu,list->next 0x%p,"
"list->prev 0x%p\n",
io, inode->i_ino, io->list.next, io->list.prev);
- if (list_empty(&io->list))
- return ret;
+ spin_lock_irqsave(&ei->i_completed_io_lock, flags);
+ if (list_empty(&io->list)) {
+ spin_unlock_irqrestore(&ei->i_completed_io_lock, flags);
+ return 0;
+ }
+ spin_unlock_irqrestore(&ei->i_completed_io_lock, flags);
if (!(io->flag & EXT4_IO_END_UNWRITTEN))
- return ret;
+ return 0;
ret = ext4_convert_unwritten_extents(inode, offset, size);
if (ret < 0) {
@@ -142,6 +148,16 @@ static void ext4_end_io_work(struct work_struct *work)
unsigned long flags;
int ret;
+ if (!(io->flag & EXT4_IO_END_UNWRITTEN))
+ goto free_io_end;
+
+ spin_lock_irqsave(&ei->i_completed_io_lock, flags);
+ if (list_empty(&io->list)) {
+ spin_unlock_irqrestore(&ei->i_completed_io_lock, flags);
+ goto free_io_end;
+ }
+ spin_unlock_irqrestore(&ei->i_completed_io_lock, flags);
+
if (!mutex_trylock(&inode->i_mutex)) {
/*
* Requeue the work instead of waiting so that the work
@@ -170,6 +186,7 @@ static void ext4_end_io_work(struct work_struct *work)
list_del_init(&io->list);
spin_unlock_irqrestore(&ei->i_completed_io_lock, flags);
mutex_unlock(&inode->i_mutex);
+free_io_end:
ext4_free_io_end(io);
}
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists