Message-ID: <20110621054056.GP32466@dastard>
Date: Tue, 21 Jun 2011 15:40:56 +1000
From: Dave Chinner <david@...morbit.com>
To: Christoph Hellwig <hch@...radead.org>
Cc: viro@...iv.linux.org.uk, tglx@...utronix.de,
linux-fsdevel@...r.kernel.org, linux-ext4@...r.kernel.org,
linux-btrfs@...r.kernel.org, hirofumi@...l.parknet.co.jp,
mfasheh@...e.com, jlbec@...lplan.org
Subject: Re: [PATCH 4/8] fs: kill i_alloc_sem
On Mon, Jun 20, 2011 at 04:15:37PM -0400, Christoph Hellwig wrote:
> i_alloc_sem is a rather special rw_semaphore. It's the last one that may
> be released by a non-owner, and its write side is always mirrored by
> real exclusion. Its intended use is to wait for all pending direct I/O
> requests to finish before starting a truncate.
>
> Replace it with a hand-grown construct:
>
>  - exclusion for truncates is already guaranteed by i_mutex, so it can
>    simply fall away
>  - the reader side is replaced by an i_dio_count member in struct inode
>    that counts the number of pending direct I/O requests. Truncate can't
>    proceed as long as it's non-zero
>  - when i_dio_count reaches zero we wake up a pending truncate using
>    wake_up_bit on a new bit in i_state
>  - new references to i_dio_count can't appear while we are waiting for
>    it to read zero because the direct I/O count always needs i_mutex
>    (or an equivalent like XFS's i_iolock) for starting a new operation.
>
> This scheme is much simpler, and saves the space of a spinlock_t and a
> struct list_head in struct inode (typically 160 bits on a non-debug 64-bit
> system).
>
> Signed-off-by: Christoph Hellwig <hch@....de>
>
> Index: linux-2.6/fs/direct-io.c
> ===================================================================
> --- linux-2.6.orig/fs/direct-io.c 2011-06-20 14:55:31.000000000 +0200
> +++ linux-2.6/fs/direct-io.c 2011-06-20 14:55:34.602490284 +0200
> @@ -136,6 +136,27 @@ struct dio {
> };
>
> /*
> + * Wait for outstanding DIO requests to finish. Must be locked against
> + * increments of i_dio_count by i_mutex.
> + */
> +void inode_dio_wait(struct inode *inode)
> +{
> +	might_sleep();
> +	while (atomic_read(&inode->i_dio_count)) {
> +		wait_on_bit(&inode->i_state, __I_DIO_WAKEUP, inode_wait,
> +			    TASK_UNINTERRUPTIBLE);
> +	}
> +}
> +EXPORT_SYMBOL_GPL(inode_dio_wait);
> +
> +void inode_dio_wake(struct inode *inode)
> +{
> +	if (atomic_dec_and_test(&inode->i_dio_count))
> +		wake_up_bit(&inode->i_state, __I_DIO_WAKEUP);
> +}
> +EXPORT_SYMBOL_GPL(inode_dio_wake);

Modification of inode->i_state is not safe outside the inode->i_lock.

This probably needs to be implemented similarly to the
__I_NEW/__wait_on_freeing_inode() and
__I_SYNC/inode_wait_for_writeback() pattern...
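
To make the shape of that concrete, here is a rough userspace analogue only, not kernel code and not a proposed replacement for the patch. It assumes a pthread mutex standing in for inode->i_lock and a condition variable standing in for the bit waitqueue; struct fake_inode and every fake_* name are hypothetical. The point it illustrates is just that the condition is rechecked with the lock held on both the wait side and the wake side:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the inode fields under discussion. */
struct fake_inode {
	pthread_mutex_t lock;      /* plays the role of inode->i_lock */
	pthread_cond_t  waitq;     /* plays the role of the bit waitqueue */
	int             dio_count; /* plays the role of i_dio_count */
};

void fake_inode_init(struct fake_inode *inode)
{
	pthread_mutex_init(&inode->lock, NULL);
	pthread_cond_init(&inode->waitq, NULL);
	inode->dio_count = 0;
}

/*
 * Account for a new direct I/O request. Assumption: the caller already
 * holds whatever excludes truncate (i_mutex in the patch description).
 */
void fake_dio_begin(struct fake_inode *inode)
{
	pthread_mutex_lock(&inode->lock);
	inode->dio_count++;
	pthread_mutex_unlock(&inode->lock);
}

/* Complete one request; the last completion wakes any waiter. */
void fake_dio_end(struct fake_inode *inode)
{
	pthread_mutex_lock(&inode->lock);
	assert(inode->dio_count > 0);
	if (--inode->dio_count == 0)
		pthread_cond_broadcast(&inode->waitq);
	pthread_mutex_unlock(&inode->lock);
}

/* Sleep until no requests are pending; the count is rechecked under the lock. */
void fake_dio_wait(struct fake_inode *inode)
{
	pthread_mutex_lock(&inode->lock);
	while (inode->dio_count != 0)
		pthread_cond_wait(&inode->waitq, &inode->lock);
	pthread_mutex_unlock(&inode->lock);
}

/* Thread body that models an I/O completion arriving asynchronously. */
void *fake_dio_completion(void *arg)
{
	fake_dio_end(arg);
	return NULL;
}

/*
 * Model one truncate racing with one in-flight request. Returns the
 * pending count observed after the wait, or -1 if the thread could not
 * be created.
 */
int fake_truncate_demo(void)
{
	struct fake_inode inode;
	pthread_t worker;
	int left;

	fake_inode_init(&inode);
	fake_dio_begin(&inode);
	if (pthread_create(&worker, NULL, fake_dio_completion, &inode) != 0)
		return -1;
	fake_dio_wait(&inode);
	pthread_join(worker, NULL);

	pthread_mutex_lock(&inode.lock);
	left = inode.dio_count;
	pthread_mutex_unlock(&inode.lock);

	pthread_cond_destroy(&inode.waitq);
	pthread_mutex_destroy(&inode.lock);
	return left;
}
```

Because the decrement and the wakeup happen under the same lock the waiter uses for its check, the waiter cannot test the count, lose the wakeup, and then sleep forever. How that maps onto i_lock and a flag bit in i_state is left open here.
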
Cheers,
Dave.
--
Dave Chinner
david@...morbit.com