Message-ID: <20080611124157.GB8121@duck.suse.cz>
Date: Wed, 11 Jun 2008 14:41:57 +0200
From: Jan Kara <jack@...e.cz>
To: "Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>
Cc: cmm@...ibm.com, linux-ext4@...r.kernel.org
Subject: Re: [PATCH] ext4: Fix delalloc sync hang with journal lock inversion
On Fri 06-06-08 00:49:09, Aneesh Kumar K.V wrote:
> On Thu, Jun 05, 2008 at 06:22:09PM +0200, Jan Kara wrote:
> > I like it. My only concern is whether two users of
> > write_cache_pages() can operate on the same mapping at the same time,
> > because then they could alter writeback_index under each other, which
> > would probably result in unpleasant behavior. I think there can be two
> > parallel calls, for example from sync_single_inode() and sync_page_range().
> > In that case we'd need something like writeback_index inside wbc (or
> > maybe just alter range_start automatically when range_cont is set?) so that
> > parallel callers do not influence each other.
> >
>
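To illustrate the concern quoted above, here is a minimal user-space sketch.
The toy_* structures and helpers are hypothetical stand-ins working on page
indexes, not the kernel's address_space or writeback_control, and the
interleaving is simulated by hand rather than with real threads. It contrasts
a resume cursor shared through the mapping with one kept in each caller's wbc.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration only; not the kernel structures. */
struct toy_mapping { unsigned long writeback_index; };
struct toy_wbc { unsigned long start; unsigned long end; long nr_to_write; };

/*
 * Shared cursor: the resume position lives in the mapping, so every
 * caller working on this mapping reads and overwrites the same field.
 * Returns the first page index this call started from.
 */
static unsigned long toy_write_shared(struct toy_mapping *m,
                                      unsigned long end, long nr)
{
    unsigned long first = m->writeback_index;
    unsigned long index = first;

    assert(nr > 0);
    while (nr-- > 0 && index <= end)
        index++;                    /* pretend one page was written */
    m->writeback_index = index;     /* visible to all other callers */
    return first;
}

/*
 * Private cursor: the resume position lives in the caller's own wbc,
 * so another caller cannot move it.
 */
static unsigned long toy_write_private(struct toy_wbc *wbc)
{
    unsigned long first = wbc->start;
    unsigned long index = first;
    long nr = wbc->nr_to_write;

    assert(nr > 0);
    while (nr-- > 0 && index <= wbc->end)
        index++;
    wbc->start = index;             /* only this caller sees the update */
    return first;
}
```

If caller A writes two pages of 0..9 with the shared cursor, caller B then
points the cursor at page 100 for its own range, and A comes back, A resumes
at page 102 instead of page 2. With the private cursor each caller continues
where it left off.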
> commit e56edfdeea0d336e496962782f08e1224a101cf2
> Author: Aneesh Kumar K.V <aneesh.kumar@...ux.vnet.ibm.com>
> Date: Fri Jun 6 00:47:35 2008 +0530
>
> mm: Add range_cont mode for writeback.
>
> Filesystems like ext4 need to start a new transaction in
> writepages for block allocation. This happens with delayed
> allocation, and there is a limit to how many credits we can request
> from the journal layer. So we call write_cache_pages multiple
> times with wbc->nr_to_write set to the maximum possible value
> limited by the max journal credits available.
>
> Add a new mode to writeback that enables us to handle this
> behaviour. If mapping->writeback_index is not set we use
> wbc->range_start to find the start index and then at the end
> of write_cache_pages we store the index in writeback_index. Next
> call to write_cache_pages will start writeout from writeback_index.
> Also we limit writing to the specified wbc->range_end.
I think this changelog is out of date: with range_cont the patch below
resumes from wbc->range_start, not from mapping->writeback_index.
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@...ux.vnet.ibm.com>
>
> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index f462439..0d8573e 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -63,6 +63,7 @@ struct writeback_control {
> unsigned for_writepages:1; /* This is a writepages() call */
> unsigned range_cyclic:1; /* range_start is cyclic */
> unsigned more_io:1; /* more io to be dispatched */
> + unsigned range_cont:1;
> };
>
> /*
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 789b6ad..182233b 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -882,6 +882,9 @@ int write_cache_pages(struct address_space *mapping,
> if (wbc->range_cyclic) {
> index = mapping->writeback_index; /* Start from prev offset */
> end = -1;
> + } else if (wbc->range_cont) {
> + index = wbc->range_start >> PAGE_CACHE_SHIFT;
> + end = wbc->range_end >> PAGE_CACHE_SHIFT;
Hmm, why isn't this in the next else? The range_cont branch sets index and
end exactly as the plain else branch does.
> } else {
> index = wbc->range_start >> PAGE_CACHE_SHIFT;
> end = wbc->range_end >> PAGE_CACHE_SHIFT;
> @@ -956,6 +959,9 @@ int write_cache_pages(struct address_space *mapping,
> }
> if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
> mapping->writeback_index = index;
> +
> + if (wbc->range_cont)
> + wbc->range_start = index << PAGE_CACHE_SHIFT;
> return ret;
> }
> EXPORT_SYMBOL(write_cache_pages);
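For reference, this is roughly how I would expect a caller to use range_cont.
It is a user-space sketch under stated assumptions: toy_wbc,
toy_write_cache_pages() and toy_writepages() are hypothetical names,
TOY_PAGE_SHIFT assumes 4 KiB pages, and "writing" a page just advances an
index. Only the range_start update mirrors the hunk above.

```c
#include <assert.h>

#define TOY_PAGE_SHIFT 12           /* assumption: 4 KiB pages */

/* Hypothetical cut-down writeback_control; not the kernel structure. */
struct toy_wbc {
    long long range_start;          /* byte offset, inclusive */
    long long range_end;            /* byte offset, inclusive */
    long nr_to_write;               /* page budget for one call */
    unsigned range_cont:1;          /* resume from range_start next time */
};

/*
 * Toy write_cache_pages(): "writes" pages until the budget or the range
 * runs out, then records the resume offset in the wbc when range_cont is
 * set. Returns the number of pages written by this call.
 */
static long toy_write_cache_pages(struct toy_wbc *wbc)
{
    unsigned long index = (unsigned long)(wbc->range_start >> TOY_PAGE_SHIFT);
    unsigned long end = (unsigned long)(wbc->range_end >> TOY_PAGE_SHIFT);
    long written = 0;

    while (index <= end && wbc->nr_to_write > 0) {
        index++;                    /* pretend the page went to disk */
        wbc->nr_to_write--;
        written++;
    }
    if (wbc->range_cont)
        wbc->range_start = (long long)index << TOY_PAGE_SHIFT;
    return written;
}

/*
 * Hypothetical caller: each pass is limited to the number of pages one
 * transaction's credits allow. Returns how many passes wrote something.
 */
static int toy_writepages(long long start, long long end, long pages_per_pass)
{
    struct toy_wbc wbc = { .range_start = start, .range_end = end,
                           .range_cont = 1 };
    int passes = 0;

    assert(start <= end);
    for (;;) {
        wbc.nr_to_write = pages_per_pass;   /* refill the budget */
        if (toy_write_cache_pages(&wbc) == 0)
            break;                          /* range exhausted */
        passes++;
    }
    return passes;
}
```

With a ten-page range and a budget of four pages per pass, the loop makes
three passes (4 + 4 + 2 pages). The cast before the shift is deliberate in
this sketch: with a 32-bit index type the shift would otherwise be evaluated
in 32 bits before being widened to the byte offset.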
Honza
--
Jan Kara <jack@...e.cz>
SUSE Labs, CR