[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20080605162209.GG27370@duck.suse.cz>
Date: Thu, 5 Jun 2008 18:22:09 +0200
From: Jan Kara <jack@...e.cz>
To: "Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>
Cc: Jan Kara <jack@...e.cz>, cmm@...ibm.com, linux-ext4@...r.kernel.org
Subject: Re: [PATCH] ext4: Fix delalloc sync hang with journal lock
inversion
On Thu 05-06-08 19:24:13, Aneesh Kumar K.V wrote:
> > Hmm, I've just got an idea that it may be better to introduce a new flag
> > for wbc like range_cont and it would mean that we start scan at
> > writeback_index (we use range_start if writeback_index is not set) and
> > end with range_end. That way we don't have to be afraid of interference
> > with other range_cyclic users and in principle, range_cyclic is originally
> > meant for other uses...
> >
>
> something like below ?. With this ext4_da_writepages have
>
> pgoff_t writeback_index = 0;
> .....
> if (!wbc->range_cyclic) {
> /*
> * If range_cyclic is not set force range_cont
> * and save the old writeback_index
> */
> wbc->range_cont = 1;
> writeback_index = mapping->writeback_index;
> mapping->writeback_index = 0;
> }
> ...
> mpage_da_writepages(..)
> ..
> if (writeback_index)
> mapping->writeback_index = writeback_index;
> return ret;
>
>
>
> mm: Add range_cont mode for writeback.
>
> From: Aneesh Kumar K.V <aneesh.kumar@...ux.vnet.ibm.com>
>
> Filesystems like ext4 needs to start a new transaction in
> the writepages for block allocation. This happens with delayed
> allocation and there is limit to how many credits we can request
> from the journal layer. So we call write_cache_pages multiple
> times with wbc->nr_to_write set to the maximum possible value
> limitted by the max journal credits available.
>
> Add a new mode to writeback that enables us to handle this
> behaviour. If mapping->writeback_index is not set we use
> wbc->range_start to find the start index and then at the end
> of write_cache_pages we store the index in writeback_index. Next
> call to write_cache_pages will start writeout from writeback_index.
> Also we limit writing to the specified wbc->range_end.
>
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@...ux.vnet.ibm.com>
> ---
>
> include/linux/writeback.h | 1 +
> mm/page-writeback.c | 10 +++++++++-
> 2 files changed, 10 insertions(+), 1 deletions(-)
I like it. I'm only not sure whether there cannot be two users of
write_cache_pages() operating on the same mapping at the same time. Because
then they could alter writeback_index under each other and that would
probably result in unpleasant behavior. I think there can be two parallel
calls for example from sync_single_inode() and sync_page_range().
In that case we'd need something like writeback_index inside wbc (or
maybe just alter range_start automatically when range_cont is set?) so that
parallel callers do no influence each other.
Honza
> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index f462439..0d8573e 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -63,6 +63,7 @@ struct writeback_control {
> unsigned for_writepages:1; /* This is a writepages() call */
> unsigned range_cyclic:1; /* range_start is cyclic */
> unsigned more_io:1; /* more io to be dispatched */
> + unsigned range_cont:1;
> };
>
> /*
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 789b6ad..014a9f2 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -882,6 +882,12 @@ int write_cache_pages(struct address_space *mapping,
> if (wbc->range_cyclic) {
> index = mapping->writeback_index; /* Start from prev offset */
> end = -1;
> + } else if (wbc->range_cont) {
> + if (!mapping->writeback_index)
> + index = wbc->range_start >> PAGE_CACHE_SHIFT;
> + else
> + index = mapping->writeback_index;
> + end = wbc->range_end >> PAGE_CACHE_SHIFT;
> } else {
> index = wbc->range_start >> PAGE_CACHE_SHIFT;
> end = wbc->range_end >> PAGE_CACHE_SHIFT;
> @@ -954,7 +960,9 @@ int write_cache_pages(struct address_space *mapping,
> index = 0;
> goto retry;
> }
> - if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
> + if (wbc->range_cyclic ||
> + (range_whole && wbc->nr_to_write > 0) ||
> + wbc->range_cont)
> mapping->writeback_index = index;
> return ret;
> }
--
Jan Kara <jack@...e.cz>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists