Message-ID: <20110213054235.GD2598@thunk.org>
Date: Sun, 13 Feb 2011 00:42:35 -0500
From: Ted Ts'o <tytso@....edu>
To: Josef Bacik <josef@...hat.com>
Cc: Ext4 Developers List <linux-ext4@...r.kernel.org>
Subject: Re: [PATCH,RFC 1/7] ext4: fold __mpage_da_writepage() into
write_cache_pages_da()
On Sat, Feb 12, 2011 at 08:25:29PM -0500, Josef Bacik wrote:
> > +out:
> > + pagevec_release(&pvec);
> > + cond_resched();
> > + return ret;
> > }
>
> Do we really need the cond_resched() here? Seems like it will just add
> unwanted/unneeded latencies.
The cond_resched is from the original write_cache_pages(), and if you
follow the code movement, it goes all the way back to fs/mpage.c's
__mpage_writepages() from 2.6.11 (the beginning of time as far as
Linux 2.6's git repository is concerned).
The basic idea is that given that writeback threads are basically
running in a tight loop trying to push out dirty pages, you need to
eventually give other processes a chance to run --- especially on a UP
system! I do wonder whether we are checking for the need to reschedule
far more often than necessary, though.
The cond_resched() I'd be tempted to take out is not the one at the
end of the function, but the one at the end of the while loop.
That would allow us to complete the writeback for a particular
inode before letting another process run, gaining a bit of efficiency
at the cost of some scheduling unfairness.  But given that a
particular writeback call is capped at writing out a relatively small
amount of data anyway, that would seem to be OK.
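
To make the two call sites concrete, here's a rough user-space sketch
of the control flow (not the actual ext4 code): sched_yield() stands in
for cond_resched(), a simple batch loop stands in for the pagevec walk
in write_cache_pages_da(), and the names and sizes below are made up
purely for illustration.

#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <stdio.h>

#define BATCH_SIZE   14      /* roughly one pagevec's worth of pages */
#define TOTAL_PAGES  1024    /* cap on what one writeback call pushes */

static void write_one_page(int page)
{
	/* stand-in for ->writepage(); pretend we did some work */
	(void)page;
}

static int writeback_call(void)
{
	int written = 0;

	while (written < TOTAL_PAGES) {
		int i, n = TOTAL_PAGES - written;

		if (n > BATCH_SIZE)
			n = BATCH_SIZE;
		for (i = 0; i < n; i++)
			write_one_page(written + i);
		written += n;

		/*
		 * Site A: yield after every batch, i.e. at the end of
		 * each while-loop iteration.  This is the call I'd be
		 * tempted to drop.
		 */
		sched_yield();
	}

	/*
	 * Site B: yield once at the end of the call, matching the
	 * cond_resched() after pagevec_release() in the quoted hunk.
	 */
	sched_yield();
	return written;
}

int main(void)
{
	printf("wrote %d pages\n", writeback_call());
	return 0;
}

Dropping the yield at site A while keeping the one at site B is the
trade described above: a writeback call would run to completion before
giving anything else a chance to run.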
But even XFS has a cond_resched in xfs_cluster_write() (in
fs/xfs/linux-2.6/xfs_aops.c) so I'd want to do a lot of thinking,
testing, and benchmarking before removing that call to cond_resched().
- Ted