Message-Id: <20090927101035.e8712819.akpm@linux-foundation.org>
Date: Sun, 27 Sep 2009 10:10:35 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Jens Axboe <jens.axboe@...cle.com>
Cc: Chris Mason <chris.mason@...cle.com>, linux-kernel@...r.kernel.org,
jack@...e.cz
Subject: Re: [PATCH] bdi_sync_writeback should WB_SYNC_NONE first
On Sun, 27 Sep 2009 18:55:14 +0200 Jens Axboe <jens.axboe@...cle.com> wrote:
> > I wasn't referring to this patch actually. The code as it stands in
> > Linus's tree right now attempts to write back up to 2^63 pages...
>
> I agree, it could make the fs sync take a looong time. This is not a new
> issue, though.
It _should_ be a new issue. The old code would estimate the number of
dirty pages up-front and would then add a +50% fudge factor, so if we
started the sync with 1GB of dirty memory, we'd write back at most 1.5GB.
However that might have got broken.
void sync_inodes_sb(struct super_block *sb, int wait)
{
	struct writeback_control wbc = {
		.sync_mode	= wait ? WB_SYNC_ALL : WB_SYNC_NONE,
		.range_start	= 0,
		.range_end	= LLONG_MAX,
	};

	if (!wait) {
		unsigned long nr_dirty = global_page_state(NR_FILE_DIRTY);
		unsigned long nr_unstable = global_page_state(NR_UNSTABLE_NFS);

		wbc.nr_to_write = nr_dirty + nr_unstable +
			(inodes_stat.nr_inodes - inodes_stat.nr_unused);
	} else
		wbc.nr_to_write = LONG_MAX; /* doesn't actually matter */

	sync_sb_inodes(sb, &wbc);
}
a) the +50% isn't there in 2.6.31
b) the wait=true case appears to be vulnerable to livelock in 2.6.31.
whodidthat
38f21977663126fef53f5585e7f1653d8ebe55c4 did that back in January.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/