[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <200801311156.01768.chris.mason@oracle.com>
Date: Thu, 31 Jan 2008 11:56:01 -0500
From: Chris Mason <chris.mason@...cle.com>
To: Al Boldi <a1426z@...ab.com>
Cc: Andreas Dilger <adilger@....com>, Jan Kara <jack@...e.cz>,
Chris Snook <csnook@...hat.com>, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC] ext3: per-process soft-syncing data=ordered mode
On Thursday 31 January 2008, Al Boldi wrote:
> Andreas Dilger wrote:
> > On Wednesday 30 January 2008, Al Boldi wrote:
> > > And, a quick test of successive 1sec delayed syncs shows no hangs until
> > > about 1 minute (~180mb) of db-writeout activity, when the sync abruptly
> > > hangs for minutes on end, and io-wait shows almost 100%.
> >
> > How large is the journal in this filesystem? You can check via
> > "debugfs -R 'stat <8>' /dev/XXX".
>
> 32mb.
>
> > Is this affected by increasing
> > the journal size? You can set the journal size via "mke2fs -J size=400"
> > at format time, or on an unmounted filesystem by running
> > "tune2fs -O ^has_journal /dev/XXX" then "tune2fs -J size=400 /dev/XXX".
>
> Setting size=400 doesn't help, nor does size=4.
>
> > I suspect that the stall is caused by the journal filling up, and then
> > waiting while the entire journal is checkpointed back to the filesystem
> > before the next transaction can start.
> >
> > It is possible to improve this behaviour in JBD by reducing the amount
> > of space that is cleared if the journal becomes "full", and also doing
> > journal checkpointing before it becomes full. While that may reduce
> > performance a small amount, it would help avoid such huge latency
> > problems. I believe we have such a patch in one of the Lustre branches
> > already, and while I'm not sure what kernel it is for the JBD code rarely
> > changes much....
>
> The big difference between ordered and writeback is that once the slowdown
> starts, ordered goes into ~100% iowait, whereas writeback continues 100%
> user.
Does data=ordered write buffers in the order they were dirtied? This might
explain the extreme problems in transactional workloads.
-chris
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists