Message-ID: <20130204210204.GB12826@thunk.org>
Date: Mon, 4 Feb 2013 16:02:04 -0500
From: Theodore Ts'o <tytso@....edu>
To: Ext4 Developers List <linux-ext4@...r.kernel.org>
Cc: "Barry J. Marson" <bmarson@...hat.com>
Subject: Re: [PATCH 2/3] jbd2: commit as soon as possible after log_start_commit
On Thu, Jan 31, 2013 at 12:53:07PM -0500, Theodore Ts'o wrote:
> Once a transaction has been requested to be committed, don't let any
> other handles start under that transaction, and don't allow any
> pending transactions to be extended (i.e., in the case of
> unlink/ftruncate).
>
> The idea is once the transaction has had log_start_commit() called on
> it, at least one thread is blocked waiting for that transaction to
> commit, and over time, more and more threads will end up getting
> blocked. In order to avoid high variability in file system operations
> getting blocked behind a blocked start_this_handle(), we should
> try to get the commit started as soon as possible.
I'm going to drop this patch, because performance measurements done
by Barry Marson at Red Hat show that it apparently makes things worse
by 17% with AIM7.
At a guess, it looks like AIM7 has enough threads competing for
the CPU that it can take a good 80ms or more before kjournald can get
scheduled, and then start locking down the transaction. During those
80ms, a number of short transactions get in and out and make forward
progress on the benchmark. However, there are some "long pole"
handles that will take up to 100-250ms to complete, and during that
time, nothing else can get done. By starting to lock down the
transaction as soon as the commit is requested, instead of waiting
until the kjournald thread can be scheduled, we end up limiting
forward progress made by the quick "in and out" handles, and this far
outweighs the benefits of stopping a long-lived handle from getting
started.
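To make the guess above concrete, here is a toy back-of-the-envelope
model. It is not a measurement and not kernel code. It assumes a single
long-running handle that is already open when the commit is requested
at t=0, ignores the commit I/O itself, and uses a hypothetical helper
name (blocked_window_ms). The 80ms delay and the 100-250ms handle
durations are the guessed figures from above.

```python
def blocked_window_ms(long_handle_ms, lock_delay_ms):
    """Time during which new handles are locked out (toy model).

    Assumptions: the commit is requested at t=0, one long-running handle
    is already open and finishes at t=long_handle_ms, and the transaction
    is locked at t=lock_delay_ms. New handles are blocked from the moment
    of the lock until the long handle finishes.
    """
    return max(0, long_handle_ms - lock_delay_ms)


KJOURNALD_DELAY_MS = 80  # guessed scheduling delay before kjournald runs

for long_ms in (100, 250):
    # Lock only once kjournald gets scheduled (behavior without the patch).
    late = blocked_window_ms(long_ms, KJOURNALD_DELAY_MS)
    # Lock as soon as the commit is requested (behavior with the patch).
    early = blocked_window_ms(long_ms, 0)
    print(f"long handle {long_ms}ms: blocked {late}ms (late lock) "
          f"vs {early}ms (early lock)")
```

In this model the early lock always extends the window in which the
quick handles are shut out by the full scheduling delay (or by the
whole long handle, if that is shorter), which matches the direction of
the regression, though not necessarily its size.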
We still need to do some work to conclusively identify what these
"long pole" handles are. I strongly suspect they come from
truncate/unlink calls, but we need to confirm that. If in fact they
are coming from truncate/unlink calls, it might be possible to have
truncate processing stop its handle once it notices that a journal
commit has been requested, so we can lower the average amount of time
start_this_handle() gets blocked waiting for a commit to complete.
Thanks again to Barry for doing the benchmarking work! As always,
people who provide us with benchmarking support provide an absolutely
invaluable service.
- Ted