Message-ID: <20130204210204.GB12826@thunk.org>
Date:	Mon, 4 Feb 2013 16:02:04 -0500
From:	Theodore Ts'o <tytso@....edu>
To:	Ext4 Developers List <linux-ext4@...r.kernel.org>
Cc:	"Barry J. Marson" <bmarson@...hat.com>
Subject: Re: [PATCH 2/3] jbd2: commit as soon as possible after
 log_start_commit

On Thu, Jan 31, 2013 at 12:53:07PM -0500, Theodore Ts'o wrote:
> Once a transaction has been requested to be committed, don't let any
> other handles start under that transaction, and don't allow any
> pending transactions to be extended (i.e., in the case of
> unlink/ftruncate).
> 
> The idea is once the transaction has had log_start_commit() called on
> it, at least one thread is blocked waiting for that transaction to
> commit, and over time, more and more threads will end up getting
> blocked.  In order to avoid high variability in file system operations
> getting blocked behind a blocked start_this_handle(), we should
> try to get the commit started as soon as possible.

I'm going to drop this patch: performance measurements by Barry
Marson at Red Hat show that it apparently makes things worse by 17%
with AIM7.

At a guess, it looks like AIM7 has enough threads competing for
the CPU that it can take a good 80ms or more before kjournald can get
scheduled, and then start locking down the transaction.  During those
80ms, a number of short transactions get in and out and make forward
progress on the benchmark.  However, there are some "long pole"
handles that will take up to 100-250ms to complete, and during that
time, nothing else can get done.  By starting to lock down the
transaction as soon as the commit is requested, instead of waiting
until the kjournald thread can be scheduled, we end up limiting
forward progress made by the quick "in and out" handles, and this far
outweighs the benefits of stopping a long-lived handle from getting
started.
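
As a rough illustration of the argument above, here is a toy model of
how long new handles stay blocked under the two policies.  It is a
sketch, not a measurement: stall_ms() and its parameters are
hypothetical names, the 200ms figure is just picked from the 100-250ms
range mentioned above, and the time the commit itself takes is ignored.

```python
def stall_ms(lockdown_delay_ms, running_remaining_ms):
    """Time (ms) during which new handles are blocked.

    Toy model: the commit is requested at t=0, the transaction is locked
    down at t=lockdown_delay_ms, and new handles stay blocked until the
    last already-running handle has finished.
    """
    last_finish = max(running_remaining_ms, default=0)
    return max(0, last_finish - lockdown_delay_ms)


# Illustrative numbers only: one long-pole handle with 200 ms left to run.
long_pole = [200]
immediate = stall_ms(0, long_pole)   # lock down as soon as commit is requested
deferred = stall_ms(80, long_pole)   # lock down once kjournald gets the CPU
print(immediate, deferred)
```

With these assumed numbers the immediate lock-down blocks new handles
for 200ms, while the deferred one blocks them for 120ms and lets the
short handles run during the first 80ms.  The model deliberately leaves
out the case the patch was aimed at, namely a new long-lived handle
getting started during that 80ms window.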

We still need to do some work to conclusively identify what these
"long pole" handles are.  I strongly suspect they come from
truncate/unlink calls, but we need to make sure.  If in fact they are
coming from truncate/unlink calls, it might be possible to let
truncate processing stop its handle once it notices that a journal
commit has been requested, so we can lower the average amount of time
start_this_handle() gets blocked waiting for a commit to complete.
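
The idea of a truncate giving up its handle early could look roughly
like the following userspace sketch.  Everything in it is hypothetical
(ToyJournal, truncate_in_chunks, and the on_chunk hook are made up for
illustration and are not jbd2 interfaces), and it assumes the truncate
can be split into chunks that are each safe to commit on their own.

```python
class ToyJournal:
    """Stand-in for a journal; only tracks whether a commit was requested."""

    def __init__(self):
        self.commit_requested = False


def truncate_in_chunks(journal, total_blocks, chunk_blocks, on_chunk=None):
    """Free blocks chunk by chunk; return the blocks freed under each handle.

    After every chunk the loop checks whether a commit has been requested.
    If so, it stops the current handle and continues under a new one, so
    the commit does not have to wait for the whole truncate.
    """
    handles = []
    freed_in_handle = 0
    remaining = total_blocks
    while remaining > 0:
        step = min(chunk_blocks, remaining)
        remaining -= step
        freed_in_handle += step
        if on_chunk is not None:
            on_chunk(journal, remaining)  # test hook: may request a commit
        if journal.commit_requested and remaining > 0:
            handles.append(freed_in_handle)   # stop this handle early
            freed_in_handle = 0
            journal.commit_requested = False  # pretend the commit has run
    handles.append(freed_in_handle)
    return handles


def request_commit_at_60(journal, remaining):
    # Simulate another thread asking for a commit partway through.
    if remaining == 60:
        journal.commit_requested = True


print(truncate_in_chunks(ToyJournal(), 100, 10, request_commit_at_60))
```

In this sketch a 100-block truncate that sees a commit request with 60
blocks left ends up split across two handles instead of holding one
handle open for the whole operation.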

Thanks again to Barry for doing the benchmarking work!  As always,
people who provide us with benchmarking support provide an absolutely
invaluable service.

					- Ted