Message-ID: <20160705033824.GD15193@thunk.org>
Date:	Mon, 4 Jul 2016 23:38:24 -0400
From:	Theodore Ts'o <tytso@....edu>
To:	Jan Kara <jack@...e.cz>
Cc:	linux-ext4@...r.kernel.org, Eryu Guan <eguan@...hat.com>,
	stable@...r.kernel.org
Subject: Re: [PATCH 1/4] ext4: Fix deadlock during page writeback

On Mon, Jul 04, 2016 at 05:51:07PM +0200, Jan Kara wrote:
> On Mon 04-07-16 10:14:35, Ted Tso wrote:
> > This is what I'm currently testing; do you have objections to this?
> 
> Meh, I don't like it but it should work... Did you see any improvement with
> your change or are you just operating on the assumption that you want as
> few code while the handle is running as possible?

I haven't had a chance to try to benchmark it yet.  I've been working
at home over the long (US) holiday weekend, and the high core-count
servers I need are on the internal work network, and it's a pain to
access them from home.

I've just been tired of seeing the sort of analysis that can be found
in papers like:

https://www.usenix.org/system/files/conference/fast14/fast14-paper_kang.pdf

(And there's an ATC 2016 paper which shows that things haven't gotten
any better as well.)

Given that our massive lock bottlenecks come from the j_list_lock and
j_state_lock, and that most of the contention happens when we are
closing down a transaction for a commit, there is a pretty direct
correlation between handle lifetimes and the contention statistics on
the journal spinlocks.  Enough so that I've instrumented the handle
type and handle line number in the jbd2_handle_stats tracepoint, and
work to push down on the handle hold times has definitely helped our
contention numbers.
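
As a rough illustration of that kind of analysis, here is a minimal
sketch that groups handle hold times by call site (handle type plus
line number) from captured jbd2_handle_stats trace output.  The text
layout matched by LINE_RE ("type N line_no N interval N") is an
assumption about how the event is printed, and summarize() and
worst_sites() are hypothetical helper names, not part of any kernel
tooling.  The interval is reported in whatever units the tracepoint
emits.

```python
import re
from collections import defaultdict

# Assumed text layout of a jbd2_handle_stats event; adjust the pattern
# if the trace output on your kernel prints the fields differently.
LINE_RE = re.compile(
    r"jbd2_handle_stats:.*?\btype (?P<type>\d+) line_no (?P<line>\d+) "
    r"interval (?P<interval>\d+)"
)


def summarize(lines):
    """Aggregate hold times per (handle type, line number) call site.

    Returns a dict mapping (type, line_no) to (count, total, max).
    Lines that do not look like jbd2_handle_stats events are skipped.
    """
    stats = defaultdict(lambda: [0, 0, 0])  # count, total, max
    for text in lines:
        m = LINE_RE.search(text)
        if m is None:
            continue
        key = (int(m["type"]), int(m["line"]))
        interval = int(m["interval"])
        entry = stats[key]
        entry[0] += 1
        entry[1] += interval
        entry[2] = max(entry[2], interval)
    return {key: tuple(value) for key, value in stats.items()}


def worst_sites(summary, top=5):
    """Return the call sites with the largest total hold time first."""
    ranked = sorted(summary.items(), key=lambda kv: kv[1][1], reverse=True)
    return ranked[:top]
```

Feeding it the lines of a saved trace (for example, the contents of a
file read with readlines()) and passing the result to worst_sites()
gives a short list of the handle sites worth looking at first.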

So I do have experimental evidence that reducing code while the handle
is running does matter in general.  I just don't have it for this
specific case yet....

Cheers,

							- Ted
