Message-ID: <20130912190251.GB28067@thunk.org>
Date:	Thu, 12 Sep 2013 15:02:51 -0400
From:	Theodore Ts'o <tytso@....edu>
To:	Cuong Tran <cuonghuutran@...il.com>
Cc:	linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: Java Stop-the-World GC stall induced by FS flush or many large
 file deletions

Are you absolutely certain your JVM isn't attempting to write to any
files in its GC thread?  Say, to do some kind of logging?  It might be
worth stracing the JVM and correlating the GC stall with any syscalls
that might have been issued from the JVM GC thread.
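
One way to do that correlation, as a rough sketch rather than a
polished tool: capture a trace with something like
"strace -f -tt -T -o jvm.strace -p <jvm pid>" and then pick out the
slow syscalls per thread.  The file name, the 100 ms threshold, and
the helper name slow_syscalls are assumptions made up for this
example; the regex only handles completed calls and skips
"<unfinished ...>" lines.

```python
import re

# Matches completed calls in `strace -f -tt -T -o FILE` output, e.g.
#   1234 15:02:51.123456 write(3, "x", 1) = 1 <0.250000>
LINE_RE = re.compile(
    r"^(?P<tid>\d+)\s+(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<syscall>\w+)\(.*<(?P<secs>\d+\.\d+)>\s*$"
)


def slow_syscalls(lines, threshold=0.1):
    """Yield (tid, time, syscall, seconds) for calls taking >= threshold seconds."""
    for line in lines:
        m = LINE_RE.match(line)
        if m and float(m.group("secs")) >= threshold:
            yield (int(m.group("tid")), m.group("time"),
                   m.group("syscall"), float(m.group("secs")))


if __name__ == "__main__":
    # "jvm.strace" is the hypothetical output file from the command above.
    with open("jvm.strace") as f:
        for tid, when, name, secs in slow_syscalls(f):
            print(f"{when} tid={tid} {name} took {secs:.3f}s")
```

If the thread IDs that show up here match the GC threads and the
timestamps line up with the pauses in the GC log, that would point at
the GC thread blocking in the file system.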

Especially in the case of the FS Flush, the writeback thread isn't CPU
bound.  It will wait for the writeback to complete, but while it's
waiting, other processes or threads will be allowed to run on the CPU.

Now, if the GC thread tries to do some kind of fs operation which
requires writing to the file system, and the file system is trying to
start a jbd transaction commit, file system operations can block until
all of the jbd handles associated with the previous commit have
completed.  If your storage devices are slow, or you are using a
block cgroup to control how much I/O bandwidth a particular cgroup
can use, this can end up causing a priority inversion: a low
priority cgroup takes a while to complete its I/O, which stalls the
jbd commit, which in turn causes new ext4 operations to stall
waiting to start a new jbd handle.

So you could have a stall happening, if it's taking a long time for
commits to complete, but it might be completely unrelated to a GC
stall.

If you enable the jbd2_run_stats tracepoint, you can get some
interesting numbers about how long the various phases of the jbd2
commit are taking.
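
As a hedged sketch of how those numbers could be pulled out: the
tracepoint is normally enabled through debugfs (typically
/sys/kernel/debug/tracing/events/jbd2/jbd2_run_stats/enable) and read
from trace_pipe in the same tracing directory.  The parser below
assumes each event prints its fields as "name value" pairs after
"jbd2_run_stats:", with the phase times in milliseconds; the exact
field set varies by kernel version, so the sample line in the comment
is illustrative, and parse_run_stats is a name invented for this
example.

```python
def parse_run_stats(line):
    """Return the integer fields of a jbd2_run_stats trace line, or None.

    Assumed line shape (illustrative, not copied from a real trace):
      ... jbd2_run_stats: dev 8,3 tid 42 wait 0 running 5004 logging 120 ...
    """
    _, sep, payload = line.partition("jbd2_run_stats:")
    if not sep:
        return None  # some other trace event
    tokens = payload.split()
    # Fields come as "name value" pairs; "dev" is "major,minor", so it is skipped.
    return {name: int(value)
            for name, value in zip(tokens[0::2], tokens[1::2])
            if value.isdigit()}


if __name__ == "__main__":
    # Path assumes debugfs is mounted at /sys/kernel/debug; needs root.
    with open("/sys/kernel/debug/tracing/trace_pipe") as pipe:
        for line in pipe:
            stats = parse_run_stats(line)
            if stats:
                print(stats)
```

Commits whose "locked", "flushing" or "logging" values are large
around the time of a stall would be the ones worth looking at.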

              				- Ted