Message-ID: <20150615233519.GB30059@thunk.org>
Date:	Mon, 15 Jun 2015 19:35:19 -0400
From:	Theodore Ts'o <tytso@....edu>
To:	Tejun Heo <tj@...nel.org>
Cc:	Vivek Goyal <vgoyal@...hat.com>, axboe@...nel.dk,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	lizefan@...wei.com, cgroups@...r.kernel.org
Subject: Re: [PATCH 3/3] writeback, blkio: add documentation for cgroup
 writeback support

On Mon, Jun 15, 2015 at 02:23:45PM -0400, Tejun Heo wrote:
> 
> On ext2, there's nothing interlocking with anything else.  My
> understanding of ext4 is pretty limited, but as long as the journal
> head doesn't wrap around and get blocked on the slow one, it should be
> fine, so for most use cases, this shouldn't be a problem.

The writes to the journal in ext3/ext4 are done from the jbd/jbd2
kernel thread.  So writes to the journal shouldn't be a problem.  In
data=ordered mode, inodes that have blocks that were allocated during
the current transaction do have to have their data blocks written out,
and this is done by the jbd/jbd2 thread using filemap_fdatawait().

If this gets throttled because the blocks were originally dirtied by
some cgroup that didn't have much disk time quota, then all file system
activity will stall until the ordered mode writeback completes.  That
means any high-priority cgroup trying to execute a system call that
mutates file system state will block until the commit has gotten past
the initial setup stage, and so other system activity could sputter to
a halt --- at which point the commit will be allowed to complete, and
then all of the calls to ext4_journal_start() will unblock, and the
system will come back to life.  :-)

Because ext3 doesn't have delayed allocation, it will do orders of
magnitude more data=ordered block flushing, so this problem will be
far worse with ext3 compared to ext4.

So if there is some way we can signal to any cgroup that might be
throttling writeback or disk I/O that the jbd/jbd2 process should be
considered privileged, that would be good, since it would allow us to
avoid a potential priority inversion problem.
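
To put rough numbers on the inversion, here is a toy back-of-the-envelope
model in Python.  It is not kernel code and does not reflect how jbd2 or
the block layer actually schedule I/O; the function name, the cgroup
names, and all of the rates are made up for illustration.  It only
assumes that the commit has to wait for each cgroup's ordered-mode data,
and that the data is written at that cgroup's throttle limit unless the
journal thread is treated as privileged.

```python
MIB = 1024 * 1024


def ordered_flush_seconds(dirty_bytes, cgroup_limit_bps, device_bps,
                          journal_privileged):
    """Toy estimate of how long a commit waits for ordered-mode data.

    dirty_bytes:       {cgroup name: bytes that must be flushed before commit}
    cgroup_limit_bps:  {cgroup name: hypothetical throttle limit, bytes/sec}
    device_bps:        hypothetical unthrottled device bandwidth, bytes/sec
    journal_privileged: if True, the flush ignores per-cgroup limits
    """
    total = 0.0
    for cgroup, nbytes in dirty_bytes.items():
        if journal_privileged:
            rate = device_bps
        else:
            # A cgroup without an explicit limit runs at device speed.
            rate = min(cgroup_limit_bps.get(cgroup, device_bps), device_bps)
        total += nbytes / rate
    return total


# Hypothetical workload: two cgroups each dirtied 64 MiB in this transaction.
dirty = {"slow": 64 * MIB, "fast": 64 * MIB}
limits = {"slow": 1 * MIB}      # "slow" is throttled to 1 MiB/s
device = 128 * MIB              # device can do 128 MiB/s

throttled = ordered_flush_seconds(dirty, limits, device,
                                  journal_privileged=False)
privileged = ordered_flush_seconds(dirty, limits, device,
                                   journal_privileged=True)
print(throttled, privileged)
```

In this model every caller waiting on the commit, including the
unthrottled "fast" cgroup, is held up for the throttled figure rather
than the privileged one, which is the inversion in miniature.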

						- Ted