lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 14 Mar 2016 08:39:28 +0100
From:	Jan Kara <jack@...e.cz>
To:	Theodore Ts'o <tytso@....edu>
Cc:	Jan Kara <jack@...e.cz>,
	"HUANG Weller (CM/ESW12-CN)" <Weller.Huang@...bosch.com>,
	"linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>,
	"Li, Michael" <huayil@....qualcomm.com>
Subject: Re: ext4 out of order when use cfq scheduler

On Sat 12-03-16 23:27:23, Ted Tso wrote:
> On Thu, Jan 07, 2016 at 12:47:36PM +0100, Jan Kara wrote:
> > The problem is in all kernels starting with 3.8. Attached is a patch which
> > should fix the issue. Can you test whether it fixes the problem for you?
> 
> Sorry, I missed this patch because it was attached to an discussion
> thread.

I have actually sent this patch in a standalone thread on January 11
(http://lists.openwall.net/linux-ext4/2016/01/11/3) together with one more
cleanup.

> > The problem is that although for delayed allocated blocks we write their
> > contents immediately after allocating them, there is no guarantee that
> > the IO scheduler or device doesn't reorder things
> 
> I don't think that's the problem.  In the commit thread when we call
> blkdev_issue_flush() that acts as a barrier so the I/O scheduler won't
> reorder writes after that point, which is before we write the commit
> block.  Instead, I believe the problem is in ext4_writepages:
> 
> 		ext4_journal_stop(handle);
> 		/* Submit prepared bio */
> 		ext4_io_submit(&mpd.io_submit);
> 
> Once we release the handle, the commit can start --- *before* we have
> a chance to submit the I/O.   Oops.
> 
> I believe if we swap these two calls, it should fix the problem Huang
> was seeing.

No, that won't be enough. blkdev_issue_flush() is not guaranteed to do
anything to IOs which have not reported completion before
blkdev_issue_flush() was called. Specifically, CFQ will queue submitted bio
in its internal RB tree, following flush request completely bypasses this
tree and goes directly to the disk where it flushes caches. And only later
CFQ decides to schedule async writeback from the flusher thread which is
queued in the RB tree...

Note that the behavior has changed to be like this with the flushing
machinery rewrite. Before that, IO scheduler had to drain all the
outstanding IO requests (IO cache flush behaved like IO barrier). So your
patch would be enough with the old flushing machinery but is not enough
since 3.0 or so...

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ