Message-ID: <959E4E25EAEC544D31199E6F@nimrod.local>
Date:	Sun, 22 May 2011 20:11:08 +0100
From:	Alex Bligh <alex@...x.org.uk>
To:	linux-kernel@...r.kernel.org,
	Christoph Hellwig <hch@...radead.org>, Jan Kara <jack@...e.cz>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Andreas Dilger <adilger.kernel@...ger.ca>,
	Theodore Ts'o <tytso@....edu>
cc:	Alex Bligh <alex@...x.org.uk>
Subject: BUG: Failure to send REQ_FLUSH on unmount on ext3, ext4, and FS in
 general

I have been doing some testing to see which file systems successfully send
REQ_FLUSH after all writes to the file system in the case of an unmount.

Results so far:
 1. ext2 and ext3 (with default options) never send REQ_FLUSH
 2. ext3 (with barrier=1) and ext4 do send REQ_FLUSH but then
    send further writes afterwards.
 3. btrfs and xfs do things right (i.e. either end with a REQ_FLUSH in
    xfs's case, or a REQ_FLUSH and a REQ_FUA in btrfs's case)

So the first bug is that ext3 and ext4 appear to send writes (without a
subsequent flush/FUA) before an unmount, and thus will never fully
flush a write-behind cache. The trace at the end of this message shows
what this looks like.

But quite aside from the question of whether the FS supports barriers,
should the kernel itself (rather than the FS) not be sending REQ_FLUSH on
an unmount as the last thing that happens? I.e. shouldn't we see a flush
even on (say) ext2, which is never going to support barriers? If the kernel
itself generated a REQ_FLUSH for the block device, this would keep
filesystems that don't support barriers safe, provided the unmount
completed successfully, and would have no impact on ones that had already
flushed the write-behind cache.
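
To make the argument concrete, here is a toy model of a device with a
volatile write-behind cache. It is only a sketch: the class and function
names (WriteBehindDevice, unmount, kernel_final_flush) are hypothetical and
do not correspond to any kernel interface. It assumes that a flush makes
all earlier writes stable and that anything still in the cache at power
loss is gone.

```python
class WriteBehindDevice:
    """Toy block device with a volatile write-behind cache (hypothetical)."""

    def __init__(self):
        self.cache = {}   # offset -> data, lost on power failure
        self.stable = {}  # offset -> data, survives power failure

    def write(self, offset, data, fua=False):
        if fua:
            # FUA: this single write reaches stable storage before completing.
            self.stable[offset] = data
            self.cache.pop(offset, None)
        else:
            self.cache[offset] = data

    def flush(self):
        # REQ_FLUSH: everything written so far becomes stable.
        self.stable.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        # Return the offsets whose data never reached stable storage.
        lost = sorted(self.cache)
        self.cache.clear()
        return lost


def unmount(dev, fs_ops, kernel_final_flush):
    """Replay the filesystem's final requests, optionally followed by a
    flush issued by the kernel itself, then simulate a power cut."""
    for op in fs_ops:
        if op[0] == "write":
            dev.write(op[1], op[2])
        elif op[0] == "flush":
            dev.flush()
    if kernel_final_flush:
        dev.flush()
    return dev.power_loss()


# A filesystem that never flushes (ext2-like) and one that flushes but
# then keeps writing (the ext3/ext4 pattern described above).
ext2_like = [("write", 0x400, b"sb")]
ext3_like = [("write", 0x2529000, b"a"), ("flush",), ("write", 0x103a000, b"b")]

print(unmount(WriteBehindDevice(), ext2_like, kernel_final_flush=False))
print(unmount(WriteBehindDevice(), ext3_like, kernel_final_flush=False))
print(unmount(WriteBehindDevice(), ext3_like, kernel_final_flush=True))
```

In this model the final flush costs nothing when the cache is already
empty, which is the "no impact" case mentioned above.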

I have been using an instrumented version of nbd to test this (see
git.alex.org.uk). nbd in this instance is patched to support REQ_FLUSH
and REQ_FUA.

Trace from ext3 below (ext4 is similar)

-- 
Alex Bligh

> H=10ee1e1b0088ffff C=0x00000001 (NBD_CMD_WRITE+NONE) O=0000000002529000 L=00000400
> H=00d00b1f0088ffff C=0x00000001 (NBD_CMD_WRITE+NONE) O=0000000002531000 L=00000400
> H=082714110088ffff C=0x00000003 (NBD_CMD_FLUSH+NONE) O=0000000000000000 L=00000000
> H=68d10b1f0088ffff C=0x00000001 (NBD_CMD_WRITE+NONE) O=0000000002544400 L=00000400
> H=d0d20b1f0088ffff C=0x00000001 (NBD_CMD_WRITE+NONE) O=0000000002564400 L=00000400
> H=082714110088ffff C=0x00010001 (NBD_CMD_WRITE+ FUA) O=000000000112cc00 L=00000400
> H=d052c31a0088ffff C=0x00000001 (NBD_CMD_WRITE+NONE) O=000000000103a000 L=00000400
> H=d052c31a0088ffff C=0x00000001 (NBD_CMD_WRITE+NONE) O=000000000103a000 L=00000400
> H=d052c31a0088ffff C=0x00000001 (NBD_CMD_WRITE+NONE) O=000000000103a000 L=00000400
> H=d052c31a0088ffff C=0x00000001 (NBD_CMD_WRITE+NONE) O=000000000103a000 L=00000400
> H=d052c31a0088ffff C=0x00000001 (NBD_CMD_WRITE+NONE) O=000000000103a000 L=00000400
> H=d052c31a0088ffff C=0x00000001 (NBD_CMD_WRITE+NONE) O=000000000103a000 L=00000400
> H=d052c31a0088ffff C=0x00000001 (NBD_CMD_WRITE+NONE) O=0000000000000400 L=00000400
> H=88dcdd1b0088ffff C=0x00000002 ( NBD_CMD_DISC+NONE) O=fffffffffffffe00 L=00000000
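
For anyone who wants to check a trace like the one above mechanically,
here is a rough sketch of a parser. It is not part of the instrumented
nbd. The field layout is read off the trace annotations and is an
assumption: the low 16 bits of C are taken as the command (1 = WRITE,
2 = DISC, 3 = FLUSH) and bit 16 as the FUA flag. The function names
(parse_trace, unflushed_writes) are made up for illustration.

```python
import re

# Command numbers as they appear in the trace annotations (assumed mapping).
NBD_CMD_NAMES = {0: "READ", 1: "WRITE", 2: "DISC", 3: "FLUSH"}
FUA_FLAG = 0x0001  # flag bit found in the upper 16 bits of the C field

# \s+ also matches a newline, so records whose L= field was wrapped onto
# the next line by the mailer are still recognised.
RECORD = re.compile(
    r"H=([0-9a-f]+)\s+C=0x([0-9a-f]+)\s+\([^)]*\)\s+"
    r"O=([0-9a-f]+)\s+L=([0-9a-f]+)"
)


def parse_trace(text):
    """Turn the raw trace text into a list of request dictionaries."""
    records = []
    for m in RECORD.finditer(text):
        c = int(m.group(2), 16)
        records.append({
            "cmd": NBD_CMD_NAMES.get(c & 0xFFFF, "UNKNOWN"),
            "fua": bool((c >> 16) & FUA_FLAG),
            "offset": int(m.group(3), 16),
            "length": int(m.group(4), 16),
        })
    return records


def unflushed_writes(records):
    """Return the writes not covered by a later FLUSH and not sent with FUA,
    i.e. those that may still sit in a write-behind cache at the end."""
    pending = []
    for r in records:
        if r["cmd"] == "FLUSH":
            pending = []
        elif r["cmd"] == "WRITE" and not r["fua"]:
            pending.append(r)
    return pending
```

Feeding the trace text to parse_trace() and then unflushed_writes() lists
the plain writes that follow the last NBD_CMD_FLUSH. An empty result
would mean the sequence ended cleanly.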
