Message-ID: <20110518061356.GY19446@dastard>
Date:	Wed, 18 May 2011 16:13:56 +1000
From:	Dave Chinner <david@...morbit.com>
To:	Eric Sandeen <sandeen@...hat.com>
Cc:	Jiaying Zhang <jiayingz@...gle.com>, tytso@....edu,
	linux-ext4@...r.kernel.org
Subject: Re: [PATCH] ext4: use vmtruncate() instead of ext4_truncate() in
 ext4_setattr()

On Tue, May 17, 2011 at 10:19:05PM -0500, Eric Sandeen wrote:
> On 5/17/11 5:59 PM, Jiaying Zhang wrote:
> > There is a bug in commit c8d46e41 "ext4: Add flag to files with blocks
> > intentionally past EOF" that if we fallocate a file with FALLOC_FL_KEEP_SIZE
> > flag and then ftruncate the file to a size larger than the file's i_size,
> > any allocated but unwritten blocks will be freed but the file size is set
> > to the size that ftruncate specifies.
> > 
> > Here is a simple test to reproduce the problem:
> >   1. fallocate a 12k size file with KEEP_SIZE flag
> >   2. write the first 4k
> >   3. ftruncate the file to 8k
> > Then 'ls -l' shows that the i_size of the file becomes 8k but debugfs
> > shows the file has only the first written block left.
> 
> To be honest I'm not 100% certain what the filesystem -should- do in this case.
> 
> If I go through that same sequence on xfs, I get 4k written / 8k unwritten:
> 
> # xfs_bmap -vp testfile
> testfile:
>  EXT: FILE-OFFSET      BLOCK-RANGE            AG AG-OFFSET              TOTAL FLAGS
>    0: [0..7]:          2648750760..2648750767  3 (356066400..356066407)     8 00000
>    1: [8..23]:         2648750768..2648750783  3 (356066408..356066423)    16 10000

Ok, so that's the case for a _truncate up_ from 4k to 8k:

$ rm /mnt/test/foo
$ xfs_io -f -c "resvsp 0 12k" -c stat -c "bmap -vp" -c "pwrite 0 4k" -c "fsync" -c "bmap -vp" -c "t 8k" -c "bmap -vp" -c stat /mnt/test/foo
fd.path = "/mnt/test/foo"
fd.flags = non-sync,non-direct,read-write
stat.ino = 71
stat.type = regular file
stat.size = 0
stat.blocks = 24
fsxattr.xflags = 0x2 [-p------------]
fsxattr.projid = 0
fsxattr.extsize = 0
fsxattr.nextents = 1
fsxattr.naextents = 0
dioattr.mem = 0x200
dioattr.miniosz = 512
dioattr.maxiosz = 2147483136
/mnt/test/foo:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
   0: [0..23]:         9712..9735        0 (9712..9735)        24 10000
wrote 4096/4096 bytes at offset 0
4 KiB, 1 ops; 0.0000 sec (156 MiB/sec and 40000.0000 ops/sec)
/mnt/test/foo:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
   0: [0..7]:          9712..9719        0 (9712..9719)         8 00000
   1: [8..23]:         9720..9735        0 (9720..9735)        16 10000
/mnt/test/foo:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
   0: [0..7]:          9712..9719        0 (9712..9719)         8 00000
   1: [8..23]:         9720..9735        0 (9720..9735)        16 10000
fd.path = "/mnt/test/foo"
fd.flags = non-sync,non-direct,read-write
stat.ino = 71
stat.type = regular file
stat.size = 8192
stat.blocks = 24
fsxattr.xflags = 0x2 [-p------------]
fsxattr.projid = 0
fsxattr.extsize = 0
fsxattr.nextents = 2
fsxattr.naextents = 0
dioattr.mem = 0x200
dioattr.miniosz = 512
dioattr.maxiosz = 2147483136
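
For reading the numbers above: the bmap ranges and stat.blocks are both
counted in 512-byte units, so [0..7] is the first 4k and 24 blocks is the
full 12k reservation. A small Python sketch of that arithmetic (the helper
names are made up for illustration, they are not part of xfs_io):

```python
SECTOR = 512  # bytes per unit in the bmap ranges and in stat.blocks


def extent_to_bytes(first, last):
    """Convert an inclusive bmap range [first..last] to a half-open byte range."""
    return first * SECTOR, (last + 1) * SECTOR


def blocks_to_bytes(nblocks):
    """Convert a stat.blocks count to bytes."""
    return nblocks * SECTOR


print(extent_to_bytes(0, 7))    # written extent
print(extent_to_bytes(8, 23))   # unwritten extent
print(blocks_to_bytes(24))      # total space reserved
```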

But you get a different result on truncate down:

$ rm /mnt/test/foo
$ xfs_io -f -c "truncate 12k" -c "resvsp 0 12k" -c stat -c "bmap -vp" -c "pwrite 0 4k" -c "fsync" -c "bmap -vp" -c "t 8k" -c "bmap -vp" -c stat /mnt/test/foo
fd.path = "/mnt/test/foo"
fd.flags = non-sync,non-direct,read-write
stat.ino = 71
stat.type = regular file
stat.size = 12288
stat.blocks = 24
fsxattr.xflags = 0x2 [-p------------]
fsxattr.projid = 0
fsxattr.extsize = 0
fsxattr.nextents = 1
fsxattr.naextents = 0
dioattr.mem = 0x200
dioattr.miniosz = 512
dioattr.maxiosz = 2147483136
/mnt/test/foo:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
   0: [0..23]:         9584..9607        0 (9584..9607)        24 10000
wrote 4096/4096 bytes at offset 0
4 KiB, 1 ops; 0.0000 sec (217.014 MiB/sec and 55555.5556 ops/sec)
/mnt/test/foo:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
   0: [0..7]:          9584..9591        0 (9584..9591)         8 00000
   1: [8..23]:         9592..9607        0 (9592..9607)        16 10000
/mnt/test/foo:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
   0: [0..7]:          9584..9591        0 (9584..9591)         8 00000
   1: [8..15]:         9592..9599        0 (9592..9599)         8 10000
fd.path = "/mnt/test/foo"
fd.flags = non-sync,non-direct,read-write
stat.ino = 71
stat.type = regular file
stat.size = 8192
stat.blocks = 16
fsxattr.xflags = 0x2 [-p------------]
fsxattr.projid = 0
fsxattr.extsize = 0
fsxattr.nextents = 2
fsxattr.naextents = 0
dioattr.mem = 0x200
dioattr.miniosz = 512
dioattr.maxiosz = 2147483136

IOWs, on XFS a truncate up does not change the preallocation at all,
while a truncate down will _always_ remove preallocation beyond the
new EOF.  It's always had this behaviour w.r.t. truncate(2) and
preallocation beyond EOF.
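
As a rough illustration only, that rule can be written as a toy model in
Python. This is not XFS code: the extent tuples, the truncate() helper and
the "written"/"unwritten" labels are invented for the sketch, and it
assumes the new size is aligned to the 512-byte units that bmap prints.

```python
SECTOR = 512  # bmap ranges are in 512-byte units


def truncate(size, extents, new_size):
    """Toy model of the behaviour described above.

    extents is a list of (first, last, state) with inclusive sector ranges.
    Truncate up: extents are left alone. Truncate down: everything beyond
    the new EOF is removed. Assumes new_size is a multiple of SECTOR.
    """
    if new_size >= size:
        return new_size, list(extents)
    end = new_size // SECTOR  # first sector beyond the new EOF
    kept = []
    for first, last, state in extents:
        if first >= end:
            continue  # extent lies entirely beyond the new EOF
        kept.append((first, min(last, end - 1), state))
    return new_size, kept


layout = [(0, 7, "written"), (8, 23, "unwritten")]
print(truncate(4096, layout, 8192))   # truncate up from 4k
print(truncate(12288, layout, 8192))  # truncate down from 12k
```

The two calls correspond to the two xfs_io runs above: the first keeps
[8..23], the second trims it to [8..15].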

> I think this is a different result from ext4, either with or without your patch.
> 
> On ext4 I get size 8k, but only the first 4k mapped, as you say.
> 
> I don't recall when truncate is supposed to free fallocated blocks, and from what point?

It's entirely up to the filesystem how it treats blocks beyond EOF
during truncation. XFS frees them on truncate down, because it is
much safer to just truncate away everything beyond the new EOF than
to leave written extents beyond EOF as potential landmines.

Indeed, that's why calling vmtruncate() is a bad fix. If you have:


               NUUUUUUUUUUWWWWWWWWWOUUUUUUUUU
       ....----+----------+--------+--------+
               A          B        C        D

Where   A = new EOF (N)
        A->B = unwritten (U)
        B->C = written (W)
        C = old EOF (O)
        C->D = unwritten (U)

Then just calling vmtruncate() will leave the blocks in the range
B->C as written blocks. Hence a subsequent extending truncate back
out to D will expose stale data rather than zeros in the range
B->C.
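
A toy Python model of that sequence, purely as a sketch: the block
numbers, the dict layout and both truncate helpers are hypothetical, and
set_size_only() only models "the size changes but the blocks stay", which
is the effect described above, not what vmtruncate() does in full.

```python
ZERO = b"\x00"
A, B, C, D = 1, 3, 5, 7  # block offsets matching the diagram


def make_file():
    """Block 0 is data before A; A->B unwritten, B->C written, C->D unwritten."""
    blocks = {0: ("W", b"keep")}
    blocks.update({i: ("U", None) for i in range(A, B)})
    blocks.update({i: ("W", b"stale") for i in range(B, C)})
    blocks.update({i: ("U", None) for i in range(C, D)})
    return {"size": C, "blocks": blocks}  # old EOF is at C


def read(f, i):
    """Holes and unwritten blocks read back as zeros; written blocks return data."""
    if i >= f["size"]:
        raise ValueError("read beyond EOF")
    state, payload = f["blocks"].get(i, ("H", None))
    return payload if state == "W" else ZERO


def set_size_only(f, new_size):
    """Change the file size but leave every allocated block in place."""
    f["size"] = new_size


def truncate_freeing(f, new_size):
    """Change the file size and drop all blocks at or beyond a smaller new EOF."""
    if new_size < f["size"]:
        f["blocks"] = {i: b for i, b in f["blocks"].items() if i < new_size}
    f["size"] = new_size


bad = make_file()
set_size_only(bad, A)   # truncate down to A, blocks untouched
set_size_only(bad, D)   # extending truncate back out to D
print([read(bad, i) for i in range(B, C)])   # old contents are visible again

good = make_file()
truncate_freeing(good, A)
truncate_freeing(good, D)
print([read(good, i) for i in range(B, C)])  # reads as zeros
```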

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com