Date:	Fri, 28 Aug 2009 17:40:54 -0700
From:	Jiaying Zhang <jiayingz@...gle.com>
To:	Andreas Dilger <adilger@....com>
Cc:	Frank Mayhar <fmayhar@...gle.com>,
	Eric Sandeen <sandeen@...hat.com>,
	Curt Wohlgemuth <curtw@...gle.com>,
	ext4 development <linux-ext4@...r.kernel.org>
Subject: Re: Question on fallocate/ftruncate sequence

On Fri, Aug 28, 2009 at 3:14 PM, Andreas Dilger<adilger@....com> wrote:
> On Aug 28, 2009  14:44 -0700, Jiaying Zhang wrote:
>> On Fri, Aug 28, 2009 at 12:40 PM, Andreas Dilger<adilger@....com> wrote:
>> > This isn't really correct, however, because i_blocks also contains
>> > non-data blocks (indirect/index, EA, etc.), so even with small
>> > files with ACLs i_blocks may always be larger than ia_size >> 9, and
>> > for ext2/3 at least this will ALWAYS be true for files > 48kB in size.
>>
>> I see. I guess we need to use a special flag then. Or are there any
>> other suggestions? I also have another question related to this
>> problem. Why are those fallocated blocks not marked as preallocated
>> blocks that will then be automatically freed in ext4_release_file?
>
> Because fallocate() means "persistent allocation on disk", not "in memory
> preallocation".  The "in memory" preallocation already happens in ext4,
> and it is released when the inode is cleaned up.

Right. Thanks for pointing this out!
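
For reference, Andreas's earlier point about i_blocks can be checked with a
little arithmetic. The sketch below is only an illustration, not kernel code.
It assumes an ext2/3-style block map with 4 KiB blocks (12 direct pointers,
1024 pointers per indirect block), a file without holes or an EA block, and
a size below the triple-indirect range. All names in it are made up.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: 4 KiB blocks, 4-byte block pointers (hypothetical names). */
#define SKETCH_BLOCK_SIZE     4096u
#define SKETCH_DIRECT_BLOCKS  12u
#define SKETCH_PTRS_PER_BLOCK (SKETCH_BLOCK_SIZE / 4u)
#define SKETCH_SECTOR_SHIFT   9

/* Data blocks needed to hold `size` bytes, assuming no holes. */
static uint64_t data_blocks(uint64_t size)
{
	return (size + SKETCH_BLOCK_SIZE - 1) / SKETCH_BLOCK_SIZE;
}

/* Indirect blocks needed to map `nblocks` data blocks (up to double indirect). */
static uint64_t mapping_blocks(uint64_t nblocks)
{
	uint64_t meta = 0;

	if (nblocks <= SKETCH_DIRECT_BLOCKS)
		return 0;
	nblocks -= SKETCH_DIRECT_BLOCKS;
	meta += 1;			/* the single indirect block */
	if (nblocks <= SKETCH_PTRS_PER_BLOCK)
		return meta;
	nblocks -= SKETCH_PTRS_PER_BLOCK;
	/* one double indirect block plus one leaf per 1024 data blocks */
	meta += 1 + (nblocks + SKETCH_PTRS_PER_BLOCK - 1) / SKETCH_PTRS_PER_BLOCK;
	return meta;
}

/* i_blocks counts 512-byte sectors, data and mapping blocks alike. */
static uint64_t expected_i_blocks(uint64_t size)
{
	uint64_t blocks = data_blocks(size);

	return (blocks + mapping_blocks(blocks)) *
	       (SKETCH_BLOCK_SIZE >> SKETCH_SECTOR_SHIFT);
}

/* The size-based comparison: is i_blocks larger than size >> 9? */
static bool blocks_exceed_size(uint64_t size)
{
	return expected_i_blocks(size) > (size >> SKETCH_SECTOR_SHIFT);
}
```

Under these assumptions a block-aligned 48 KiB file has i_blocks equal to
size >> 9, while a 52 KiB file already exceeds it because of the single
indirect block, even though nothing was preallocated.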

RFC: here is a patch that Frank and I have been working on. It introduces
a new inode flag to mark that the file has blocks allocated beyond its EOF,
as discussed previously in this thread. The flag is cleared by the next
vmtruncate or by a fallocate without FALLOC_FL_KEEP_SIZE. A vmtruncate may
still be called unnecessarily when the file has since been written beyond
the allocated size, but I think it is ok to pay this cost to get correctness.
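
As a rough illustration of the intended flag lifecycle, here is a small
user-space model. It is not kernel code: struct model_inode, the model_*
functions, and MODEL_KEEPSIZE_FL are hypothetical stand-ins for the inode,
ext4_falloc_update_inode, inode_setattr/vmtruncate, and FS_KEEPSIZE_FL, and
block allocation is reduced to a single byte count.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_KEEPSIZE_FL 0x1u	/* hypothetical stand-in for FS_KEEPSIZE_FL */

struct model_inode {
	uint64_t size;		/* models i_size */
	uint64_t allocated;	/* end of the allocated range, in bytes */
	unsigned int flags;
};

/* Models the truncate path: clears the flag and frees blocks past new_size. */
static void model_truncate(struct model_inode *ino, uint64_t new_size)
{
	ino->flags &= ~MODEL_KEEPSIZE_FL;
	ino->size = new_size;
	if (ino->allocated > new_size)
		ino->allocated = new_size;
}

/* Models fallocate: keep_size leaves i_size alone and sets the flag. */
static void model_fallocate(struct model_inode *ino, uint64_t offset,
			    uint64_t len, bool keep_size)
{
	uint64_t end = offset + len;

	if (end > ino->allocated)
		ino->allocated = end;
	if (!keep_size) {
		if (end > ino->size)
			ino->size = end;
		ino->flags &= ~MODEL_KEEPSIZE_FL;
	} else {
		ino->flags |= MODEL_KEEPSIZE_FL;
	}
}

/*
 * Models the size check in the setattr path: truncate when the size
 * changes or when the flag says blocks may exist beyond EOF.
 * Returns true if a truncate was performed.
 */
static bool model_setattr_size(struct model_inode *ino, uint64_t new_size)
{
	if (new_size != ino->size || (ino->flags & MODEL_KEEPSIZE_FL)) {
		model_truncate(ino, new_size);
		return true;
	}
	return false;
}
```

In this model, a keep_size fallocate of 1 MiB on an empty file followed by
model_setattr_size(&ino, 0) performs the truncate and frees the range even
though the size is unchanged, and a second identical call is skipped.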

--- .pc/fallocate_keepsizse.patch/fs/attr.c	2009-08-28 15:38:46.000000000 -0700
+++ fs/attr.c	2009-08-28 17:01:04.000000000 -0700
@@ -68,7 +68,8 @@ int inode_setattr(struct inode * inode,
 	unsigned int ia_valid = attr->ia_valid;

 	if (ia_valid & ATTR_SIZE &&
-	    attr->ia_size != i_size_read(inode)) {
+	    (attr->ia_size != i_size_read(inode) ||
+	     (inode->i_flags & FS_KEEPSIZE_FL))) {
 		int error = vmtruncate(inode, attr->ia_size);
 		if (error)
 			return error;
--- .pc/fallocate_keepsizse.patch/fs/ext4/extents.c	2009-08-28 15:37:45.000000000 -0700
+++ fs/ext4/extents.c	2009-08-28 17:27:27.000000000 -0700
@@ -3095,7 +3095,13 @@ static void ext4_falloc_update_inode(str
 			i_size_write(inode, new_size);
 		if (new_size > EXT4_I(inode)->i_disksize)
 			ext4_update_i_disksize(inode, new_size);
+		inode->i_flags &= ~FS_KEEPSIZE_FL;
 	} else {
+		/*
+		 * Mark that we allocate beyond EOF so the subsequent truncate
+		 * can proceed even if the new size is the same as i_size.
+		 */
+		inode->i_flags |= FS_KEEPSIZE_FL;
 	}
 }

--- .pc/fallocate_keepsizse.patch/fs/ext4/inode.c	2009-08-16 14:19:38.000000000 -0700
+++ fs/ext4/inode.c	2009-08-28 16:59:42.000000000 -0700
@@ -3973,6 +3973,8 @@ void ext4_truncate(struct inode *inode)
 	if (!ext4_can_truncate(inode))
 		return;

+	inode->i_flags &= ~FS_KEEPSIZE_FL;
+
 	if (inode->i_size == 0 && !test_opt(inode->i_sb, NO_AUTO_DA_ALLOC))
 		ei->i_state |= EXT4_STATE_DA_ALLOC_CLOSE;

--- .pc/fallocate_keepsizse.patch/include/linux/fs.h	2009-08-28 15:44:27.000000000 -0700
+++ include/linux/fs.h	2009-08-28 17:00:47.000000000 -0700
@@ -343,6 +343,7 @@ struct inodes_stat_t {
 #define FS_TOPDIR_FL			0x00020000 /* Top of directory hierarchies*/
 #define FS_EXTENT_FL			0x00080000 /* Extents */
 #define FS_DIRECTIO_FL			0x00100000 /* Use direct i/o */
+#define FS_KEEPSIZE_FL			0x00200000 /* Blocks allocated beyond EOF */
 #define FS_RESERVED_FL			0x80000000 /* reserved for ext2 lib */

 #define FS_FL_USER_VISIBLE		0x0003DFFF /* User visible flags */

Jiaying

>
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>
>