linux-kernel - Re: [PATCH 2/2] xfs: add FALLOC_FL_ZERO

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20120613021610.GQ22848@dastard>
Date:	Wed, 13 Jun 2012 12:16:10 +1000
From:	Dave Chinner <david@...morbit.com>
To:	Paolo Bonzini <pbonzini@...hat.com>
Cc:	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	Dave Chinner <dchinner@...hat.com>, xfs@....sgi.com
Subject: Re: [PATCH 2/2] xfs: add FALLOC_FL_ZERO_RANGE to fallocate

On Tue, Jun 12, 2012 at 05:36:04PM +0200, Paolo Bonzini wrote:
> This is implemented in the same way as the other fallocate modes.  All of
> them map to XFS ioctls that are implemented by xfs_change_file_space.
> 
> However, we need to cap the length to the inode size if the user requested
> FALLOC_FL_KEEP_SIZE.

That's done on purpose.  fallocate() explicitly allows preallocation
beyond EOF and that's what the FALLOC_FL_KEEP_SIZE flag is for - to
allow both offset and offset+len to lie beyond the current inode
size and have the preallocation succeed without changing the file
size.

This is so that we can prevent fragmentation of slow growing
append-only files like log files - we can preallocate way beyond EOF
without changing EOF so as the file grows over days and weeks it
does not fragment.

Similarly, hole punch needs to be able to punch out such
preallocated extents beyond EOF if requested, and it definitely must
not change EOF. So capping/erroring out when offset/offset+len is
definitely the wrong thing to do when FALLOC_FL_KEEP_SIZE is set.

> Cc: Dave Chinner <dchinner@...hat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
> ---
>  fs/xfs/xfs_file.c |   36 ++++++++++++++++++++++++------------
>  1 files changed, 24 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 9f7ec15..c811cf9 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -818,33 +818,45 @@ xfs_file_fallocate(
>  	loff_t		new_size = 0;
>  	xfs_flock64_t	bf;
>  	xfs_inode_t	*ip = XFS_I(inode);
> -	int		cmd = XFS_IOC_RESVSP;
> +	int		cmd;
>  	int		attr_flags = XFS_ATTR_NOLOCK;
>  
> -	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
> +	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |
> +		     FALLOC_FL_ZERO_RANGE))
>  		return -EOPNOTSUPP;
>  
> +	BUG_ON((mode & FALLOC_FL_PUNCH_HOLE) && (mode & FALLOC_FL_ZERO_RANGE));

Never put BUG_ON() or BUG() in XFS code that can return an error.
Return EINVAL if we chose not to support it, and if it's really
something we consider bad, emit a warning to syslog (i.e.
xfs_warn()) and potentially add a ASSERT() case so that debug
kernels will trip over it. Nobody should be panicing a production
system just because a user supplied a set of incorrect syscall
paramters....

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/