linux-ext4 - Re: [PATCH 1/3] ext4: fix COLLAPSE_RANGE failure issue on ext4 with 1KB block size


Message-ID: <alpine.LFD.2.00.1404171017220.2143@localhost.localdomain>
Date:	Thu, 17 Apr 2014 10:25:56 +0200 (CEST)
From:	Lukáš Czerner <lczerner@...hat.com>
To:	Namjae Jeon <namjae.jeon@...sung.com>
cc:	"Theodore Ts'o" <tytso@....edu>,
	linux-ext4 <linux-ext4@...r.kernel.org>
Subject: Re: [PATCH 1/3] ext4: fix COLLAPSE_RANGE failure issue on ext4 with
 1KB block size

On Thu, 17 Apr 2014, Namjae Jeon wrote:

> Date: Thu, 17 Apr 2014 07:27:45 +0900
> From: Namjae Jeon <namjae.jeon@...sung.com>
> To: Theodore Ts'o <tytso@....edu>
> Cc: linux-ext4 <linux-ext4@...r.kernel.org>,
>     'Lukáš Czerner' <lczerner@...hat.com>
> Subject: [PATCH 1/3] ext4: fix COLLAPSE_RANGE failure issue on ext4 with 1KB
>     block size
> 
> From: Namjae Jeon <namjae.jeon@...sung.com>
> 
> When the filesystem is formatted with a 1KB or 2KB block size (smaller than
> PAGE_SIZE), xfstests generic/075 and generic/091 fail. The offset supplied to
> truncate_pagecache_range() is block size aligned. Inside that function the
> start offset is re-aligned to PAGE_SIZE by rounding up to the next page
> boundary. Because of this rounding up, old data remains in the page cache when
> the block size is less than the page size and the start offset is not page
> aligned. For collapse range, we need to align the start offset to a page
> boundary by rounding down instead of rounding up.

Great, thanks for finding it.
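
To make the arithmetic in the commit message concrete, here is a minimal
userspace sketch (not kernel code). It assumes a 4KB page and a 1KB block
size; SKETCH_PAGE_SIZE and the helper names are made up for illustration
and do not exist in the kernel.

```c
#include <stdint.h>

/* Assumption for this sketch: 4KB pages, as on most x86 setups. */
#define SKETCH_PAGE_SIZE 4096u

/* Round a byte offset down to the start of the page that contains it. */
static uint64_t sketch_page_round_down(uint64_t off)
{
	return off & ~((uint64_t)SKETCH_PAGE_SIZE - 1);
}

/* Round a byte offset up to the next page boundary (no-op if aligned). */
static uint64_t sketch_page_round_up(uint64_t off)
{
	return (off + SKETCH_PAGE_SIZE - 1) & ~((uint64_t)SKETCH_PAGE_SIZE - 1);
}

/*
 * Number of bytes between a block-aligned offset and the rounded-up page
 * boundary. This is the part of the partial page that a rounded-up start
 * does not cover.
 */
static uint64_t sketch_uncovered_bytes(uint64_t off)
{
	return sketch_page_round_up(off) - off;
}
```

For offset 5120 (aligned to a 1KB block but not to a page), rounding up
gives 8192, so 3072 bytes of the page starting at 4096 fall before the
rounded start. Rounding down gives 4096, which covers that whole page.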

> 
> Signed-off-by: Namjae Jeon <namjae.jeon@...sung.com>
> Signed-off-by: Ashish Sangwan <a.sangwan@...sung.com>
> ---
>  fs/ext4/extents.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 1bb3e4b..f386dd6 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -5404,8 +5404,8 @@ int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len)
>  	struct super_block *sb = inode->i_sb;
>  	ext4_lblk_t punch_start, punch_stop;
>  	handle_t *handle;
> -	unsigned int credits;
> -	loff_t new_size;
> +	unsigned int credits, rounding;
> +	loff_t new_size, ioffset;
>  	int ret;
>  
>  	/* Collapse range works only on fs block size aligned offsets. */
> @@ -5428,8 +5428,11 @@ int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len)
>  			return ret;
>  	}
>  
> +	rounding = max_t(uint, 1 << EXT4_BLOCK_SIZE_BITS(sb), PAGE_CACHE_SIZE);
> +	ioffset = offset & ~(rounding - 1);

That looks like you're expecting that the block size might be bigger
than the page size. That's definitely not the case at the moment, as we
cannot have block size > page size. There is a discussion about
supporting this in the future, but even when the infrastructure is done
we would have to revisit this code anyway. So I do not think this is
needed. Just always round it down to PAGE_SIZE (since that's what
truncate_pagecache is actually using).
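
A small userspace sketch of why the max_t() is redundant today, assuming
a 4KB page and block sizes of 1KB, 2KB and 4KB. The names below are
hypothetical stand-ins, not kernel helpers. Note that the sketch widens
the mask to 64 bits before inverting it: with a 32-bit unsigned
`rounding`, `~(rounding - 1)` is zero-extended when combined with a
64-bit offset, which would clear the upper 32 bits of the offset.

```c
#include <stdint.h>

/* Assumption for this sketch: 4KB pages. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Mirrors the shape of the patch: pick the larger of block size and page
 * size, then round the offset down to that granularity. blkbits plays the
 * role of EXT4_BLOCK_SIZE_BITS(sb).
 */
static uint64_t sketch_round_down_max(uint64_t offset, unsigned int blkbits)
{
	unsigned int blocksize = 1u << blkbits;
	unsigned int rounding = blocksize > SKETCH_PAGE_SIZE ?
				blocksize : SKETCH_PAGE_SIZE;

	/* Widen before inverting so the mask keeps the high offset bits. */
	return offset & ~((uint64_t)rounding - 1);
}

/* The simpler form: always round down to the page size. */
static uint64_t sketch_round_down_page(uint64_t offset)
{
	return offset & ~((uint64_t)SKETCH_PAGE_SIZE - 1);
}
```

As long as the block size cannot exceed the page size, both helpers
return the same value for every offset, so the simpler form is enough.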

> +
>  	/* Write out all dirty pages */
> -	ret = filemap_write_and_wait_range(inode->i_mapping, offset, -1);
> +	ret = filemap_write_and_wait_range(inode->i_mapping, ioffset, -1);
>  	if (ret)
>  		return ret;
>  
> @@ -5451,7 +5454,7 @@ int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len)
>  		goto out_mutex;
>  	}
>  
> -	truncate_pagecache_range(inode, offset, -1);
> +	truncate_pagecache_range(inode, ioffset, -1);

As I mentioned in a different email, we can just use
truncate_pagecache(). In fact, I think we should use it, because we
want to remove private COWed pages as well.

Thanks!
-Lukas

>  
>  	/* Wait for existing dio to complete */
>  	ext4_inode_block_unlocked_dio(inode);
>