lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20151130134537.GC4522@quack.suse.cz>
Date:	Mon, 30 Nov 2015 14:45:37 +0100
From:	Jan Kara <jack@...e.cz>
To:	Daeho Jeong <daeho.jeong@...sung.com>
Cc:	tytso@....edu, linux-ext4@...r.kernel.org
Subject: Re: [PATCH 1/3] ext4: handle unwritten or delalloc buffers before
 enabling per-file data journaling

On Wed 18-11-15 10:34:32, Daeho Jeong wrote:
> We already allocate delalloc blocks before changing the inode mode into
> "per-file data journal" mode to prevent delalloc blocks from remaining
> not allocated, but another issue concerned with "BH_Unwritten" status
> still exists. For example, by fallocate(), several buffers' status
> change into "BH_Unwritten", but these buffers cannot be processed by
> ext4_alloc_da_blocks(). So, they still remain in unwritten status after
> per-file data journaling is enabled and they cannot be changed into
> written status any more and, if they are journaled and eventually
> checkpointed, these unwritten buffer will cause a kernel panic by the
> below BUG_ON() function of submit_bh_wbc() when they are submitted
> during checkpointing.
> 
> static int submit_bh_wbc(int rw, struct buffer_head *bh,...
> {
>         ...
>         BUG_ON(buffer_unwritten(bh));
> 
> Moreover, when "dioread_nolock" option is enabled, the status of a
> buffer is changed into "BH_Unwritten" after write_begin() completes and
> the "BH_Unwritten" status will be cleared after I/O is done. Therefore,
> if a buffer's status is changed into unwrutten but the buffer's I/O is
> not submitted and completed, it can cause the same problem after
> enabling per-file data journaling. You can easily generate this bug by
> executing the following command.
> 
> ./kvm-xfstests -C 10000 -m nodelalloc,dioread_nolock generic/269
> 
> To resolve these problems and define a boundary between the previous
> mode and per-file data journaling mode, we need to flush and wait all
> the I/O of buffers of a file before enabling per-file data journaling
> of the file.


> 
> Signed-off-by: Daeho Jeong <daeho.jeong@...sung.com>
> ---
>  fs/ext4/inode.c |    9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 612fbcf..1f9458e 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5168,9 +5168,14 @@ int ext4_change_inode_journal_flag(struct inode *inode, int val)
>  	 * be allocated any more. even more truncate on delalloc blocks
>  	 * could trigger BUG by flushing delalloc blocks in journal.
>  	 * There is no delalloc block in non-journal data mode.
> +	 * We also have to handle unwritten buffers generated by
> +	 * fallocate() and dioread_nolock option. Once per-file data
> +	 * journaling is enabled, unwritten buffers will remain in
> +	 * unwritten status forever and they will be the seeds of
> +	 * kernel panic when they are checkpointed.

Can we maybe rephrase the whole comment like:

	/* 
	 * Before flushing the journal and switching inode's aops, we have
	 * to flush all dirty data the inode has. There can be outstanding
	 * delayed allocations, there can be unwritten extents created by
	 * fallocate or buffered writes in dioread_nolock mode covered by
	 * dirty data which can be converted only after flushing the dirty
	 * data (and journalled aops don't know how to handle these cases).
	 */

Otherwise the patch looks good to me. So after updating the comment you can
add:

Reviewed-by: Jan Kara <jack@...e.cz>

								Honza

> -	if (val && test_opt(inode->i_sb, DELALLOC)) {
> -		err = ext4_alloc_da_blocks(inode);
> +	if (val) {
> +		err = filemap_write_and_wait(inode->i_mapping);
>  		if (err < 0)
>  			return err;
>  	}
> -- 
> 1.7.9.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ