lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Sat, 16 Aug 2008 09:53:34 +0530
From:	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>
To:	Mingming Cao <cmm@...ibm.com>
Cc:	tytso <tytso@....edu>, linux-ext4@...r.kernel.org,
	Andreas Dilger <adilger@....com>
Subject: Re: [PATCH 5/6  V2]Ext4 journal credits  fixes for delalloc
	writepages

On Fri, Aug 15, 2008 at 05:40:58PM -0700, Mingming Cao wrote:
> Ext4: journal credit fix the delalloc writepages
> 
> From: Mingming Cao <cmm@...ibm.com>
> 
> Previous delalloc writepages implementation start a new transaction outside
> a loop call of get_block() to do the block allocation. Due to lack of
> information of how many blocks to be allocated, the estimate of the journal
> credits is very conservtive and caused many issues.
> 
> With the rewored delayed allocation, a new transaction is created for
> each get_block(), thus we don't need to guess how many credits for the multiple
> chunk of allocation. Start every transaction with credits for insert a
> single exent is enough. But we still need to consider the journalled mode,
> where it need to account for the number of data blocks.  So we guess
> max number of data blocks for each allocation.  Due to the current VFS
> implementation writepages() could only flush PAGEVEC lengh of pages at a
> time, the max block allocation is limited and calculated based on
> that, and the total number of reserved delalloc datablocks, whichever
> is smaller.


Need to update the comment. 


> 
> Signed-off-by: Mingming Cao <cmm@...ibm.com>
> ---
>  fs/ext4/inode.c |   42 +++++++++++++++++++++++++++---------------
>  1 file changed, 27 insertions(+), 15 deletions(-)
> 
> Index: linux-2.6.27-rc3/fs/ext4/inode.c
> ===================================================================
> --- linux-2.6.27-rc3.orig/fs/ext4/inode.c	2008-08-15 14:51:22.000000000 -0700
> +++ linux-2.6.27-rc3/fs/ext4/inode.c	2008-08-15 17:18:09.000000000 -0700
> @@ -1850,8 +1850,11 @@ static void mpage_add_bh_to_extent(struc
>  {
>  	struct buffer_head *lbh = &mpd->lbh;
>  	sector_t next;
> +	int nrblocks = lbh->b_size >> mpd->inode->i_blkbits;
> 
> -	next = lbh->b_blocknr + (lbh->b_size >> mpd->inode->i_blkbits);
> +	/* check if thereserved journal credits might overflow */
> +	if (nrblocks >EXT4_MAX_TRANS_DATA)
> +		goto flush_it;

Since we don't support data=journal I am not sure whether we should
limit nrblocks. Also limiting to EXT4_MAX_TRANS_DATA = 64 blocks
may give highly fragmented files. May be we can do this only
for non extent files because only for no extents files we are 
dependent on the number of blocks for calculating credits even
if we know that we are going to insert a contiguous chunk.

> 
>  	/*
>  	 * First block in the extent
> @@ -1863,6 +1866,7 @@ static void mpage_add_bh_to_extent(struc
>  		return;
>  	}
> 
> +	next = lbh->b_blocknr + nrblocks;
>  	/*
>  	 * Can we merge the block to our big extent?
>  	 */
> @@ -1871,6 +1875,7 @@ static void mpage_add_bh_to_extent(struc
>  		return;
>  	}
> 
> +flush_it:
>  	/*
>  	 * We couldn't merge the block to our extent, so we
>  	 * need to flush current  extent and start new one
> @@ -2231,17 +2236,26 @@ static int ext4_da_writepage(struct page
>  }
> 
>  /*
> - * For now just follow the DIO way to estimate the max credits
> - * needed to write out EXT4_MAX_WRITEBACK_PAGES.
> - * todo: need to calculate the max credits need for
> - * extent based files, currently the DIO credits is based on
> - * indirect-blocks mapping way.
> + * This is called via ext4_da_writepages() to
> + * calulate the total number of credits to reserve to fit
> + * a single extent allocation into a single transaction,
> + * ext4_da_writpeages() will loop calling this before
> + * the block allocation.
>   *
> - * Probably should have a generic way to calculate credits
> - * for DIO, writepages, and truncate
> + * The page vector size limited the max number of pages could
> + * be writeout at a time. Based on this, the max blocks to pass to
> + * get_block is calculated
>   */
> -#define EXT4_MAX_WRITEBACK_PAGES      DIO_MAX_BLOCKS
> -#define EXT4_MAX_WRITEBACK_CREDITS    25
> +
> +static int ext4_writepages_trans_blocks(struct inode *inode)
> +{
> +	int max_blocks = EXT4_MAX_TRANS_DATA;
> +
> +	if (max_blocks > EXT4_I(inode)->i_reserved_data_blocks)
> +		max_blocks =  EXT4_I(inode)->i_reserved_data_blocks;
> +
> +	return ext4_chunk_trans_blocks(inode, max_blocks);
> +}
> 
>  static int ext4_da_writepages(struct address_space *mapping,
>                                  struct writeback_control *wbc)
> @@ -2283,7 +2297,7 @@ restart_loop:
>  		 * by delalloc
>  		 */
>  		BUG_ON(ext4_should_journal_data(inode));
> -		needed_blocks = EXT4_DATA_TRANS_BLOCKS(inode->i_sb);
> +		needed_blocks = ext4_writepages_trans_blocks(inode);
> 
>  		/* start a new transaction*/
>  		handle = ext4_journal_start(inode, needed_blocks);
> @@ -4462,11 +4476,9 @@ int ext4_meta_trans_blocks(struct inode*
>   * the modification of a single pages into a single transaction,
>   * which may include multile chunk of block allocations.
>   *
> - * This could be called via ext4_write_begin() or later
> - * ext4_da_writepages() in delalyed allocation case.
> + * This could be called via ext4_write_begin()
>   *
> - * In both case it's possible that we could allocating multiple
> - * chunks of blocks. We need to consider the worse case, when
> + * We need to consider the worse case, when
>   * one new block per extent.
>   */
>  int ext4_writepage_trans_blocks(struct inode *inode)
> 
> 

-aneesh
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ