Message-Id: <1217417361.3373.15.camel@localhost>
Date:	Wed, 30 Jul 2008 13:29:21 +0200
From:	Frédéric Bohé <frederic.bohe@...l.net>
To:	Mingming Cao <cmm@...ibm.com>
Cc:	tytso <tytso@....edu>, Shehjar Tikoo <shehjart@....unsw.edu.au>,
	linux-ext4@...r.kernel.org,
	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
	Andreas Dilger <adilger@....com>
Subject: Re: [PATCH v3]Ext4: journal credits reservation fixes for DIO,
	fallocate and delalloc writepages

While running some performance tests on flex_bg, I ran bonnie++ on
2.6.27-rc1 plus the patch queue (which includes your journal credit fix)
and hit a very similar crash. Here are the details; I hope this helps:

kernel 2.6.27-rc1
patch queue snapshot:
ext4-patch-queue-25fb9834f3814b3aa567c5af090fba688a86eea9

With the latest e2fsprogs:
mkfs.ext4 -t ext4dev -b1024 -G256 /dev/sdb1 4G
mount -t ext4dev /dev/sdb1 /mnt/test
bonnie++ -u root -s 2g:256 -r 1024 -n 200 -d /mnt/test/

After a while, it ends up with:

kernel BUG at fs/jbd2/transaction.c:984!
invalid opcode: 0000 [#1] SMP 
Modules linked in: ext4dev jbd2 crc16 kvm_intel kvm megaraid_mbox megaraid_mm

Pid: 13965, comm: bonnie++ Not tainted (2.6.27-rc1 #3)
EIP: 0060:[<f8b186a6>] EFLAGS: 00010246 CPU: 4
EIP is at jbd2_journal_dirty_metadata+0xc6/0xd0 [jbd2]
EAX: 00000000 EBX: f0acc380 ECX: f0acc380 EDX: f0069f80
ESI: f3964700 EDI: f5daa1b0 EBP: f6dd7e00 ESP: f5949ebc
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process bonnie++ (pid: 13965, ti=f5948000 task=f5404ba0 task.ti=f5948000)
Stack: f7cb0100 f5daa1b0 f0acc380 f8b8ca12 f8b7ef62 f7cb0000 f68a5d00 f7cb0100
       00000000 f7183e00 f5daa1b0 f8b6a06e 00000040 f8b736db f7cb2134 f2c94238
       0000000b 00000000 00008000 00000000 f0acc380 f7cb0000 f08b2ac0 f2c942c8
Call Trace:
 [<f8b7ef62>] __ext4_journal_dirty_metadata+0x22/0x60 [ext4dev]
 [<f8b6a06e>] ext4_free_inode+0x26e/0x2f0 [ext4dev]
 [<f8b736db>] ext4_orphan_del+0xcb/0x180 [ext4dev]
 [<f8b6fb3c>] ext4_delete_inode+0x11c/0x140 [ext4dev]
 [<f8b6fa20>] ext4_delete_inode+0x0/0x140 [ext4dev]
 [<c018fe6a>] generic_delete_inode+0x5a/0xc0
 [<c018f4a4>] iput+0x44/0x50
 [<c0186271>] do_unlinkat+0xd1/0x150
 [<c017cdd6>] vfs_write+0x106/0x140
 [<c02aa7b0>] tty_write+0x0/0x1e0
 [<c017d2d1>] sys_write+0x41/0x70
 [<c0102fc9>] sysenter_do_call+0x12/0x25
 =======================
Code: 55 2c 8d 76 00 74 aa 0f 0b eb fe 0f 0b eb fe 8d b6 00 00 00 00 0f 0b eb fe f6 43 02 20 0f 84 5d ff ff ff f3 90 eb f2 0f 0b eb fe <0f> 0b eb fe 8d b6 00 00 00 00 55 57 56 53 89 d3 83 ec 10 89 44
EIP: [<f8b186a6>] jbd2_journal_dirty_metadata+0xc6/0xd0 [jbd2] SS:ESP 0068:f5949ebc
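
For readers less familiar with journal credits, a toy model may help. It is
only a sketch under stated assumptions: toy_handle, toy_buffer and
toy_dirty_metadata() are made-up names, not jbd2 code, and the trace alone
does not say which check sits at transaction.c:984. The idea it illustrates
is that a handle starts with a fixed number of credits, each metadata buffer
newly dirtied under it uses one, and running out is treated as fatal rather
than as a recoverable error.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a journal handle: only the credit count. */
struct toy_handle {
	int buffer_credits;	/* credits reserved when the handle was started */
};

/* Hypothetical stand-in for a journalled metadata buffer. */
struct toy_buffer {
	bool modified;		/* already charged to the running transaction? */
};

/*
 * Charge one credit the first time a buffer is dirtied under the handle.
 * Returns false at the point where a kernel-style assertion would fire,
 * i.e. when a new buffer must be charged but no credit is left.
 */
static bool toy_dirty_metadata(struct toy_handle *h, struct toy_buffer *b)
{
	assert(h && b);
	if (b->modified)
		return true;	/* re-dirtying the same buffer costs nothing */
	if (h->buffer_credits <= 0)
		return false;	/* the handle was under-reserved */
	b->modified = true;
	h->buffer_credits--;
	return true;
}
```

In this model a handle started with 2 credits can dirty two distinct buffers
any number of times, and fails on the third distinct buffer. That is why an
estimate that is too low in any one path can bring the whole journal down.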


Fred



On Tuesday, 29 July 2008 at 18:58 -0700, Mingming Cao wrote:
> Ext4: journal credits reservation fixes for DIO, fallocate and delalloc writepages
> 
> From: Mingming Cao <cmm@...ibm.com>
> 
> With delalloc, at writepages() time we need to reserve enough credits to start
> a new handle, so that the possibly multiple segments of block allocation done
> under a single call to mpage_da_writepages() fit their metadata updates into a
> single transaction. This patch does that by calculating the credits needed for
> write-out from the number of dirty pages, taking discontiguous block
> allocations into account. It covers both extent and non-extent files.
> 
> This patch also fixes the journal credit reservation for DIO. Currently the
> estimated credits for DIO are based only on the non-extent file format. That
> is not enough for mballoc to allocate a single extent on an extent-based file.
> This patch fixes that.
> 
> fallocate double-booked the credits for modifying the super block etc.; this
> patch fixes that.
> 
> This also fixes credit reservation in the migration and defrag code.
> 
> 
> Changes since v2:
> 
> 1) Fix a writepages() inefficiency. sync() invokes writepages()
> twice (not sure exactly why); the second time all the pages are clean, so
> it wastes CPU time walking through all the pages only to find they are not
> dirty. It is simple to work around by skipping writepages() if the
> mapping has no dirty pages.
> 
> 
> 2) The extent-based credit calculation is quite conservative. It always uses
> the max possible depth to estimate the credits needed to support extent
> insert/tree split. In fact the depth info for each inode is quite easy
> to get, so we can use more accurate info in the calculation.
> 
> 3) Limit the max number of pages that can be flushed at once from
> ext4_da_writepages(), so that the max possible transaction credits
> fit under the credits allowed for starting a new transaction. Reduce
> the number of pages to flush if necessary. Currently, with 4K page size
> and 4K block size, with an extent file, it's possible to flush about 1K
> pages under a single transaction.
> 
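
The page limit in item 3 can be sketched in user space as follows. This is a
model of the arithmetic in the ext4_da_writepages() hunk further down, not
kernel code: max_writeback_pages() and clamp_nr_to_write() are hypothetical
helper names, and journal_blocks stands in for journal->j_maxlen.

```c
#include <assert.h>

/*
 * Cap on pages per writepages pass, following the v3 arithmetic:
 *   max_credit_blocks   = journal length / 4
 *   max_writeback_pages = (max_credit_blocks / blocks_per_page)
 *                         / credits_per_page
 */
static int max_writeback_pages(int journal_blocks, int blocks_per_page,
			       int credits_per_page)
{
	assert(blocks_per_page > 0 && credits_per_page > 0);
	int max_credit_blocks = journal_blocks / 4;

	return (max_credit_blocks / blocks_per_page) / credits_per_page;
}

/* Shrink a requested page count down to the cap. */
static long clamp_nr_to_write(long nr_to_write, int cap)
{
	return nr_to_write > cap ? cap : nr_to_write;
}
```

The integer divisions round down, so the cap shrinks quickly as the
per-page credit estimate grows or as the block size drops below the page size.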
> 
> Verified with memory pressure case and umount case,
> 
> Signed-off-by: Mingming Cao <cmm@...ibm.com>
> ---
>  fs/ext4/ext4.h         |    4 -
>  fs/ext4/ext4_extents.h |    3 -
>  fs/ext4/ext4_jbd2.h    |   10 ++++
>  fs/ext4/extents.c      |   78 ++++++++++++++++++-------------
>  fs/ext4/inode.c        |  120 ++++++++++++++++++++++++++-----------------------
>  fs/ext4/migrate.c      |    6 +-
>  6 files changed, 129 insertions(+), 92 deletions(-)
> 
> Index: linux-2.6.26git6/fs/ext4/ext4.h
> ===================================================================
> --- linux-2.6.26git6.orig/fs/ext4/ext4.h	2008-07-28 22:47:22.000000000 -0700
> +++ linux-2.6.26git6/fs/ext4/ext4.h	2008-07-29 17:40:40.000000000 -0700
> @@ -1072,7 +1072,7 @@ extern void ext4_truncate (struct inode 
>  extern void ext4_set_inode_flags(struct inode *);
>  extern void ext4_get_inode_flags(struct ext4_inode_info *);
>  extern void ext4_set_aops(struct inode *inode);
> -extern int ext4_writepage_trans_blocks(struct inode *);
> +extern int ext4_writepages_trans_blocks(struct inode *, int nrpages);
>  extern int ext4_block_truncate_page(handle_t *handle,
>  		struct address_space *mapping, loff_t from);
>  extern int ext4_page_mkwrite(struct vm_area_struct *vma, struct page *page);
> @@ -1227,7 +1227,7 @@ extern const struct inode_operations ext
>  
>  /* extents.c */
>  extern int ext4_ext_tree_init(handle_t *handle, struct inode *);
> -extern int ext4_ext_writepage_trans_blocks(struct inode *, int);
> +extern int ext4_ext_writeblocks_trans_credits(struct inode *inode, int);
>  extern int ext4_ext_get_blocks(handle_t *handle, struct inode *inode,
>  			ext4_lblk_t iblock,
>  			unsigned long max_blocks, struct buffer_head *bh_result,
> Index: linux-2.6.26git6/fs/ext4/extents.c
> ===================================================================
> --- linux-2.6.26git6.orig/fs/ext4/extents.c	2008-07-28 22:53:20.000000000 -0700
> +++ linux-2.6.26git6/fs/ext4/extents.c	2008-07-29 17:40:50.000000000 -0700
> @@ -1747,34 +1747,43 @@ static int ext4_ext_rm_idx(handle_t *han
>  }
>  
>  /*
> - * ext4_ext_calc_credits_for_insert:
> - * This routine returns max. credits that the extent tree can consume.
> + * ext4_ext_calc_credits_for_single_extent:
> + * This routine returns max. credits that needed to insert an extent
> + * to the extent tree.
>   * It should be OK for low-performance paths like ->writepage()
>   * To allow many writing processes to fit into a single transaction,
> - * the caller should calculate credits under i_data_sem and
> - * pass the actual path.
> + * When pass the actual path, the caller should calculate credits
> + * under i_data_sem.
> + *
> + * For inserting a single extent, in the worse case extent tree depth is 5
> + * for old tree and new tree, for every level we need to reserve
> + * credits to log the bitmap and block group descriptors
> + *
> + * credit needed for the update of super block + inode block + quota files
> + * are not included here. The caller of this function need to take care of this.
>   */
> -int ext4_ext_calc_credits_for_insert(struct inode *inode,
> +int ext4_ext_calc_credits_for_single_extent(struct inode *inode,
>  						struct ext4_ext_path *path)
>  {
>  	int depth, needed;
>  
> +	depth = ext_depth(inode);
> +
>  	if (path) {
>  		/* probably there is space in leaf? */
> -		depth = ext_depth(inode);
>  		if (le16_to_cpu(path[depth].p_hdr->eh_entries)
>  				< le16_to_cpu(path[depth].p_hdr->eh_max))
> -			return 1;
> +			/* 1 for block bitmap, 1 for group descriptor */
> +			return 2;
>  	}
>  
> -	/*
> -	 * given 32-bit logical block (4294967296 blocks), max. tree
> -	 * can be 4 levels in depth -- 4 * 340^4 == 53453440000.
> -	 * Let's also add one more level for imbalance.
> -	 */
> -	depth = 5;
> +	/* add one more level in case of tree increase when insert a extent */
> +	depth += 1;
>  
> -	/* allocation of new data block(s) */
> +	/*
> +	 * bitmap blocks and group descriptor block for
> + 	 * allocation of new extent
> + 	 */
>  	needed = 2;
>  
>  	/*
> @@ -1791,9 +1800,6 @@ int ext4_ext_calc_credits_for_insert(str
>  	 */
>  	needed += (depth * 2) + (depth * 2);
>  
> -	/* any allocation modifies superblock */
> -	needed += 1;
> -
>  	return needed;
>  }
>  
> @@ -1917,9 +1923,7 @@ ext4_ext_rm_leaf(handle_t *handle, struc
>  			correct_index = 1;
>  			credits += (ext_depth(inode)) + 1;
>  		}
> -#ifdef CONFIG_QUOTA
>  		credits += 2 * EXT4_QUOTA_TRANS_BLOCKS(inode->i_sb);
> -#endif
>  
>  		err = ext4_ext_journal_restart(handle, credits);
>  		if (err)
> @@ -2801,8 +2805,8 @@ void ext4_ext_truncate(struct inode *ino
>  	/*
>  	 * probably first extent we're gonna free will be last in block
>  	 */
> -	err = ext4_writepage_trans_blocks(inode) + 3;
> -	handle = ext4_journal_start(inode, err);
> +	handle = ext4_journal_start(inode,
> +				    ext4_writepages_trans_blocks(inode, 1) + 3);
>  	if (IS_ERR(handle))
>  		return;
>  
> @@ -2855,22 +2859,32 @@ out_stop:
>  }
>  
>  /*
> - * ext4_ext_writepage_trans_blocks:
> + * ext4_ext_writeblocks_trans_credits:
>   * calculate max number of blocks we could modify
> - * in order to allocate new block for an inode
> + * in order to allocate the required number of new blocks
> + *
> + * In the worse case, one block per extent.
> + *
>   */
> -int ext4_ext_writepage_trans_blocks(struct inode *inode, int num)
> +int  ext4_ext_writeblocks_trans_credits(struct inode *inode, int nrblocks)
>  {
>  	int needed;
>  
> -	needed = ext4_ext_calc_credits_for_insert(inode, NULL);
> -
> -	/* caller wants to allocate num blocks, but note it includes sb */
> -	needed = needed * num - (num - 1);
> +	/* cost of adding a single extent:
> +	 * index blocks, leafs, bitmaps,
> +	 * groupdescp
> +	 */
> +	needed = ext4_ext_calc_credits_for_single_extent(inode, NULL);
> +	/*
> +	 * For data=journalled mode need to account for the data blocks
> +	 * Also need to add super block and inode block
> +	 */
> +	if (ext4_should_journal_data(inode))
> +		needed = nrblocks * (needed + 1)  + 2;
> +	else
> +		needed = nrblocks * needed  + 2;
>  
> -#ifdef CONFIG_QUOTA
>  	needed += 2 * EXT4_QUOTA_TRANS_BLOCKS(inode->i_sb);
> -#endif
>  
>  	return needed;
>  }
> @@ -2935,10 +2949,9 @@ long ext4_fallocate(struct inode *inode,
>  	max_blocks = (EXT4_BLOCK_ALIGN(len + offset, blkbits) >> blkbits)
>  							- block;
>  	/*
> -	 * credits to insert 1 extent into extent tree + buffers to be able to
> -	 * modify 1 super block, 1 block bitmap and 1 group descriptor.
> +	 * credits to insert 1 extent into extent tree
>  	 */
> -	credits = EXT4_DATA_TRANS_BLOCKS(inode->i_sb) + 3;
> +	credits = EXT4_DATA_TRANS_BLOCKS(inode->i_sb);
>  	mutex_lock(&inode->i_mutex);
>  retry:
>  	while (ret >= 0 && ret < max_blocks) {
> Index: linux-2.6.26git6/fs/ext4/inode.c
> ===================================================================
> --- linux-2.6.26git6.orig/fs/ext4/inode.c	2008-07-28 22:53:21.000000000 -0700
> +++ linux-2.6.26git6/fs/ext4/inode.c	2008-07-29 17:45:43.000000000 -0700
> @@ -1,5 +1,5 @@
>  /*
> - *  linux/fs/ext4/inode.c
> + * linux/fs/ext4/inode.c
>   *
>   * Copyright (C) 1992, 1993, 1994, 1995
>   * Remy Card (card@...i.ibp.fr)
> @@ -954,15 +954,6 @@ out:
>  
>  /* Maximum number of blocks we map for direct IO at once. */
>  #define DIO_MAX_BLOCKS 4096
> -/*
> - * Number of credits we need for writing DIO_MAX_BLOCKS:
> - * We need sb + group descriptor + bitmap + inode -> 4
> - * For B blocks with A block pointers per block we need:
> - * 1 (triple ind.) + (B/A/A + 2) (doubly ind.) + (B/A + 2) (indirect).
> - * If we plug in 4096 for B and 256 for A (for 1KB block size), we get 25.
> - */
> -#define DIO_CREDITS 25
> -
>  
>  /*
>   *
> @@ -1082,13 +1073,13 @@ static int ext4_get_block(struct inode *
>  	handle_t *handle = ext4_journal_current_handle();
>  	int ret = 0, started = 0;
>  	unsigned max_blocks = bh_result->b_size >> inode->i_blkbits;
> +	int dio_credits = EXT4_DATA_TRANS_BLOCKS(inode->i_sb);
>  
>  	if (create && !handle) {
>  		/* Direct IO write... */
>  		if (max_blocks > DIO_MAX_BLOCKS)
>  			max_blocks = DIO_MAX_BLOCKS;
> -		handle = ext4_journal_start(inode, DIO_CREDITS +
> -			      2 * EXT4_QUOTA_TRANS_BLOCKS(inode->i_sb));
> +		handle = ext4_journal_start(inode, dio_credits);
>  		if (IS_ERR(handle)) {
>  			ret = PTR_ERR(handle);
>  			goto out;
> @@ -1267,7 +1258,7 @@ static int ext4_write_begin(struct file 
>  				struct page **pagep, void **fsdata)
>  {
>   	struct inode *inode = mapping->host;
> -	int ret, needed_blocks = ext4_writepage_trans_blocks(inode);
> +	int ret, needed_blocks = ext4_writepages_trans_blocks(inode, 1);
>  	handle_t *handle;
>  	int retries = 0;
>   	struct page *page;
> @@ -2153,20 +2144,6 @@ static int ext4_da_writepage(struct page
>  
>  	return ret;
>  }
> -
> -/*
> - * For now just follow the DIO way to estimate the max credits
> - * needed to write out EXT4_MAX_WRITEBACK_PAGES.
> - * todo: need to calculate the max credits need for
> - * extent based files, currently the DIO credits is based on
> - * indirect-blocks mapping way.
> - *
> - * Probably should have a generic way to calculate credits
> - * for DIO, writepages, and truncate
> - */
> -#define EXT4_MAX_WRITEBACK_PAGES      DIO_MAX_BLOCKS
> -#define EXT4_MAX_WRITEBACK_CREDITS    DIO_CREDITS
> -
>  static int ext4_da_writepages(struct address_space *mapping,
>  				struct writeback_control *wbc)
>  {
> @@ -2176,22 +2153,24 @@ static int ext4_da_writepages(struct add
>  	int ret = 0;
>  	long to_write;
>  	loff_t range_start = 0;
> +	int blocks_per_page = PAGE_CACHE_SIZE >> inode->i_blkbits;
> +	int max_credit_blocks = ext4_journal_max_transaction_buffers(inode);
> +	int need_credits_per_page =  ext4_writepages_trans_blocks(inode, 1);
> +	int max_writeback_pages = (max_credit_blocks / blocks_per_page) / need_credits_per_page;
>  
>  	/*
>  	 * No pages to write? This is mainly a kludge to avoid starting
>  	 * a transaction for special inodes like journal inode on last iput()
>  	 * because that could violate lock ordering on umount
>  	 */
> -	if (!mapping->nrpages)
> +	if (!mapping->nrpages || !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
>  		return 0;
>  
> -	/*
> -	 * Estimate the worse case needed credits to write out
> -	 * EXT4_MAX_BUF_BLOCKS pages
> -	 */
> -	needed_blocks = EXT4_MAX_WRITEBACK_CREDITS;
> +	if (wbc->nr_to_write > mapping->nrpages)
> +		wbc->nr_to_write = mapping->nrpages;
>  
>  	to_write = wbc->nr_to_write;
> +
>  	if (!wbc->range_cyclic) {
>  		/*
>  		 * If range_cyclic is not set force range_cont
> @@ -2202,10 +2181,31 @@ static int ext4_da_writepages(struct add
>  	}
>  
>  	while (!ret && to_write) {
> +		/*
> +		 * set the max dirty pages could be write at a time
> +		 * to fit into the reserved transaction credits
> +		 */
> +		if (wbc->nr_to_write > max_writeback_pages)
> +			wbc->nr_to_write = max_writeback_pages;
> +
> +		/*
> +		 * Estimate the worse case needed credits to write out
> +		 * to_write pages
> +		 */
> +		needed_blocks = ext4_writepages_trans_blocks(inode,
> +							     wbc->nr_to_write);
> +		while (needed_blocks > max_credit_blocks) {
> +			wbc->nr_to_write --;
> +			needed_blocks = ext4_writepages_trans_blocks(inode,
> +							     wbc->nr_to_write);
> +		}
>  		/* start a new transaction*/
>  		handle = ext4_journal_start(inode, needed_blocks);
>  		if (IS_ERR(handle)) {
>  			ret = PTR_ERR(handle);
> +			printk(KERN_EMERG "%s: Not enough credits to flush %ld pages\n", __func__,
> +				wbc->nr_to_write);
> +			dump_stack();
>  			goto out_writepages;
>  		}
>  		if (ext4_should_order_data(inode)) {
> @@ -2221,12 +2221,6 @@ static int ext4_da_writepages(struct add
>  			}
>  
>  		}
> -		/*
> -		 * set the max dirty pages could be write at a time
> -		 * to fit into the reserved transaction credits
> -		 */
> -		if (wbc->nr_to_write > EXT4_MAX_WRITEBACK_PAGES)
> -			wbc->nr_to_write = EXT4_MAX_WRITEBACK_PAGES;
>  
>  		to_write -= wbc->nr_to_write;
>  		ret = mpage_da_writepages(mapping, wbc,
> @@ -2587,7 +2581,8 @@ static int __ext4_journalled_writepage(s
>  	 * references to buffers so we are safe */
>  	unlock_page(page);
>  
> -	handle = ext4_journal_start(inode, ext4_writepage_trans_blocks(inode));
> +	handle = ext4_journal_start(inode,
> +				    ext4_writepages_trans_blocks(inode, 1));
>  	if (IS_ERR(handle)) {
>  		ret = PTR_ERR(handle);
>  		goto out;
> @@ -4271,20 +4266,20 @@ int ext4_getattr(struct vfsmount *mnt, s
>  /*
>   * How many blocks doth make a writepage()?
>   *
> - * With N blocks per page, it may be:
> - * N data blocks
> + * With N blocks per page,  and P pages, it may be:
> + * N*P data blocks
>   * 2 indirect block
>   * 2 dindirect
>   * 1 tindirect
> - * N+5 bitmap blocks (from the above)
> - * N+5 group descriptor summary blocks
> + * N*P+5 bitmap blocks (from the above)
> + * N*P+5 group descriptor summary blocks
>   * 1 inode block
>   * 1 superblock.
>   * 2 * EXT4_SINGLEDATA_TRANS_BLOCKS for the quote files
>   *
> - * 3 * (N + 5) + 2 + 2 * EXT4_SINGLEDATA_TRANS_BLOCKS
> + * 3 * (N*P + 5) + 2 + 2 * EXT4_SINGLEDATA_TRANS_BLOCKS
>   *
> - * With ordered or writeback data it's the same, less the N data blocks.
> + * With ordered or writeback data it's the same, less the N*P data blocks.
>   *
>   * If the inode's direct blocks can hold an integral number of pages then a
>   * page cannot straddle two indirect blocks, and we can only touch one indirect
> @@ -4295,30 +4290,49 @@ int ext4_getattr(struct vfsmount *mnt, s
>   * block and work out the exact number of indirects which are touched.  Pah.
>   */
>  
> -int ext4_writepage_trans_blocks(struct inode *inode)
> +static int ext4_writeblocks_trans_credits_old(struct inode *inode, int nrblocks)
>  {
> -	int bpp = ext4_journal_blocks_per_page(inode);
> -	int indirects = (EXT4_NDIR_BLOCKS % bpp) ? 5 : 3;
> +	int indirects = (EXT4_NDIR_BLOCKS % nrblocks) ? 5 : 3;
>  	int ret;
>  
> -	if (EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL)
> -		return ext4_ext_writepage_trans_blocks(inode, bpp);
> -
>  	if (ext4_should_journal_data(inode))
> -		ret = 3 * (bpp + indirects) + 2;
> +		ret = 3 * (nrblocks + indirects) + 2;
>  	else
> -		ret = 2 * (bpp + indirects) + 2;
> +		ret = 2 * nrblocks + 3* indirects + 2;
>  
> -#ifdef CONFIG_QUOTA
>  	/* We know that structure was already allocated during DQUOT_INIT so
>  	 * we will be updating only the data blocks + inodes */
>  	ret += 2*EXT4_QUOTA_TRANS_BLOCKS(inode->i_sb);
> -#endif
>  
>  	return ret;
>  }
>  
>  /*
> + * Calulate the total number of credits to reserve to fit
> + * the modification of @num pages into a single transaction
> + *
> + * This could be called via ext4_write_begin() or later
> + * ext4_da_writepages() in delalyed allocation case.
> + *
> + * In both case it's possible that we could allocating multiple
> + * chunks of blocks. We need to consider the worse case, when
> + * one new block per extent.
> + *
> + * For Direct IO and fallocate, the journal credits reservation
> + * is based on one single extent allocation, so they could use
> + * EXT4_DATA_TRANS_BLOCKS to get the needed credit to log a single
> + * chunk of allocation needs.
> + */
> +int ext4_writepages_trans_blocks(struct inode *inode, int nrpages)
> +{
> +	int bpp = ext4_journal_blocks_per_page(inode);
> +	int nrblocks = nrpages * bpp;
> +
> +	if (!(EXT4_I(inode)->i_flags & EXT4_EXTENTS_FL))
> +		return ext4_writeblocks_trans_credits_old(inode, nrblocks);
> +	return ext4_ext_writeblocks_trans_credits(inode, nrblocks);
> +}
> +/*
>   * The caller must have previously called ext4_reserve_inode_write().
>   * Give this, we know that the caller already has write access to iloc->bh.
>   */
> Index: linux-2.6.26git6/fs/ext4/migrate.c
> ===================================================================
> --- linux-2.6.26git6.orig/fs/ext4/migrate.c	2008-07-13 14:51:29.000000000 -0700
> +++ linux-2.6.26git6/fs/ext4/migrate.c	2008-07-28 22:53:21.000000000 -0700
> @@ -52,9 +52,11 @@ static int finish_range(handle_t *handle
>  	 * Since we are doing this in loop we may accumalate extra
>  	 * credit. But below we try to not accumalate too much
>  	 * of them by restarting the journal.
> +	 *
> +	 * extra 4 credits for: 1 superblock, 1 inode block, 2 quotas
>  	 */
> -	needed = ext4_ext_calc_credits_for_insert(inode, path);
> -
> +	needed = ext4_ext_calc_credits_for_single_extent(inode, path) + 2
> +		 + 2 * EXT4_QUOTA_TRANS_BLOCKS(inode->i_sb);
>  	/*
>  	 * Make sure the credit we accumalated is not really high
>  	 */
> Index: linux-2.6.26git6/fs/ext4/ext4_extents.h
> ===================================================================
> --- linux-2.6.26git6.orig/fs/ext4/ext4_extents.h	2008-07-28 22:47:22.000000000 -0700
> +++ linux-2.6.26git6/fs/ext4/ext4_extents.h	2008-07-28 22:55:40.000000000 -0700
> @@ -216,7 +216,8 @@ extern int ext4_ext_calc_metadata_amount
>  extern ext4_fsblk_t idx_pblock(struct ext4_extent_idx *);
>  extern void ext4_ext_store_pblock(struct ext4_extent *, ext4_fsblk_t);
>  extern int ext4_extent_tree_init(handle_t *, struct inode *);
> -extern int ext4_ext_calc_credits_for_insert(struct inode *, struct ext4_ext_path *);
> +extern int ext4_ext_calc_credits_for_single_extent(struct inode *inode,
> +						   struct ext4_ext_path *path);
>  extern int ext4_ext_try_to_merge(struct inode *inode,
>  				 struct ext4_ext_path *path,
>  				 struct ext4_extent *);
> Index: linux-2.6.26git6/fs/ext4/ext4_jbd2.h
> ===================================================================
> --- linux-2.6.26git6.orig/fs/ext4/ext4_jbd2.h	2008-07-28 22:47:22.000000000 -0700
> +++ linux-2.6.26git6/fs/ext4/ext4_jbd2.h	2008-07-28 22:53:21.000000000 -0700
> @@ -231,4 +231,14 @@ static inline int ext4_should_writeback_
>  	return 0;
>  }
>  
> +static inline int ext4_journal_max_transaction_buffers(struct inode *inode)
> +{
> +	/*
> +	 * max transaction buffers
> + 	 * calculation based on
> + 	 * journal->j_max_transaction_buffers = journal->j_maxlen / 4;
> + 	 */
> +        return (EXT4_JOURNAL(inode))->j_maxlen / 4;
> +}
> +
>  #endif	/* _EXT4_JBD2_H */
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
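
To make the extent credit estimate in the patch above concrete, here is a
small user-space model of the two functions as posted. It is a sketch, not
the kernel code: single_extent_credits() and ext_writeblocks_credits() are
hypothetical names, tree_depth stands in for ext_depth(inode), and
quota_blocks stands in for whatever EXT4_QUOTA_TRANS_BLOCKS() returns.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of ext4_ext_calc_credits_for_single_extent() from the v3 patch.
 * leaf_has_room models the "path given and leaf not full" shortcut.
 */
static int single_extent_credits(int tree_depth, bool leaf_has_room)
{
	assert(tree_depth >= 0);
	if (leaf_has_room)
		return 2;		/* block bitmap + group descriptor */

	int depth = tree_depth + 1;	/* the tree may grow by one level */
	int needed = 2;			/* bitmap + group descriptor, new extent */

	needed += (depth * 2) + (depth * 2);
	return needed;
}

/*
 * Model of ext4_ext_writeblocks_trans_credits(): worst case is one
 * extent per block, plus super block and inode block, plus quota.
 */
static int ext_writeblocks_credits(int tree_depth, int nrblocks,
				   bool journal_data, int quota_blocks)
{
	int needed = single_extent_credits(tree_depth, false);

	if (journal_data)
		needed = nrblocks * (needed + 1) + 2;	/* data blocks too */
	else
		needed = nrblocks * needed + 2;

	return needed + 2 * quota_blocks;
}
```

With the 1K block size from the mkfs line above and assuming 4K pages, one
page is 4 blocks. For a depth-0 tree with no quota contribution, the model
gives 4 * 6 + 2 = 26 credits per page in ordered or writeback mode.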
