Message-id: <20080820234208.GO3392@webber.adilger.int>
Date: Wed, 20 Aug 2008 17:42:08 -0600
From: Andreas Dilger <adilger@....com>
To: Mingming Cao <cmm@...ibm.com>
Cc: Theodore Tso <tytso@....edu>,
"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
ext4 development <linux-ext4@...r.kernel.org>
Subject: Re: ENOSPC returned during writepages

On Aug 20, 2008 16:22 -0700, Mingming Cao wrote:
> ext4: fall back to non delalloc mode if filesystem is almost full
> From: Mingming Cao <cmm@...ibm.com>
>
> In the case of filesystem is close to full (free blocks is below
> the watermark NRCPUS *4) and there is not enough to reserve blocks for
> delayed allocation, instead of return user back with ENOSPC error, with
> this patch, it tries to fall back to non delayed allocation mode.

I don't think a low watermark of only 4 blocks per CPU is enough,
because each of the per-CPU counters could be off by as much as FBC_BATCH.
I think dropping delalloc support earlier is safer, at something like
(FBC_BATCH * NR_CPUS) free blocks.
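
To illustrate the drift, here is a minimal userspace sketch of a batched
per-CPU counter. It is not kernel code: every model_* name is made up for
this example, and MODEL_BATCH and MODEL_NR_CPUS are assumed values standing
in for FBC_BATCH and NR_CPUS. The assumed behavior is that each CPU keeps a
local delta and folds it into the shared count only once it reaches the
batch size, so a cheap read can miss almost a full batch per CPU.

```c
#include <assert.h>

/* Userspace model only; every name here is hypothetical, not kernel code. */
#define MODEL_NR_CPUS 4    /* assumed CPU count */
#define MODEL_BATCH   32   /* stands in for FBC_BATCH; value assumed */

struct model_counter {
    long global;                  /* shared, approximate count */
    long local[MODEL_NR_CPUS];    /* per-CPU deltas not yet folded in */
};

/* Apply a delta on one CPU; fold into the shared count only once the
 * local delta reaches the batch size in either direction. */
static void model_add(struct model_counter *c, int cpu, long delta)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    c->local[cpu] += delta;
    if (c->local[cpu] >= MODEL_BATCH || c->local[cpu] <= -MODEL_BATCH) {
        c->global += c->local[cpu];
        c->local[cpu] = 0;
    }
}

/* Cheap read: ignores the deltas still sitting on each CPU. */
static long model_read(const struct model_counter *c)
{
    return c->global;
}

/* Exact read: adds every per-CPU delta to the shared count. */
static long model_sum(const struct model_counter *c)
{
    long sum = c->global;
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        sum += c->local[cpu];
    return sum;
}

/* Watermark large enough to cover the worst-case drift of a cheap read. */
static long model_watermark(void)
{
    return (long)MODEL_BATCH * MODEL_NR_CPUS;
}
```

If each CPU in this model takes MODEL_BATCH - 1 blocks, model_read() still
reports the starting value while model_sum() is lower by
MODEL_NR_CPUS * (MODEL_BATCH - 1). That gap is larger than 4 * MODEL_NR_CPUS
but stays below model_watermark().
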
> +static int ext4_write_begin_nondelalloc(struct file *file,
> + struct address_space *mapping,
> + loff_t pos, unsigned len, unsigned flags,
> + struct page **pagep, void **fsdata)
> +{
> + struct inode *inode = mapping->host;
> +
> + /* turn off delalloc for this inode*/
> + ext4_set_aops(inode, 0);
> +
> + return mapping->a_ops->write_begin(file, mapping, pos, len,
> + flags, pagep, fsdata);
> +}

Hmm, I don't understand this - isn't delalloc already off here, because
this is "ext4_write_begin_nondelalloc()"?

> +void ext4_set_aops(struct inode *inode, int delalloc)
> {
> + if (test_opt(inode->i_sb, DELALLOC)) {
> + if (ext4_has_free_blocks(EXT4_SB(inode->i_sb),
> + EXT4_MIN_FREE_BLOCKS) > EXT4_MIN_FREE_BLOCKS)
> + delalloc = 0;
> +
> + if (delalloc) {
> + inode->i_mapping->a_ops = &ext4_da_aops;
> + return;
> + } else
> + printk(KERN_INFO "filesystem is close to full, "
> + "delayed allocation is turned off for "
> + " inode %lu\n", inode->i_ino);
> + }

Also, if you are doing this by changing the aops on the inode, isn't
it possible that a large write starts while free space is still above the
EXT4_MIN_FREE_BLOCKS threshold and then still runs out of space, because
the aops are never changed during the write?
Would it be better instead to do the check at the start of
ext4_da_write_begin() and, if it fails, call the non-delalloc
write_begin from there?
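
A rough userspace sketch of that per-call check follows. It is not ext4
code: the model_* functions, struct model_fs, and the value of
MODEL_MIN_FREE_BLOCKS are assumptions made for illustration. It only shows
the control flow of re-checking the watermark on every write_begin call
and handing off to the non-delalloc path when the check fails.

```c
#include <assert.h>
#include <errno.h>

/* Userspace model only; every name here is hypothetical, not ext4 code. */
#define MODEL_MIN_FREE_BLOCKS 128  /* stands in for EXT4_MIN_FREE_BLOCKS; value assumed */

enum model_path { MODEL_DELALLOC, MODEL_NONDELALLOC };

struct model_fs {
    long free_blocks;      /* blocks not yet allocated on disk */
    long reserved_blocks;  /* blocks promised to pending delalloc writes */
};

/* Non-delalloc path: allocate right away, or report ENOSPC to the caller. */
static int model_write_begin_nondelalloc(struct model_fs *fs, long nblocks,
                                         enum model_path *path)
{
    assert(nblocks > 0);
    *path = MODEL_NONDELALLOC;
    if (fs->free_blocks - fs->reserved_blocks < nblocks)
        return -ENOSPC;
    fs->free_blocks -= nblocks;
    return 0;
}

/* Delalloc path: the watermark is re-checked on every call, so a long
 * series of writes falls back as soon as space gets tight. */
static int model_da_write_begin(struct model_fs *fs, long nblocks,
                                enum model_path *path)
{
    long unreserved = fs->free_blocks - fs->reserved_blocks;

    assert(nblocks > 0);
    if (unreserved - nblocks < MODEL_MIN_FREE_BLOCKS)
        return model_write_begin_nondelalloc(fs, nblocks, path);

    /* Enough headroom: only reserve now, allocate at writeback time. */
    fs->reserved_blocks += nblocks;
    *path = MODEL_DELALLOC;
    return 0;
}
```

In this model a write that begins with plenty of headroom switches to
immediate allocation partway through, and ENOSPC is returned from
write_begin, where the caller can still see it, rather than later during
writepages.
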
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.