Message-ID: <6601abe90909230927m6d45cd75wef3525fc23837110@mail.gmail.com>
Date: Wed, 23 Sep 2009 09:27:11 -0700
From: Curt Wohlgemuth <curtw@...gle.com>
To: ext4 development <linux-ext4@...r.kernel.org>
Subject: ext4 inode corruption
We've been seeing sporadic inode corruption on our ext4 partitions which
we've been trying to analyze, without much success. I'm wondering if
anybody might have some clues as to where things might be going wrong.
We find out about the corruption via a BUG firing in ext4_ext_get_blocks():
	/*
	 * consistent leaf must not be empty;
	 * this situation is possible, though, _during_ tree modification;
	 * this is why assert can't be put in ext4_ext_find_extent()
	 */
	BUG_ON(path[depth].p_ext == NULL && depth != 0);
Of course, this fires long after the inode in question was corrupted. With
some diagnostics added in front of this BUG_ON, we can find the affected
inodes; they all have characteristics like this:
Output from debugfs' stat command:
Inode: 1195575 Type: regular Mode: 0600 Flags: 0x80000
Generation: 2821101782 Version: 0x00000001
User: 35800 Group: 5000 Size: 8400896
File ACL: 0 Directory ACL: 0
Links: 1 Blockcount: 8
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x4a9f8009 -- Thu Sep 3 01:36:25 2009
atime: 0x4a9f7ff7 -- Thu Sep 3 01:36:07 2009
mtime: 0x4a9f8009 -- Thu Sep 3 01:36:25 2009
EXTENTS:
Note that no data blocks are printed out here.
Following the actual extent tree, it always looks like this:
in-inode extent header:
    eh_magic:   0xf30a
    eh_entries: 1
    eh_max:     4
    eh_depth:   1
in-inode extent index 0:
    ei_block:   0
    ei_leaf_lo: 36738577
    ei_leaf_hi: 0
leaf node header (at block 36738577):
    eh_magic:   0xf30a
    eh_entries: 0
    eh_max:     340
    eh_depth:   0
The i_size value of the inode will vary, from 8192 to 8400896. But the
i_blocks value is *always* 8.
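As a sanity check on the units, here is a small sketch of the arithmetic. It assumes i_blocks is counted in 512-byte sectors and that the file system uses 4096-byte blocks (an assumption, though it is consistent with eh_max: 340 in the leaf header above). The helper names are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE 512u /* assumed unit of i_blocks */

/* Convert an i_blocks count (512-byte units) to whole file system blocks. */
static uint64_t iblocks_to_fs_blocks(uint64_t i_blocks, uint32_t fs_block_size)
{
	assert(fs_block_size >= SECTOR_SIZE && fs_block_size % SECTOR_SIZE == 0);
	return i_blocks / (fs_block_size / SECTOR_SIZE);
}

/* Number of file system blocks needed to back i_size bytes. */
static uint64_t size_to_fs_blocks(uint64_t i_size, uint32_t fs_block_size)
{
	assert(fs_block_size != 0);
	return (i_size + fs_block_size - 1) / fs_block_size;
}
```

Under those assumptions, iblocks_to_fs_blocks(8, 4096) is 1, while size_to_fs_blocks(8400896, 4096) is 2051. So the inode is charged for a single block, which may simply be the extent leaf block itself, though that is a guess.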
The extent tree always has depth of 1 in the in-inode header, and a valid
leaf node header; but the leaf node header always has 0 entries. This is
what's causing the BUG above to fire.
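The shape described above can be written down as a small user-space check. This is only a sketch: the struct mirrors the field names printed above but is not the kernel's definition, the values are assumed to be already converted to host byte order, and the function names are hypothetical. It relies on the extent header and each extent or index entry being 12 bytes on disk, which gives the eh_max values seen here (4 in the 60-byte in-inode area, 340 in a 4096-byte block).

```c
#include <assert.h>
#include <stdint.h>

#define EXT_MAGIC      0xf30a /* eh_magic value shown in the dumps */
#define EXT_ENTRY_SIZE 12u    /* size of the header and of each entry */

/* Host-endian copy of an extent header; illustrative only. */
struct ext_hdr {
	uint16_t eh_magic;
	uint16_t eh_entries;
	uint16_t eh_max;
	uint16_t eh_depth;
	uint32_t eh_generation;
};

/* How many entries fit after the header in an area of area_bytes. */
static unsigned ext_capacity(unsigned area_bytes)
{
	assert(area_bytes >= EXT_ENTRY_SIZE);
	return (area_bytes - EXT_ENTRY_SIZE) / EXT_ENTRY_SIZE;
}

/*
 * Return 1 if root/leaf match the corrupted shape from the report:
 * a depth-1 root with at least one index entry whose leaf is valid
 * but holds zero extents.
 */
static int is_empty_nonroot_leaf(const struct ext_hdr *root,
				 const struct ext_hdr *leaf)
{
	return root->eh_magic == EXT_MAGIC &&
	       leaf->eh_magic == EXT_MAGIC &&
	       root->eh_depth == 1 && root->eh_entries >= 1 &&
	       leaf->eh_depth == 0 && leaf->eh_entries == 0;
}
```

A scanner built on something like this could flag suspect inodes before the BUG_ON is hit, instead of finding them only when the kernel trips over them.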
We believe the general pattern of user space calls to create these files is
something like this:
open(O_DIRECT)
fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, 8400896)
< various writes to the file >
fallocate(fd, 0, 0, actual_size + BLOCK_SIZE)
ftruncate(fd, actual_size)
The second fallocate() call, without KEEP_SIZE, allows the subsequent
ftruncate() to actually truncate the file; this is a known issue recently
fixed by Jiaying Zhang (her fix is not in our kernel yet). "actual_size"
can be 0 at times.
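For anyone who wants to try this, the sequence can be sketched in C roughly as follows. This is an illustration of the suspected pattern only, not a known reproducer. The function names, the 4096-byte block size, and the simple sequential writes standing in for "various writes" are all assumptions.

```c
#define _GNU_SOURCE /* for fallocate() and O_DIRECT */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define PREALLOC_SIZE 8400896L /* value from the report */
#define FS_BLOCK_SIZE 4096L    /* assumed file system block size */

/* Round n up to a block multiple; O_DIRECT writes need aligned lengths. */
static off_t round_up_block(off_t n)
{
	assert(n >= 0);
	return (n + FS_BLOCK_SIZE - 1) / FS_BLOCK_SIZE * FS_BLOCK_SIZE;
}

/*
 * Replay the suspected sequence on path, ending with a file of
 * actual_size bytes. Pass O_DIRECT as extra_flags to match the report,
 * or 0 where O_DIRECT is not supported.
 * Returns 0 on success, -1 on failure (errno set by the failing call).
 */
static int replay_pattern(const char *path, off_t actual_size, int extra_flags)
{
	void *buf = NULL;
	off_t off, end = round_up_block(actual_size);
	int fd, saved, ret = -1;

	fd = open(path, O_CREAT | O_TRUNC | O_WRONLY | extra_flags, 0600);
	if (fd < 0)
		return -1;
	/* O_DIRECT also needs an aligned buffer. */
	if (posix_memalign(&buf, FS_BLOCK_SIZE, FS_BLOCK_SIZE) != 0) {
		buf = NULL;
		errno = ENOMEM;
		goto out;
	}
	memset(buf, 0xab, FS_BLOCK_SIZE);

	/* 1. Preallocate without changing i_size. */
	if (fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, PREALLOC_SIZE) != 0)
		goto out;
	/* 2. Stand-in for "various writes": fill whole blocks in order. */
	for (off = 0; off < end; off += FS_BLOCK_SIZE)
		if (pwrite(fd, buf, FS_BLOCK_SIZE, off) != FS_BLOCK_SIZE)
			goto out;
	/* 3. Second fallocate without KEEP_SIZE, one block past the data. */
	if (fallocate(fd, 0, 0, actual_size + FS_BLOCK_SIZE) != 0)
		goto out;
	/* 4. Trim back to the real size. */
	if (ftruncate(fd, actual_size) != 0)
		goto out;
	ret = 0;
out:
	saved = errno; /* keep the first error across cleanup */
	free(buf);
	close(fd);
	errno = saved;
	return ret;
}
```

Calling replay_pattern() in a loop over many files with varying actual_size values, including 0, would approximate the workload. The real applications issue their writes in a different order and possibly from several threads, and this sketch does not capture that.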
I can't think of any sequence of operations that would leave i_size so large
while i_blocks is always 8. Looking at the code in
ext4_ext_remove_space()
ext4_ext_rm_leaf()
ext4_ext_rm_idx()
I don't see a way for the extent tree to take the shape above. There are no
errors that I can see around the time the corrupted inodes are created. It
*seems* as though the corruption is occurring during truncation, but all our
efforts to reproduce this with small test cases have so far failed.
We're using a 2.6.26 code base, with most of the latest ext4 patches
applied.
Any insights/ruminations/guesses as to what might be happening are welcome.
Thanks,
Curt