Message-Id: <549505B6-A176-4061-93C2-78C6EE094A2B@dilger.ca>
Date: Fri, 6 Nov 2015 01:23:20 -0700
From: Andreas Dilger <adilger@...ger.ca>
To: Theodore Ts'o <tytso@....edu>
Cc: linux-ext4 <linux-ext4@...r.kernel.org>
Subject: Fwd: e2fsck -fD corruption of large htree/extent directory
Ted, per our discussion this morning, here are the details of the
e2fsck -fD corruption problem we saw.
Running e2fsck -fD on a large extent+htree directory (> 300k entries,
1600+ filesystem blocks) showed corruption on a large number of dirs.
This is definitely caused by a bug in the code rather than hardware, as
this corrupted multiple large directories on 11 different systems.
Sometimes, similar directories on the same systems did not have errors.
As yet the reason and mechanism have not been determined, but it may
relate to the filesystem history (the directories may have originally
been block mapped, and in any case the blocks are mostly discontiguous
on disk). These dirs undergo continuous insertion and deletion of
entries with ~10-character filenames, so the leaf blocks may have become
quite fragmented over time.
Running e2fsck on the filesystem showed:
e2fsck 1.42.12.wc1 (15-Sep-2014)
MMP interval is 5 seconds and total wait time is 22 seconds. Please wait.
Pass 1: Checking inodes, blocks, and sizes
Interior extent node level 1 of inode 39321606:
Logical start 1430 does not match logical start 1875 at next level.
Fix? yes
Inode 39321606, end of extent exceeds allowed value
(logical block 1875, physical block 1258402260, len 1)
Clear? yes
Failed to iterate extents in inode 39321606
(op EXT2_EXTENT_UP, blk 1258402260, lblk 1875): No 'up' extent
Clear inode? yes
Inode 39321606 is a zero-length directory. Clear? yes
Update quota info for quota type 0? yes
Update quota info for quota type 1? yes
Restarting e2fsck from the beginning...
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Entry 'd2' in /O/0 (39321602) has deleted/unused inode 39321606.
Clear? yes
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Unattached inode 147
Connect to /lost+found? yes
Inode 147 ref count is 2, should be 1. Fix? yes
Unattached inode 173
Connect to /lost+found? yes
Inode 173 ref count is 2, should be 1. Fix? yes
:
:
Unattached inode 92016391
Connect to /lost+found? yes
Inode 92016391 ref count is 2, should be 1. Fix? yes
Pass 5: Checking group summary information
Block bitmap differences: -1258308100
Update quota info for quota type 0? yes
Update quota info for quota type 1? yes
scratch-OST0049: ***** FILE SYSTEM WAS MODIFIED *****
Stat data for the corrupted directory inode:
debugfs -c -R "stat <39321606>"
Inode: 39321606 Type: directory Mode: 0700 Flags: 0x81000
Generation: 2310511783 Version: 0x00000000:00000000
User: 0 Group: 0 Size: 6750208
File ACL: 0 Directory ACL: 0
Links: 2 Blockcount: 13232
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x563111cf:15fb2694 -- Wed Oct 28 14:19:59 2015
atime: 0x52f30c97:9fe5c3ac -- Wed Feb 5 23:16:23 2014
mtime: 0x563111cf:15fb2694 -- Wed Oct 28 14:19:59 2015
crtime: 0x52f30c97:9fe5c3ac -- Wed Feb 5 23:16:23 2014
Size of extra inode fields: 28
Extended attributes stored in inode body:
invalid EA entry in inode
EXTENTS:
[shown below]
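As a sanity check on the stat output above, the following minimal sketch decodes the Flags value and compares Size against Blockcount. It assumes a 4096-byte filesystem block size (not stated in this report, but consistent with Size / 1648) and that Blockcount is in 512-byte units; the helper name decode_flags is just illustrative.

```python
# Inode flag values as defined by the ext4 on-disk format.
EXT4_INDEX_FL = 0x1000     # directory uses hashed (htree) indexes
EXT4_EXTENTS_FL = 0x80000  # inode uses extents

BLOCK_SIZE = 4096   # ASSUMPTION: filesystem block size, not given in the report
SECTOR_SIZE = 512   # ASSUMPTION: Blockcount is counted in 512-byte units

# Values copied from the debugfs "stat <39321606>" output.
flags = 0x81000
size = 6750208
blockcount = 13232


def decode_flags(value):
    """Return the names of the known flags set in value (only two checked)."""
    known = [("EXT4_INDEX_FL", EXT4_INDEX_FL),
             ("EXT4_EXTENTS_FL", EXT4_EXTENTS_FL)]
    return [name for name, bit in known if value & bit]


data_blocks = size // BLOCK_SIZE                      # blocks implied by i_size
total_blocks = blockcount * SECTOR_SIZE // BLOCK_SIZE  # blocks charged to inode
metadata_blocks = total_blocks - data_blocks           # extent tree overhead

print(decode_flags(flags))
print(data_blocks, total_blocks, metadata_blocks)
```

Under those assumptions the inode is an extent-mapped htree directory with 1648 data blocks, and the 6 additional blocks would correspond to one external index block plus five extent leaf blocks.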
The debugfs dump_extents command shows that the extent tree is mostly OK.
In all observed cases, the extent tree was 5 blocks long. This is possibly
a result of 4 extent blocks being moved out of the in-inode i_block[]
array and into an external second-level index block, or simply because the
number of entries in each directory is roughly the same; I'm not sure which.
Level Entries Logical Physical Length Flags
0/ 2 1/ 1 0 - 1647 1258392344 1648
1/ 2 1/ 5 0 - 353 1258308301 354
2/ 2 1/340 0 - 0 1258308100 - 1258308100 1
2/ 2 2/340 1 - 2 1258308174 - 1258308175 2
2/ 2 3/340 3 - 3 1258308213 - 1258308213 1
2/ 2 4/340 4 - 4 1258308241 - 1258308241 1
:
:
2/ 2 339/340 352 - 352 1258319291 - 1258319291 1
2/ 2 340/340 353 - 353 1258319375 - 1258319375 1
1/ 2 2/ 5 354 - 704 1258319416 351
2/ 2 1/340 354 - 354 1258319415 - 1258319415 1
2/ 2 2/340 355 - 355 1258319470 - 1258319470 1
:
:
2/ 2 339/340 703 - 703 1258350886 - 1258350886 1
2/ 2 340/340 704 - 704 1258350895 - 1258350895 1
1/ 2 3/ 5 705 - 1055 1258350929 351
2/ 2 1/339 705 - 705 1258350928 - 1258350928 1
2/ 2 2/339 706 - 706 1258343948 - 1258343948 1
:
:
2/ 2 336/339 1052 - 1052 1258365348 - 1258365348 1
2/ 2 337/339 1053 - 1053 1258365355 - 1258365355 1
2/ 2 338/339 1054 - 1054 1258365417 - 1258365417 1
2/ 2 339/339 1055 - 1055 1258365432 - 1258365432 1
1/ 2 4/ 5 1056 - 1874 1258324458 819
2/ 2 1/340 1056 - 1056 1258365435 - 1258365435 1
2/ 2 2/340 1057 - 1057 1258366983 - 1258366983 1
2/ 2 3/340 1058 - 1059 1258366993 - 1258366994 2
:
:
2/ 2 338/340 1427 - 1427 1258379312 - 1258379312 1
2/ 2 339/340 1428 - 1428 1258379117 - 1258379117 1
2/ 2 340/340 1429 - 1429 1258379133 - 1258379133 1
1/ 2 5/ 5 1875 - 4294968943 1258406330 4294967069
2/ 2 1/ 1 1875 - 1875 1258402260 - 1258402260 1
The 4/5 extent index block shows an extent length of 1874 - 1056 + 1 = 819
blocks, but the extent block only has 1429 - 1056 + 1 = 374 blocks in the
extent. The extent root block reports 1648 blocks, which matches both
i_size and i_blocks. There appears to be one block missing from the
extent tree, or it was clobbered by 5/5 during an update, and/or the
starting offset of block 5/5 is just wrong.
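The span comparison above can be sketched as a small script. This is only an illustration using the numbers from the dump_extents listing, not what e2fsck actually does; the tuple layout and the function names are made up for the example, and the leaf range is taken from the first and last leaf entries shown (the elided rows are assumed to be contiguous).

```python
# (index entry, first logical block, last logical block claimed by the index,
#  last logical block actually present in the leaf), from dump_extents above.
INDEX_ENTRIES = [
    ("1/5", 0, 353, 353),
    ("2/5", 354, 704, 704),
    ("3/5", 705, 1055, 1055),
    ("4/5", 1056, 1874, 1429),
    ("5/5", 1875, 4294968943, 1875),
]


def span(first, last):
    """Number of blocks in the inclusive logical range [first, last]."""
    return last - first + 1


def find_mismatches(entries):
    """Return (name, claimed_len, leaf_len) for entries whose spans differ."""
    mismatches = []
    for name, first, claimed_last, leaf_last in entries:
        claimed_len = span(first, claimed_last)
        leaf_len = span(first, leaf_last)
        if claimed_len != leaf_len:
            mismatches.append((name, claimed_len, leaf_len))
    return mismatches


for name, claimed_len, leaf_len in find_mismatches(INDEX_ENTRIES):
    print(f"index {name}: claims {claimed_len} blocks, leaf covers {leaf_len}")
```

With the values above, only entries 4/5 and 5/5 are flagged, which matches where e2fsck complained (logical start 1430 versus 1875 at the next level).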
There doesn't appear to be any other data corruption in the filesystem
besides the directory extent blocks, but this resulted in several
hundred leaf blocks being lost per directory, resulting in millions of
files in lost+found (see my other recent email on that topic).
In some cases, it appears that 100% of files were readable from the
corrupted directory using debugfs _before_ the e2fsck was run:
debugfs -c -R "ls -l $DIR" $DEV
even though e2fsck was unhappy with the extent structure and cleared
part of the extent tree and dumped the files into lost+found. This
implies that the directory entries were all moved into the first blocks
of the directory (i.e. leaf blocks under extent indices 1/5..4/5), that
the blocks in the corrupt part of the directory were somehow "extra", and
that the bug lies in the extent handling when shrinking the directory.
Cheers, Andreas