[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <bug-203943-13602@https.bugzilla.kernel.org/>
Date: Fri, 21 Jun 2019 06:51:27 +0000
From: bugzilla-daemon@...zilla.kernel.org
To: linux-ext4@...r.kernel.org
Subject: [Bug 203943] New: ext4 corruption after RAID6 degraded; e2fsck skips
block checks and fails
https://bugzilla.kernel.org/show_bug.cgi?id=203943
Bug ID: 203943
Summary: ext4 corruption after RAID6 degraded; e2fsck skips
block checks and fails
Product: File System
Version: 2.5
Kernel Version: 4.19.52-gentoo
Hardware: Intel
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: ext4
Assignee: fs_ext4@...nel-bugs.osdl.org
Reporter: yann@...anns.net
Regression: No
I use a 24TB SW-RAID6 with 10 3TB HDDs. This array contains a dmcrypt
container, which contains a ext4 FS [1].
One of the disks had errors and got kicked out of the array. Before I was able
to replace it, the ext4 FS began to throw errors like these:
EXT4-fs error (device dm-1): ext4_find_dest_de:1802: inode #3833864: block
61343924: comm nfsd: bad entry in directory: rec_len % 4 != 0 - offset=1000,
inode=2620025549, rec_len=30675, name_len=223, size=4096
EXT4-fs error (device dm-1): ext4_lookup:1577: inode #172824586: comm
tvh:tasklet: iget: bad extra_isize 13022 (inode size 256)
EXT4-fs error (device dm-1): htree_dirblock_to_tree:1010: inode #7372807: block
117967811: comm tar: bad entry in directory: rec_len % 4 != 0 - offset=104440,
inode=1855122647, rec_len=12017, name_len=209, size=4096
I then used e2fsck to check the FS for errors, but it only created dozens of
the following output lines:
German original: "Block %$b von Inode %$i steht in Konflikt mit kritischen
Metadaten, Blockprüfungen werden übersprungen."
Translated: "Inode %$i block %$b conflicts with critical metadata, skipping
block checks."
It also filled up my RAM, so I added ~100G of swap space. After scanning for a
few days, the e2fsck process died.
Afterwards I was able to replace the disk, re-scan the FS with e2fsck and after
2 hours, the ext4 FS was clean again.
To understand the problem, I set one of the disks faulty, and suddenly the ext4
errors occurred again. I added a spare disk, resynced the array, but e2fsck is
unable to fix the errors, and only throws "Inode %$i block %$b conflicts with
critical metadata, skipping block checks." while using more and more RAM.
Used software versions:
Kernel 4.19.52-gentoo
e2fsprogs-1.44.5
mdadm-4.1
Please let me know, if I should provide additional information.
Kind regards,
Yann
########################################## additional information
#e2fsck -c /dev/mapper/share
e2fsck 1.44.5 (15-Dec-2018)
badblocks: ungültige letzter Block - 5860269055
/dev/mapper/share: Updating bad block inode.
Durchgang 1: Inodes, Blöcke und Größen werden geprüft
Block %$b von Inode %$i steht in Konflikt mit kritischen Metadaten,
Blockprüfungen werden übersprungen.
[...]
Block %$b von Inode %$i steht in Konflikt mit kritischen Metadaten,
Blockprüfungen werden übersprungen.
Signal (6) SIGABRT si_code=SI_TKILL
e2fsck(+0x33469)[0x55ec4942a469]
/lib64/libc.so.6(+0x39770)[0x7f562449f770]
/lib64/libc.so.6(gsignal+0x10b)[0x7f562449f6ab]
/lib64/libc.so.6(abort+0x123)[0x7f5624488539]
/lib64/libext2fs.so.2(+0x18af5)[0x7f5624ac3af5]
e2fsck(+0x18c30)[0x55ec4940fc30]
/lib64/libext2fs.so.2(+0x1a1ec)[0x7f5624ac51ec]
/lib64/libext2fs.so.2(+0x1a589)[0x7f5624ac5589]
/lib64/libext2fs.so.2(+0x1b04b)[0x7f5624ac604b]
e2fsck(+0x1978c)[0x55ec4941078c]
e2fsck(+0x1b0f6)[0x55ec494120f6]
e2fsck(+0x1b1dc)[0x55ec494121dc]
/lib64/libext2fs.so.2(ext2fs_get_next_inode_full+0x8e)[0x7f5624adc53e]
e2fsck(e2fsck_pass1+0xa12)[0x55ec49412d22]
e2fsck(e2fsck_run+0x6a)[0x55ec4940b54a]
e2fsck(main+0xefd)[0x55ec4940703d]
/lib64/libc.so.6(__libc_start_main+0xeb)[0x7f5624489e6b]
e2fsck(_start+0x2a)[0x55ec494092ea]
---------------------------------------------------------------------
#tune2fs -l /dev/mapper/share
tune2fs 1.44.5 (15-Dec-2018)
Filesystem volume name: <none>
Last mounted on: /home/share
Filesystem UUID: c5f0559d-e3bd-473f-abc0-7c42b3115897
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: ext_attr dir_index filetype extent 64bit flex_bg
sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean with errors
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 366268416
Block count: 5860269056
Reserved block count: 0
Free blocks: 755383351
Free inodes: 363816793
First block: 0
Block size: 4096
Fragment size: 4096
Group descriptor size: 64
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 2048
Inode blocks per group: 128
RAID stride: 128
RAID stripe width: 1024
Flex block group size: 16
Filesystem created: Sat Mar 17 14:36:16 2018
Last mount time: Fri Jun 21 05:25:34 2019
Last write time: Fri Jun 21 05:30:27 2019
Mount count: 3
Maximum mount count: -1
Last checked: Thu Jun 20 08:55:17 2019
Check interval: 0 (<none>)
Lifetime writes: 139 GB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Default directory hash: half_md4
Directory Hash Seed: 4c37872e-3207-4ff4-8939-a428feaeb49f
Journal backup: inode blocks
FS Error count: 20776
First error time: Thu Jun 20 14:18:47 2019
First error function: ext4_lookup
First error line #: 1577
First error inode #: 172824586
First error block #: 0
Last error time: Fri Jun 21 05:53:24 2019
Last error function: ext4_lookup
Last error line #: 1577
Last error inode #: 172824586
Last error block #: 0
---------------------------------------------------------------------
#cat /proc/mdstat
Personalities : [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
[faulty]
md2 : active raid6 sde[0] sdc[9] sdb[8] sdj[7] sdi[6] sda[11] sdd[10] sdh[3]
sdg[2] sdf[1]
23441080320 blocks super 1.2 level 6, 512k chunk, algorithm 2 [10/10]
[UUUUUUUUUU]
bitmap: 0/22 pages [0KB], 65536KB chunk
md1 : active raid1 sdk2[0] sdl2[2]
52396032 blocks super 1.2 [2/2] [UU]
md0 : active raid1 sdk1[0] sdl1[2]
511680 blocks super 1.2 [2/2] [UU]
unused devices: <none>
--
You are receiving this mail because:
You are watching the assignee of the bug.
Powered by blists - more mailing lists