Message-ID: <bab0ea88-0f0a-3445-e24e-286ee0665044@uls.co.za>
Date: Wed, 23 May 2018 16:46:25 +0200
From: Jaco Kroon <jaco@....co.za>
To: Jan Kara <jack@...e.cz>
Cc: linux-ext4 <linux-ext4@...r.kernel.org>,
Pieter Kruger <pieter@....co.za>
Subject: Re: corrupt filesystem, superblock/journal - fsck
Hi,
So I tracked down the fsck issue ... with a bit of additional debug
output in the lib/ext2fs/openfs.c file, the if statement that is failing
is this one:
    if (fs->group_desc_count * EXT2_INODES_PER_GROUP(fs->super) !=
        fs->super->s_inodes_count) {
        /* debug output added while tracking this down */
        fprintf(stderr, "\ngroup to inodes problem ... "
                "group_desc_count=%u, inodes/group=%u, inode_count=%u\n\n",
                fs->group_desc_count, EXT2_INODES_PER_GROUP(fs->super),
                fs->super->s_inodes_count);
        retval = EXT2_ET_CORRUPT_SUPERBLOCK;
        goto cleanup;
    }
And this gives us:
group to inodes problem ... group_desc_count=524288, inodes/group=8192,
inode_count=4294967295
524288 * 8192 = 4294967296 (2^32), which wraps to 0 in 32-bit arithmetic.
As a result, any value other than 0 in the inode_count field will cause
fsck to refuse to check the filesystem. Mounting with that bit of
corruption does in fact work, so whilst fsck will not function at the
moment, at least the filesystem is mounted. This will still need to be
sorted out somehow.
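The wraparound behind the failing check can be reproduced in isolation.
This is only a minimal sketch, assuming both operands are 32-bit
unsigned values as the %u debug output suggests; the function names are
made up for illustration and are not part of libext2fs.

```c
#include <stdint.h>

/* Hypothetical helper: the product as a 32-bit field sees it.
 * The result is truncated modulo 2^32. */
uint32_t inode_total_32(uint32_t group_desc_count, uint32_t inodes_per_group)
{
    return (uint32_t)((uint64_t)group_desc_count * inodes_per_group);
}

/* Hypothetical helper: the same product computed in 64 bits,
 * so nothing is lost. */
uint64_t inode_total_64(uint32_t group_desc_count, uint32_t inodes_per_group)
{
    return (uint64_t)group_desc_count * inodes_per_group;
}
```

With the values from the debug output, inode_total_32(524288, 8192)
gives 0 while inode_total_64(524288, 8192) gives 4294967296, so a stored
inode count of 4294967295 can never match the 32-bit product.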
I suspect this boils down to two things:
1. The kernel (as well as offline resize) needs to prevent resizes from
pushing the inode count to >= 2^32 (or, if it hits exactly that, just
limit it to 2^32-1).
2. fsck needs to be made aware of this.
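To illustrate the first point, a resize path could refuse any group
count whose inode total no longer fits in 32 bits. The following is a
rough sketch of such a guard, not the actual kernel or resize2fs code;
both function names are hypothetical, and it covers only the "prevent"
variant, not the clamping to 2^32-1.

```c
#include <stdint.h>

/* Hypothetical helper: largest group count for which
 * groups * inodes_per_group still fits in a 32-bit inode count. */
uint32_t max_groups_for_inodes(uint32_t inodes_per_group)
{
    if (inodes_per_group == 0)
        return 0;
    return UINT32_MAX / inodes_per_group;
}

/* Hypothetical helper: returns 1 if growing to new_groups would push
 * the inode count past 2^32-1, 0 otherwise. Uses a division so the
 * check itself cannot overflow. A zero inodes_per_group is treated as
 * invalid and rejected. */
int resize_would_overflow_inodes(uint64_t new_groups, uint32_t inodes_per_group)
{
    if (inodes_per_group == 0)
        return 1;
    return new_groups > (uint64_t)(UINT32_MAX / inodes_per_group);
}
```

For 8192 inodes per group the limit comes out at 524287 groups, so the
524288 groups we ended up with after the resize would have been
rejected by a check like this.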
I've now used debugfs to set the inode count to 2^32-1, and the kernel
is quite happy with this, but none of the userspace tools will currently
operate on the filesystem.
Thank you so much for helping me to get the filesystem online again.
The tools and kernel will need to be fixed however in order to ensure
that there are not going to be problems going forward.
Kind Regards,
Jaco
On 23/05/2018 13:37, Jan Kara wrote:
> Hi,
>
> On Mon 21-05-18 14:21:33, Jaco Kroon wrote:
>> We had a host starting to fail processing on an ext4 filesystem directly
>> after extend from 60.5TB to 64TB (lvresize -L64T /dev/lvm/home,
>> resize2fs /dev/lvm/home).
>>
>> We rebooted, and now the filesystem will mount but the problem
>> persists. We've now umounted the filesystem, and fsck complains as follows:
>>
>> crowsnest ~ # fsck.ext4 -f /dev/lvm/home
>> e2fsck 1.43.6 (29-Aug-2017)
>> Superblock has an invalid journal (inode 8).
>> Clear<y>? yes
>> *** journal has been deleted ***
>>
>> Corruption found in superblock. (inodes_count = 0).
>>
>> The superblock could not be read or does not describe a valid ext2/ext3/ext4
>> filesystem. If the device is valid and it really contains an ext2/ext3/ext4
>> filesystem (and not swap or ufs or something else), then the superblock
>> is corrupt, and you might try running e2fsck with an alternate superblock:
>> e2fsck -b 8193 <device>
>> or
>> e2fsck -b 32768 <device>
>>
>> Corruption found in superblock. (first_ino = 11).
>>
>> The superblock could not be read or does not describe a valid ext2/ext3/ext4
>> filesystem. If the device is valid and it really contains an ext2/ext3/ext4
>> filesystem (and not swap or ufs or something else), then the superblock
>> is corrupt, and you might try running e2fsck with an alternate superblock:
>> e2fsck -b 8193 <device>
>> or
>> e2fsck -b 32768 <device>
>>
>> Inode count in superblock is 0, should be 4294967295.
>> Fix<y>? yes
>>
>> /dev/lvm/home: ***** FILE SYSTEM WAS MODIFIED *****
> OK, so the Inode count is obviously wrong and the remaining errors are due
> to that. Apparently the resize process has overflown the inode count to 0
> (which is not that surprising since the number of inodes in your filesystem
> would be 1<<32) - that needs fixing but let's first get your fs up and
> running. I'm actually surprised that e2fsck did anything with the
> filesystem because for me both 1.44.2 and 1.42.11 versions just exit after
> printing the error about the corrupted superblock. Anyway what *could* fix
> your problem is:
>
> debugfs -w -R 'ssv inodes_count 4294967295' /dev/lvm/home
>
> and then check with dumpe2fs that inode count indeed got fixed. Hope it
> helps.
>
> Honza
>