linux-ext4 - Re: ext4 won't mount - fsck required

Message-ID: <CAHSRzpDATS+yMDyw6Wts2DMjyDpJx0rVB5M8QLpKxPFUwHWzXA@mail.gmail.com>
Date:	Sun, 9 Sep 2012 22:18:34 -0500
From:	Terry <td3201@...il.com>
To:	"Theodore Ts'o" <tytso@....edu>
Cc:	linux-ext4@...r.kernel.org
Subject: Re: ext4 won't mount - fsck required - 2nd fsck in less than a week

On Sun, Sep 9, 2012 at 9:53 PM, Terry <td3201@...il.com> wrote:
> On Sun, Sep 9, 2012 at 9:47 PM, Theodore Ts'o <tytso@....edu> wrote:
>> On Sun, Sep 09, 2012 at 09:34:10PM -0500, Terry wrote:
>>>
>>> As the subject says, we have a 15 TB ext4 drive that won't mount with
>>> these errors:
>>>
>>> Sep 9 20:02:20 narf kernel: EXT4-fs (dm-9): ext4_check_descriptors:
>>> Inode bitmap for group 3200 not in group (block 4161027887)!
>>> Sep 9 20:02:20 narf kernel: EXT4-fs (dm-9): group descriptors corrupted!
>>
>> These indicate a very basic file system corruption where the block
>> group descriptors are corrupted.  E2fsck will complain immediately
>> upon seeing this sort of fs inconsistency, and the first thing it will
>> try to do is fix it.
>>
>>> We did a proactive fsck on Tuesday of last week because it was
>>> starting to give filesystem errors. It ran through and mounted fine.
>>>
>>> The filesystem lives on an EqualLogic SAN spread across 36 drives.
>>> Could this be something with the physical layer or is it not abnormal
>>> to have to run multiple rounds of fsck to fully fix an issue?
>>
>> This is most probably a hardware problem; normally e2fsck will fix
>> file system corruptions (and certainly problems such as corrupt block
>> group descriptors) in a single pass.  If e2fsck finished and the file
>> system mounted fine last week, and now you're getting this kind of
>> error, it basically screams some kind of physical layer problem, or
>> perhaps a bad hard drive, or perhaps the SAN disk is getting
>> incorrectly written to by some other system, etc.
>>
>>                                      - Ted
>
> Thanks for the reply.  It is part of a RHEL cluster but we did not
> have any situations where multiple systems mounted the filesystem.  It
> is an old SAN so perhaps we have a physical issue. We'll see what
> happens with this pass.

While I am waiting for fsck to finish, another thought. This
filesystem contains a lot of small files. 35,867,642 files to be
exact.  Anything else I should check or know to ensure a smooth
operation for these types of filesystems?  I formatted them with
standard RHEL 6 options.
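
For reference, the kernel message quoted above comes from a range check
on where each group descriptor says its inode bitmap lives. The sketch
below only approximates that arithmetic. The geometry is assumed, not
taken from this thread: 4 KiB blocks, 32768 blocks per group, first data
block 0, and a filesystem of exactly 15 TiB. As far as I understand it,
with flex_bg enabled the bitmap may sit anywhere inside the filesystem,
and without it the bitmap must sit inside its own group. The function
names are made up for illustration.

```python
# Rough sketch of the range check behind
# "Inode bitmap for group N not in group (block B)!".
# All geometry values below are assumptions, not read from the real disk.

BLOCK_SIZE = 4096            # assumed 4 KiB blocks
BLOCKS_PER_GROUP = 32768     # assumed default for 4 KiB blocks
FIRST_DATA_BLOCK = 0         # assumed (usual when block size > 1 KiB)
FS_BLOCKS = 15 * 2**40 // BLOCK_SIZE  # assumed: exactly 15 TiB


def group_block_range(group):
    """Return (first, last) block number covered by a block group."""
    first = FIRST_DATA_BLOCK + group * BLOCKS_PER_GROUP
    return first, first + BLOCKS_PER_GROUP - 1


def bitmap_location_ok(group, block, flex_bg=True):
    """Check whether a bitmap block number is plausible for a group.

    With flex_bg the allowed range is the whole filesystem; without it
    the range is the group's own blocks, capped at the filesystem end.
    """
    if flex_bg:
        first, last = FIRST_DATA_BLOCK, FS_BLOCKS - 1
    else:
        first, last = group_block_range(group)
        last = min(last, FS_BLOCKS - 1)
    return first <= block <= last


if __name__ == "__main__":
    # Values from the log line: group 3200, block 4161027887.
    print(group_block_range(3200))
    print(bitmap_location_ok(3200, 4161027887))
```

Under those assumptions, group 3200 would cover blocks 104857600
through 104890367, and block 4161027887 would lie past the end of the
filesystem, so the descriptor fails the check either way. The real
geometry can be read with "dumpe2fs -h" on the device.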