Date:	Tue, 11 Sep 2012 11:22:27 -0500
From:	Terry <td3201@...il.com>
To:	"Theodore Ts'o" <tytso@....edu>
Cc:	linux-ext4@...r.kernel.org
Subject: Re: ext4 won't mount - fsck required - 2nd fsck in less than a week

On Mon, Sep 10, 2012 at 8:56 AM, Terry <td3201@...il.com> wrote:
> On Mon, Sep 10, 2012 at 8:48 AM, Terry <td3201@...il.com> wrote:
>> On Sun, Sep 9, 2012 at 10:18 PM, Terry <td3201@...il.com> wrote:
>>> On Sun, Sep 9, 2012 at 9:53 PM, Terry <td3201@...il.com> wrote:
>>>> On Sun, Sep 9, 2012 at 9:47 PM, Theodore Ts'o <tytso@....edu> wrote:
>>>>> On Sun, Sep 09, 2012 at 09:34:10PM -0500, Terry wrote:
>>>>>>
>>>>>> As the subject says, we have a 15 TB ext4 drive that won't mount with
>>>>>> these errors:
>>>>>>
>>>>>> Sep 9 20:02:20 narf kernel: EXT4-fs (dm-9): ext4_check_descriptors:
>>>>>> Inode bitmap for group 3200 not in group (block 4161027887)!
>>>>>> Sep 9 20:02:20 narf kernel: EXT4-fs (dm-9): group descriptors corrupted!
>>>>>
>>>>> These indicate a very basic file system corruption where the block
>>>>> group descriptors are corrupted.  E2fsck will complain immediately
>>>>> upon seeing this sort of fs inconsistency, and the first thing it will
>>>>> try to do is fix it.
>>>>>
>>>>>> We did a proactive fsck on Tuesday of last week because it was
>>>>>> starting to give filesystem errors. It ran through and mounted fine.
>>>>>>
>>>>>> The filesystem lives on an EqualLogic SAN spread across 36 drives.
>>>>>> Could this be something with the physical layer or is it not abnormal
>>>>>> to have to run multiple rounds of fsck to fully fix an issue?
>>>>>
>>>>> This is most probably a hardware problem; normally e2fsck will fix
>>>>> file system corruptions (and certainly problems such as corrupt block
>>>>> group descriptors) in a single pass.  If e2fsck finished and the file
>>>>> system mounted fine last week, and now you're getting this kind of
>>>>> error, it basically screams some kind of physical layer problem, or
>>>>> perhaps a bad hard drive, or perhaps the SAN disk is getting
>>>>> incorrectly written to by some other system, etc.
>>>>>
>>>>>                                      - Ted
>>>>
>>>> Thanks for the reply.  It is part of a RHEL cluster but we did not
>>>> have any situations where multiple systems mounted the filesystem.  It
>>>> is an old SAN, so perhaps we have a physical issue. We'll see what
>>>> happens with this pass.
>>>
>>> While I am waiting for fsck to finish, another thought. This
>>> filesystem contains a lot of small files. 35,867,642 files to be
>>> exact.  Anything else I should check or know to ensure a smooth
>>> operation for these types of filesystems?  I formatted them with
>>> standard RHEL 6 options.
>>
>> FSCK completed fixing a lot of things.  The file system then mounted
>> without any errors.  We are still getting these types of errors in
>> /var/log/messages:
>>
>> Sep 10 08:40:49 narf kernel: EXT4-fs error (device dm-6):
>> ext4_dx_find_entry: bad entry in directory #743966900: directory entry
>> across blocks - block=2975876794offset=0(946176), inode=1414751737,
>> rec_len=45724, name_len=206
>>
>> Thoughts?
>
> Hold that thought.  This is another filesystem.  Let me fix that one
> then come back to this problem if it still exists.

OK, I fixed the other filesystem (dm-6) yesterday.  Today I am still
getting these errors on it:
Sep 11 11:17:47 omadvnfs01a kernel: EXT4-fs error (device dm-6):
ext4_mb_generate_buddy: EXT4-fs: group 90851: 0 blocks in bitmap, 5048
in gd
Sep 11 11:18:17 omadvnfs01a kernel: EXT4-fs error (device dm-6):
ext4_mb_generate_buddy: EXT4-fs: group 90670: 0 blocks in bitmap, 6665
in gd
Sep 11 11:19:31 omadvnfs01a kernel: EXT4-fs error (device dm-6):
ext4_mb_generate_buddy: EXT4-fs: group 37589: 420 blocks in bitmap,
8302 in gd
Sep 11 11:19:31 omadvnfs01a kernel: EXT4-fs error (device dm-6):
ext4_mb_generate_buddy: EXT4-fs: group 71777: 7071 blocks in bitmap,
23711 in gd
Sep 11 11:19:31 omadvnfs01a kernel: EXT4-fs error (device dm-6):
ext4_mb_generate_buddy: EXT4-fs: group 71778: 10664 blocks in bitmap,
26624 in gd
Sep 11 11:19:39 omadvnfs01a kernel: EXT4-fs error (device dm-6):
ext4_mb_generate_buddy: EXT4-fs: group 13499: 9884 blocks in bitmap,
1256 in gd
Sep 11 11:19:39 omadvnfs01a kernel: EXT4-fs error (device dm-6):
ext4_mb_generate_buddy: EXT4-fs: group 13498: 383 blocks in bitmap,
384 in gd
Sep 11 11:19:39 omadvnfs01a kernel: EXT4-fs error (device dm-6):
ext4_mb_generate_buddy: EXT4-fs: group 13496: 2356 blocks in bitmap,
10453 in gd
Sep 11 11:19:39 omadvnfs01a kernel: EXT4-fs error (device dm-6):
ext4_mb_generate_buddy: EXT4-fs: group 13497: 3593 blocks in bitmap,
5641 in gd
Sep 11 11:19:50 omadvnfs01a kernel: EXT4-fs error (device dm-6):
ext4_mb_generate_buddy: EXT4-fs: group 49528: 25850 blocks in bitmap,
29946 in gd
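
To make the pattern in these messages easier to see, here is a rough
sketch of a script that tallies them.  It assumes, as far as I understand
the message, that both numbers are free-block counts: the first counted
from the on-disk block bitmap and the second taken from the group
descriptor.  The names PATTERN, summarize and SAMPLE are made up for this
example, and the sample text is copied from two of the messages above.

```python
import re

# Matches the payload of an ext4_mb_generate_buddy error. Using \s+
# everywhere tolerates the line wrapping added by mail clients.
PATTERN = re.compile(
    r"ext4_mb_generate_buddy:\s+EXT4-fs:\s+group\s+(\d+):\s+"
    r"(\d+)\s+blocks\s+in\s+bitmap,\s+(\d+)\s+in\s+gd"
)


def summarize(log_text):
    """Return (group, bitmap_count, gd_count, gd_count - bitmap_count) tuples."""
    rows = []
    for match in PATTERN.finditer(log_text):
        group, bitmap_count, gd_count = (int(x) for x in match.groups())
        rows.append((group, bitmap_count, gd_count, gd_count - bitmap_count))
    return rows


# Two of the wrapped messages from this mail, used as sample input.
SAMPLE = """\
Sep 11 11:17:47 omadvnfs01a kernel: EXT4-fs error (device dm-6):
ext4_mb_generate_buddy: EXT4-fs: group 90851: 0 blocks in bitmap, 5048
in gd
Sep 11 11:19:39 omadvnfs01a kernel: EXT4-fs error (device dm-6):
ext4_mb_generate_buddy: EXT4-fs: group 13499: 9884 blocks in bitmap,
1256 in gd
"""

if __name__ == "__main__":
    for group, bitmap_count, gd_count, delta in summarize(SAMPLE):
        print(f"group {group}: bitmap={bitmap_count} gd={gd_count} "
              f"delta={delta:+d}")
```

The sign of the delta differs between groups (the descriptor count is
higher than the bitmap count in most of them and lower in group 13499),
so the two copies do not disagree in one consistent direction.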