Message-id: <20090818024741.GF5931@webber.adilger.int>
Date:	Mon, 17 Aug 2009 20:47:41 -0600
From:	Andreas Dilger <adilger@....com>
To:	Frank Mayhar <fmayhar@...gle.com>
Cc:	linux-ext4@...r.kernel.org, tytso@....edu
Subject: Re: fsck infinite loop on corrupt ext4 file system

On Aug 17, 2009  18:10 -0700, Frank Mayhar wrote:
> I've made a little more progress since Friday.  I had grabbed a dumpe2fs
> dump of the corrupted file system and one of the newly-created file
> system on the same device.  Adjusting for normal variation (numbers of
> free blocks, flags, etc.), there are no differences _except_ in the very
> block groups that fsck complained about having bad checksums.  For those
> (and only those), the locations of the block bitmap and inode table
> differ.  I've attached the diff output.

The two filesystems don't appear to have been created with the same
options, or else one of them was resized at some point.

> In particular, block group 276 claims to have its inode table at blocks
> 0-204, which is clearly wrong.  This is the block group for which the
> allocation failed, causing the original loop.
> 
> It's clear that fsck is neither correcting the block groups nor is it
> detecting the bad entries properly (a sanity check might be in order
> here).  It's not even noticing that it's looping, it just keeps failing
> the allocation and retrying.  While it may be that fsck can't recover
> the file system in this case, it should at least notice and abort.
> 
> My thinking is that the location of the inode tables should be invariant
> over the life of the file system.  Certainly there's no place in ext4
> itself that changes those fields (that I can see, anyway).  Why couldn't
> fsck compute the proper values and compare those against what's there?

With the addition of FLEX_BG there is no longer a hard and fast rule for
the location of a block group's metadata.  In the past it was always
guaranteed to be within the group itself; now it can be anywhere.
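
To illustrate the old rule, here is a minimal Python sketch (not e2fsprogs
code) of the "metadata lies within its own group" check.  The function
names are hypothetical.  The 32768 blocks-per-group figure is derived from
the block range that dumpe2fs reports for group 276, and a first data block
of 0 is an assumption that is consistent with that range.

```python
def group_block_range(group, blocks_per_group, first_data_block=0):
    """Return (first, last) block numbers covered by a block group."""
    first = first_data_block + group * blocks_per_group
    return first, first + blocks_per_group - 1


def within_group(start, end, group, blocks_per_group, first_data_block=0):
    """Pre-FLEX_BG rule: a metadata extent must lie inside its own group."""
    first, last = group_block_range(group, blocks_per_group, first_data_block)
    return first <= start and end <= last


# Group 276 from the quoted dumpe2fs diff.
print(group_block_range(276, 32768))
# The corrupt inode table location (0-204).
print(within_group(0, 204, 276, 32768))
# The inode table location from the newly created filesystem.
print(within_group(8913748, 8913952, 276, 32768))
```

Both inode table locations fail this test, including the one from the
freshly created filesystem, so on a FLEX_BG filesystem the check cannot
tell a valid placement from a corrupt one.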

>  Group 276: (Blocks 9043968-9076735)
> -  Block bitmap at 9043968 (+0), Inode bitmap at 9043969 (+1)
> -  Inode table at 0-204
> +  Block bitmap at 8912900, Inode bitmap at 8912916
> +  Inode table at 8913748-8913952

This is definitely bogus and should be detected/fixed by e2fsck.  I
suspect it used to be handled (pre-flexbg) by the check that the inode
table is within the group, but now there is no sanity check for the
placement at all (including overlapping with other groups, superblocks,
etc.).

It makes sense to still validate the sanity of the group descriptor
data, and then check the backup group descriptors if the primaries
are suspicious.
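
A rough sketch of that idea in Python follows.  It is not how e2fsck is
implemented; GroupDesc, descriptor_problems and choose_descriptor are
hypothetical names, and the checks are deliberately simplified: each extent
must start after the block holding the primary superblock, end inside the
filesystem, and not overlap the group's other metadata.  It does not check
for overlap with other groups' metadata or with backup superblocks.

```python
from dataclasses import dataclass


@dataclass
class GroupDesc:
    block_bitmap: int
    inode_bitmap: int
    inode_table: int  # first block of the inode table


def descriptor_problems(desc, first_data_block, total_blocks, inode_table_blocks):
    """Return the reasons a descriptor looks suspicious (empty list if none)."""
    extents = {
        "block bitmap": (desc.block_bitmap, desc.block_bitmap),
        "inode bitmap": (desc.inode_bitmap, desc.inode_bitmap),
        "inode table": (desc.inode_table,
                        desc.inode_table + inode_table_blocks - 1),
    }
    problems = []
    for name, (start, end) in extents.items():
        # The primary superblock lives in block first_data_block, so no
        # group metadata may start at or before it or run past the end.
        if start <= first_data_block or end >= total_blocks:
            problems.append(f"{name} at {start}-{end} is outside the usable range")
    names = list(extents)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            (s1, e1), (s2, e2) = extents[a], extents[b]
            if s1 <= e2 and s2 <= e1:
                problems.append(f"{a} overlaps {b}")
    return problems


def choose_descriptor(primary, backup, **geometry):
    """Use the primary unless it is suspicious and the backup is clean."""
    if not descriptor_problems(primary, **geometry):
        return primary
    if backup is not None and not descriptor_problems(backup, **geometry):
        return backup
    return None  # neither copy can be trusted; report it and stop retrying
```

For group 276, the corrupt descriptor would be GroupDesc(9043968, 9043969, 0)
with a 205-block inode table (blocks 0-204).  It fails the range check
because its inode table starts at block 0.  The values from the newly
created filesystem, GroupDesc(8912900, 8912916, 8913748), pass, provided
total_blocks is larger than 8913952; the real block count is not given in
this thread.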

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
