linux-ext4 - Re: [dm-devel] can't recover ext4 on lvm from ext4_mb_generate_buddy:739: group 1687, 32254 clusters in bitmap, 32258 in gd

Message-ID: <loom.20120128T162211-867@post.gmane.org>
Date:	Sat, 28 Jan 2012 15:31:59 +0000 (UTC)
From:	WIMPy <WIMPy@...i.dk>
To:	linux-ext4@...r.kernel.org
Subject: Re: [dm-devel] can't recover ext4 on lvm from ext4_mb_generate_buddy:739: group 1687, 32254 clusters in bitmap, 32258 in gd

Andreas Dilger <adilger <at> dilger.ca> writes:

> 
> Could you please try to bisect the problem, if it is reproducible?

If you or someone else has an idea how to do so, I will try to collect more
information.

There is actually an important bit I forgot to mention in the last message:
After I get the error and unmount the FS, I get lots of journal commit I/O
errors, but no indication of what fails or why.

> I was looking for a change which I thought might be responsible (removal of
> block bitmap initialization when inodes are first allocated from an
> uninitialized inode table) but I couldn't see it in the git log, so maybe
> that change has not landed yet.
> 
> I don't have any other ideas of which recent patches might be responsible at
> this point.

As there was a mention at the beginning that this may have happened after an 
upgrade from 3.1.5 to 3.2, I will build a 3.1.5 and see if that really makes a 
difference.

> On 2012-01-28, at 1:14, WIMPy <WIMPy <at> yeti.dk> wrote:
> 
> > Update:
> > 
> >>>>> this is a problem which apparently occurred when the user went from
> >>>> v3.1.5 to v3.2, so this looks like a 3.2 regression.)
> >> 
> >> I am on 3.2.0 as well.
> > 
> > I didn't spot anything obvious in the logs.
> > 
> >> It happened for me on a freshly created FS.
> >> "mke2fs -j -O sparse_super -O dir_index -O extents -O filetype -O uninit_bg"
> >> mounted with no additional options for the first time I got an
> >> "EXT4-fs error (device md127): ext4_mb_generate_buddy:739: group 28671, 32765 clusters in bitmap, 32766 in gd"
> >> after writing about 3TB of data.
> >> I do not have RO snapshots like the OP, but my md sits on top of LUKS
> >> containers.
> >> So we do have the device mapper in common.
> > 
> > After I did an fsck and tried to continue, I didn't get that far.
> > After another 200GB or so it happened again.
> > And now it's reproducible:
> > I can run fsck and then try to continue (using rsync). But as soon as
> > writing starts, the process hangs for a long time. At least one minute,
> > probably longer.
> > Then the ext4_mb_generate_buddy error comes again.
> > 
> > I upgraded e2fsprogs from 1.41.14 to 1.42 and the kernel to 3.2.2.
> > No difference.
> > That FS is unusable.
> > 
> >> Just for the record: unlike the contents, the hardware is not new and did
> >> not have any known issues.
> >> 
> >>  Greetings,
> >>    WIMPy
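
For comparing occurrences of the error: as far as I understand, the two numbers in the ext4_mb_generate_buddy message are the free-cluster count derived from the block bitmap and the free count stored in the group descriptor. Below is a minimal Python sketch that pulls them out of wrapped log text. The function name parse_buddy_error and the regular expression are illustrative assumptions, matched only against the message format quoted in this thread.

```python
import re

# Assumed message layout, taken only from the lines quoted in this thread.
PATTERN = re.compile(
    r"EXT4-fs error \(device (?P<dev>[^)]+)\): "
    r"ext4_mb_generate_buddy:(?P<line>\d+): "
    r"group (?P<group>\d+), (?P<bitmap>\d+) clusters in bitmap, "
    r"(?P<gd>\d+) in gd"
)


def parse_buddy_error(text):
    """Return device, group, both counts and their difference, or None."""
    # Mail clients wrap long log lines, so collapse all whitespace first.
    flat = " ".join(text.split())
    match = PATTERN.search(flat)
    if match is None:
        return None
    bitmap = int(match.group("bitmap"))
    gd = int(match.group("gd"))
    return {
        "device": match.group("dev"),
        "group": int(match.group("group")),
        "bitmap": bitmap,
        "gd": gd,
        # Positive delta: the group descriptor claims more free clusters
        # than the bitmap shows.
        "delta": gd - bitmap,
    }


if __name__ == "__main__":
    wrapped = (
        "EXT4-fs error (device md127): ext4_mb_generate_buddy:739: group 28671,\n"
        "32765\nclusters in bitmap, 32766 in gd"
    )
    print(parse_buddy_error(wrapped))
```

Collecting the group number and delta for each occurrence would show whether the same groups keep failing after fsck or whether the mismatch moves around.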

