Date:   Tue, 10 Oct 2017 14:36:59 -0600
From:   Andreas Dilger <adilger@...ger.ca>
To:     Kilian Cavalotti <kilian.cavalotti.work@...il.com>
Cc:     linux-ext4@...r.kernel.org
Subject: Re: Recover from a "deleted inode referenced" situation

On Oct 5, 2017, at 3:31 PM, Kilian Cavalotti <kilian.cavalotti.work@...il.com> wrote:
> 
> Dear ext4 experts,
> 
> TL;DR: I messed up a large filesystem, which now references deleted
> inodes. What's the best way to recover from this and hopefully
> reconstruct at least part of the directory hierarchy?

If the problem is only one of the partition being misaligned compared
to the logical volume, you can run the "findsuper" utility which is
part of e2fsprogs *sources* (it isn't built and packaged by default).
It will scan your block device and print out the ext2/3/4 superblocks
that it finds, along with the *byte* offset of each one found.  You
can use this to determine where the start of the filesystem should be.

This is made *much* more complex if you have other LVs on the same
storage, and the LV was increased in size over multiple iterations,
resulting in a fragmented allocation of PEs.
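To make the offset arithmetic concrete, here is a minimal sketch of how a byte offset reported for one superblock copy could be turned into a candidate filesystem start. It is not part of findsuper or e2fsprogs; the function name `filesystem_start` is hypothetical. It assumes the standard ext2/3/4 on-disk superblock layout (little-endian fields, magic 0xEF53), that each backup copy records its own group number in s_block_group_nr, and that the filesystem occupies one contiguous range of the device, which does not hold for the fragmented-PE case described above.

```python
import struct

EXT_MAGIC = 0xEF53        # s_magic value shared by ext2/3/4
PRIMARY_SB_OFFSET = 1024  # the primary superblock sits 1024 bytes into the filesystem


def filesystem_start(sb: bytes, found_at: int) -> int:
    """Infer the filesystem's starting byte offset from one superblock copy.

    sb       -- the raw superblock bytes read from the device (hypothetical input)
    found_at -- the byte offset on the device where that superblock was found
    """
    # Field offsets follow the ext2/3/4 superblock layout.
    first_data_block, log_block_size = struct.unpack_from("<II", sb, 20)
    (blocks_per_group,) = struct.unpack_from("<I", sb, 32)
    (magic,) = struct.unpack_from("<H", sb, 56)
    (group,) = struct.unpack_from("<H", sb, 90)  # s_block_group_nr

    if magic != EXT_MAGIC:
        raise ValueError("not an ext2/3/4 superblock")

    block_size = 1024 << log_block_size
    if group == 0:
        # Primary copy: always 1024 bytes past the start of the filesystem.
        return found_at - PRIMARY_SB_OFFSET
    # Backup copy: stored in the first block of its block group.
    backup_block = first_data_block + group * blocks_per_group
    return found_at - backup_block * block_size
```

If several superblock copies yield the same start offset, that is some evidence the filesystem is contiguous there; copies that disagree would point to the fragmented case. For reference, the sb= mount option counts in 1k units, so sb=131072 corresponds to block 32768 on a filesystem with 4k blocks.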

> Full version:
> 
> I'm writing as a last recourse before committing data seppuku. I
> failed to observe rule #1 of disaster recovery (sit on your hands) and
> made a bad situation significantly worse. So I'm trying to figure out
> how badly I'm screwed, and if there's any hope of salvation.
> 
> To set the stage, I have (sniff, *had*) an ext4 filesystem sitting on
> an LVM logical volume, on top of a RAID5 dmraid volume. The dmraid
> volume was expanded, then the LVM logical volume, and the ext4
> filesystem was resize2fs'ed. Except somewhere in the process,
> something failed and the ext4 filesystem was damaged. I unfortunately
> don't really know much more about the failure.
> 
> At that point, the filesystem could be mounted read-only by using a
> backup superblock (mount -o ro,sb=131072), and a quick glance at it
> showed a decent directory structure, with at least top-level
> directories intact.
> 
> So I jumped on it and started exfiltrating data from the damaged
> filesystem to an external system. Now, and that's what will cause me
> sorrow forever, I inadvertently remounted that filesystem read-write
> while the transfer was running...
> 
> Of course, it soon started to throw errors about deleted inodes, like this:
> 
> EXT4-fs error (device dm-0): ext4_lookup:1644: inode #2: comm rsync:
> deleted inode referenced: 1517
> 
> At that point, listing the root of the filesystem generated I/O errors
> and dreadful question marks, where it displayed a valid directory
> before the r/w remount:
> 
> $ ls /vol
> ls: cannot access backup: Input/output error
> drwxr-xr-x 2 root root 4096 Sep 28 11:10 .
> drwxr-xr-x 4 root root 4096 Sep 14  2013 ..
> -????????? ? ?    ?       ?            ? backup
> [...]
> 
> I re-remounted read-only as soon as I realized my mistake, but the
> filesystem stayed mounted r/w for a few minutes.

It sounds like this replayed a corrupted journal over the rest of your
filesystem, leading to further corruption.

> That's where I'm at right now. I'm dd'ing the LVM device to another
> system before doing anything else, and while this is running (it will
> take a few days, as the filesystem size is close to 20TB), I'm
> pondering options.
> 
> I guess the next logical step would be to run fsck, but I'm very
> worried that I will end up with a mess of detached inodes in /lost+found
> without any way to figure out their original location in the
> filesystem...
> 
> I read about ways to run fsck without touching the underlying
> filesystem (or image) with an LVM snapshot, or getting a copy of the
> metadata information with e2image, but I'm not really sure how to
> proceed.
> 
> Could anybody provide pointers or advice on what to do next?

My only recommendation would be to update to the latest e2fsprogs,
since it usually fixes important issues found in older versions.

> Is there a way to undo the latest modifications done while the
> filesystem was mounted r/w?

Seems unlikely, unless you have an LVM snapshot.

> Do I have any chance to recover the initial structure and
> contents of my filesystem?

e2fsck is good at recovering what files are available, much better
than other filesystem recovery tools, but it can only work with the
data it has.

> 
> I can obviously provide all the required information, just didn't want
> to make an already long email even longer.
> 
> 
> Thanks!
> --
> Kilian


Cheers, Andreas





