linux-ext4 - Re: Recover from a "deleted inode referenced" situation

Date:   Sat, 14 Oct 2017 18:16:14 -0700
From:   Kilian Cavalotti <kilian.cavalotti.work@...il.com>
To:     Andreas Dilger <adilger@...ger.ca>
Cc:     linux-ext4@...r.kernel.org
Subject: Re: Recover from a "deleted inode referenced" situation

Hi Andreas,

On Fri, Oct 13, 2017 at 11:40 AM, Andreas Dilger <adilger@...ger.ca> wrote:
> The findsuper comment was potentially misleading, as I was mixing up your
> problem with one in another thread where the partition table was clobbered.

Gotcha. Learned about it though, so it was still useful. ;)

>>      1024           0  17983724322816   95590400  4096    0  Thu Nov 26 23:22:42 2015 12c2c019 1.42.6-5644
>
> This is likely to be the proper superblock, but it has a bit of a
> strange label.  Looks like a really old e2fsprogs build version?

Yes, for some reason the filesystem label is the version of the mkfs
that created it. The volume is from a Synology NAS, so I assume that's
how they do things.

> No, this looks like the _start_ of a filesystem image, but there is
> no real guarantee that the blocks in the file are allocated contiguously
> in the actual filesystem, so your "dd" is unlikely to work properly.
> The filesystem itself is 773376 * 4KB ~= 3GB in size, and if it was
> originally created as a sparse file there is little chance those blocks
> were allocated contiguously.  The findsuper utility is meant to locate
> superblocks in a block device to help recover from partition table woes.

Aaah, right, got it.
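
As a quick check of the size estimate quoted above, here is a minimal
sketch of the arithmetic. The block count (773376) and the 4KB block
size come from the quoted text; the function names are only
illustrative.

```python
BLOCK_SIZE = 4096  # bytes; the 4KB block size from the quoted estimate


def fs_size_bytes(block_count, block_size=BLOCK_SIZE):
    """Total filesystem size in bytes for a given block count."""
    return block_count * block_size


def to_gib(n_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


if __name__ == "__main__":
    size = fs_size_bytes(773376)
    # 773376 blocks * 4096 bytes is just under 3 GiB
    print(f"{size} bytes = {to_gib(size):.2f} GiB")
```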

> If you still have access to some of the files, you should consider
> copying them out of the filesystem.  Next, I would recommend making an
> LVM snapshot and running e2fsck on that to see what else you get out of it.
> Depending on the amount and type of corruption, that may take a very
> long time on a 20TB filesystem, and not be worthwhile to wait for.

I did that actually: I was able to salvage about 2.3 TB of data that
was still accessible from a read-only mount. Then, I created a
snapshot (although with mdraid, not LVM, but same idea), and fsck'ed
the filesystem: fsck found about 800 GB more, but not many intact
directories. I was hoping that some of the top directories that got
lost could be found relatively intact in lost+found/ but I ended up
with a pretty flat hierarchy of things there, with directories at most
maybe 3-4 levels deep. I'll need to spend some time trying to put
pieces together from what fsck recovered.
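
For sorting through what fsck left behind, a rough inventory of
lost+found/ can help decide where to start. The sketch below is only
one possible approach, not something that was run on this volume: it
assumes the recovered directories sit directly under lost+found/ (e2fsck
names them "#<inode number>"), and the function names and the
/mnt/recovered path are hypothetical.

```python
import os


def summarize_tree(root):
    """Return (file_count, total_bytes, max_depth) for everything under root."""
    files = 0
    total = 0
    max_depth = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Depth 0 is root itself; each nested directory level adds one.
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        max_depth = max(max_depth, depth)
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.lstat(path).st_size
            except OSError:
                continue  # unreadable entry; skip it rather than abort
            files += 1
    return files, total, max_depth


def inventory(lost_found):
    """List recovered top-level directories, largest first."""
    rows = []
    for entry in os.scandir(lost_found):
        if entry.is_dir(follow_symlinks=False):
            rows.append((entry.name, *summarize_tree(entry.path)))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


if __name__ == "__main__":
    # Hypothetical mount point of the fsck'ed snapshot.
    for name, files, size, depth in inventory("/mnt/recovered/lost+found"):
        print(f"{name}\t{files} files\t{size} bytes\tdepth {depth}")
```

Sorting by size puts the largest recovered trees first, which are
probably the most worthwhile ones to identify by hand.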

But unfortunately there's another ~17 TB of data that fsck didn't
find. That seems like a lot of data lost from just replaying a
corrupted journal... :(

Cheers,
-- 
Kilian