Message-ID: <c1c34d2e-4d31-3a43-4a77-a9c02f8465f2@uls.co.za>
Date:   Mon, 30 Jul 2018 10:56:01 +0200
From:   Jaco Kroon <jaco@....co.za>
To:     "Theodore Y. Ts'o" <tytso@....edu>
Cc:     Andreas Dilger <adilger@...ger.ca>, Jan Kara <jack@...e.cz>,
        linux-ext4 <linux-ext4@...r.kernel.org>
Subject: Re: allowing ext4 file systems that wrapped inode count to continue
 working

Hi Ted,

After some googling I actually found (deduced) the same process myself.

On 28/07/2018 20:19, Theodore Y. Ts'o wrote:
> On Sat, Jul 28, 2018 at 03:47:18PM +0200, Jaco Kroon wrote:
>> Just a note.  In order to be able to utilize dumpe2fs I had to apply the
>> patch from my first email.  The only utility that would start up (and
>> fail) was fsck.  It should be noted that I've hacked s_inodes_count to
>> 0xFFFFFFFF the first time we encountered this (with assistance from Jan).
>>
>> Group 524287: (Blocks 17179836416-17179869183) csum 0xe0ea
>> [INODE_UNINIT, ITABLE_ZEROED]
>>   Group descriptor at 17179836416
>>   Block bitmap at 17179344911 (bg #524272 + 15)
>>   Inode bitmap at 17179344927 (bg #524272 + 31)
>>   Inode table at 17179352608-17179353119 (bg #524272 + 7712)
>>   12 free blocks, 8192 free inodes, 0 directories, 8192 unused inodes
>>   Free blocks: 17179838452-17179838463
>>   Free inodes: 4294959105-4294967296
>>
>> If I read that correctly none of the inodes are in use, but there are
>> some data blocks in use (in fact, most of it).
>>
>> Suggestions/pointers on how to get those blocks re-allocated elsewhere?
> I'd suggest trying to see if you can use debugfs's icheck command to
> take a block number (such as 17179353120, for example, just beyond the
> inode table) and map it to an inode number:
>
> debugfs: icheck 17179353120
So to get in-use block numbers I used:

debugfs/debugfs /dev/lvm/oldhome -R "testb 17179836416 32768" >
group524287blocks.txt

This runs reasonably quickly, at around 12 minutes.  The icheck command
can take multiple block numbers, but I don't know whether that is faster
than checking them one at a time.  Passing all of them results in a
command line exceeding 64KB, which fails to execute.  If batching is
confirmed to be faster I could pick some sequencing mechanism to run
multiple blocks at a time while keeping the CLI just under 64KB; see the
sketch below.  The first block (17179836416) is marked in use, but
icheck can't find it, since it is the group descriptor block - I
realized I wasted a lot of time on that one.  Suggestion: icheck should
first check whether the block is the group descriptor for the containing
group and say so in its output.  icheck could also, like testb, take a
block sequence (start and count), but since it already accepts multiple
block numbers that is not a backwards-compatible change unless care is
taken; perhaps a syntax of <block number> [+<number of blocks>] ...
would work.  From a quick check of the code it looks like it should be
faster to scan for multiple blocks at a time.
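
As a rough, untested sketch of such batching (assuming bash, debugfs on
PATH, and the testb output format above; the chunk size of 4000 block
numbers per invocation is an arbitrary guess to stay under the 64KB
limit):

grep 'marked in use' group524287blocks.txt | awk '{print $2}' \
    | xargs -n 4000 sh -c 'debugfs -R "icheck $*" /dev/lvm/oldhome' sh \
    > group524287inodes.txt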

As per the above there were 12 free blocks ... at least there has been progress:

# cat group524287blocks.txt | sed -re 's/^Block [0-9]+ //' | sort | uniq -c
  31397 marked in use
   1371 not in use

icheck, however, is slow on the current run.  The previous run took
around 25 minutes (I assume it just got lucky and found the right inode
quickly).  The current run has been going for more than 12 hours.
>
> This will give you some inode number, perhaps 156789 (for example).
> You can then take the inode number and map it to a pathname:
>
> debugfs: ncheck 156789
>
> You can then mount the file system, copy that file off, and then
> delete it.  Then see what blocks are left as in use.  Lather, Rinse,
> Repeat.  :-)
This one took about 45 minutes on the previous run.
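
For the record, one pass of the lather-rinse-repeat cycle would look
roughly like this (a sketch only; the inode number 156789, the pathname
and the mount point are illustrative, and ncheck's output still has to
be read by hand to get the real pathname):

debugfs -R "ncheck 156789" /dev/lvm/oldhome   # inode -> pathname
mount /dev/lvm/oldhome /mnt/oldhome
cp -a --sparse=always /mnt/oldhome/path/to/file /safe/place/
rm /mnt/oldhome/path/to/file
umount /mnt/oldhome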

Is there any way to mark the blocks that are being freed so that they
are not re-used?  I was contemplating setting them as bad blocks using
fsck, so that I can bring the filesystem online in cycles: let backups
run overnight, and when they are done in the morning, take it offline
and perform the next cycle.  It looks like debugfs's setb is possibly
what I'm after instead (although I'd then need to keep track of which
blocks I've marked this way and which are legitimately in use, so
badblocks is perhaps the better way).  This will result in fsck
complaining that an unused block is marked in use, but since fsck is
non-functional at the moment anyway, that thought doesn't particularly
haunt me.
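
To make the setb idea concrete, a minimal sketch (the block number and
count are purely illustrative, debugfs must be opened read-write with
-w, and I'd keep a side file listing the blocks marked this way so they
can be freed again later):

debugfs -w -R "setb 17179838452 12" /dev/lvm/oldhome
echo "17179838452 12" >> manually-marked-blocks.txt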

For anyone who runs into the same problem: if you've got adequate space
to create a new filesystem and move all of your data, do that.  tar,
rsync and cp all have mechanisms to cope with hard links as well as
sparse files; see the examples below.  Use the tool you know best, and
make sure you enable those options if they are not on by default.
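
For example (hedged - double-check these flags against the man pages of
your versions):

rsync -aHSX /mnt/oldhome/ /mnt/newhome/     # -H hard links, -S sparse
tar -C /mnt/oldhome --sparse -cf - . | tar -C /mnt/newhome -xf -
cp -a --sparse=always /mnt/oldhome/. /mnt/newhome/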

Kind Regards,
Jaco

