linux-ext4 - Re: [PATCH] e2fsck: track errors/badness found for each inode

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <20180623135218.GA4852@thunk.org>
Date:   Sat, 23 Jun 2018 09:52:18 -0400
From:   "Theodore Y. Ts'o" <tytso@....edu>
To:     Artem Blagodarenko <artem.blagodarenko@...il.com>
Cc:     linux-ext4@...r.kernel.org, adilger.kernel@...ger.ca,
        alexey.lyashkov@...il.com,
        Andreas Dilger <andreas.dilger@...el.com>,
        Girish Shilamkar <girish@...sterfs.com>
Subject: Re: [PATCH] e2fsck: track errors/badness found for each inode

On Fri, Jun 22, 2018 at 02:52:38PM +0300, Artem Blagodarenko wrote:
> From: Andreas Dilger <andreas.dilger@...el.com>
> 
> The present e2fsck code checks the inode, per field basis.  It
> doesn't take into consideration to total sanity of the inode.
> This may cause e2fsck turning a garbage inode into an apparently
> sane inode ("It is a vessel of fertilizer, and none may abide
> its strength.").
> 
> The following patch adds a heuristics to detect the degree of
> badness of an inode. icount mechanism is used to keep track of
> the badness of every inode.  The badness is increased as various
> fields in inode are found to be corrupt.  Badness above a certain
> threshold value results in deletion of the inode.  The default
> badness threshold value is 7, it can be specified to e2fsck
> using "-E inode_badness_threshold=<value>"

The main thing I don't like about this patch, as I've stated before,
is that it spreads out the badness calculation all over a large amount
of code.  If we want to do something like this, what I think makes
more sense is to have a libext2fs function which calculates a badness
score, and which is called in pass1 --- and if it exceeds some
threshold, we can offer to clear it right then and there.

However, I had already implemented a while back something which
evaluates the sanity of the inode table block on a holistic fashion
--- see the check_inode_block_sanity() function in lib/ext2fs/inode.c.
I think this is a superior approach since the primary way that e2fsck
gets insane inodes is when an inode table block is scrambled.  At the
moment the code only evaluate inode sanity by looking at i_blocks.

So if you have evidence that the current hueristics in the inode_scan
code is not sufficient, what I would propose is this.  We can create a
new function, ext2fs_inode_check_sanity() which evaluates an inode and
returns a badness value.  We can then use it in the inode_scan
functions with a lower threshold, and if more than 50% of the inodes
are bad, we declare the entire inode table bad.  We can also use a
higher threshold in pass 1, where if the sanity exceeds that
threshold, we can offer to the user to clear the inode up front.

Fair enough?

>  tests/f_ind_inode_collision/expect.2 |  18 ++++-

By the way, this is something which indicates that proposed patch is
not ready for submission.  The standard we set for e2fsck is that it
should repair all file system damage in a single pass.  If the
resulting file system after and e2fsck -fy pass results in corruptions
detected during a second pass of e2fsck, that's considered a bug that
should be fixed.  So to the extent that e2fsck can no longer fix all
of the corruptions in the f_ind_inode_collision test, that's a
regression.

Cheers,

					- Ted