Message-ID: <20090903192725.GB7378@mit.edu>
Date: Thu, 3 Sep 2009 15:27:25 -0400
From: Theodore Tso <tytso@....edu>
To: david@...g.hm
Cc: Rob Landley <rob@...dley.net>, Pavel Machek <pavel@....cz>,
Ric Wheeler <rwheeler@...hat.com>,
Florian Weimer <fweimer@....de>,
Goswin von Brederlow <goswin-v-b@....de>,
kernel list <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...l.org>, mtk.manpages@...il.com,
rdunlap@...otime.net, linux-doc@...r.kernel.org,
linux-ext4@...r.kernel.org, corbet@....net
Subject: Re: what fsck can (and can't) do was Re: [patch] ext2/3: document
conditions when reliable operation is possible
On Thu, Sep 03, 2009 at 09:56:48AM -0700, david@...g.hm wrote:
> from this discussion (and the similar discussion on lwn.net) there appears
> to be confusion/disagreement over what fsck does and what the results of
> not running it are.
>
> it has been stated here that fsck cannot fix broken data, all it tries to
> do is to clean up metadata, but it would probably help to get a clear
> statement of what exactly that means.
Let me give you my formulation of fsck, which may be helpful.  Fsck
cannot fix broken data; and (particularly in fsck -y mode) it may not
even recover the maximal amount of data lost to metadata corruption.
(This is why an expert using debugfs can sometimes recover more data
than fsck -y, and if you have some really precious data, like ten
years' worth of Ph.D. research that you've never bothered to back
up[1], the first thing you should do is buy a new hard drive, make a
sector-by-sector copy of the disk, and *then* run fsck.  A new
terabyte hard drive costs $100; how much is your data worth to you?)
[1] This isn't hypothetical; while I was at MIT this sort of thing
actually happened more than once --- which brings up the philosophical
question of whether someone who is that stupid about not doing backups
on critical data *deserves* to get a Ph.D. degree. :-)
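(Purely as an illustration of what that sector-by-sector copy step
amounts to, here is a minimal C sketch.  In practice you would just
use dd or ddrescue, which also cope with read errors; the device
paths below are placeholders, not a recommendation.)

    /* Minimal sketch of a raw sector-by-sector copy; a real tool
     * (dd, ddrescue) also handles bad sectors sensibly.  The device
     * paths are placeholders. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        static char buf[1 << 20];   /* 1 MiB copy chunk */
        int in = open("/dev/source_disk", O_RDONLY);
        int out = open("/dev/target_disk", O_WRONLY);
        ssize_t n;

        if (in < 0 || out < 0) {
            perror("open");
            return 1;
        }
        while ((n = read(in, buf, sizeof(buf))) > 0) {
            if (write(out, buf, n) != n) {
                perror("write");
                return 1;
            }
        }
        if (n < 0)
            perror("read");   /* a real tool would skip the bad sector */
        close(out);
        close(in);
        return n < 0 ? 1 : 0;
    }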
Fsck's primary job is to make sure that further writes to the
filesystem, whether you are creating new files or removing directory
hierarchies, etc., will not cause *additional* data loss due to
metadata corruption in the file system.  Its secondary goals are to
preserve as much data as possible, and to make sure that file system
metadata is valid (i.e., that a block pointer contains a valid block
address, so that an attempt to read a file won't cause an I/O error
when the filesystem attempts to seek to a non-existent sector on
disk).
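(As a toy illustration of that secondary goal, here is roughly what a
validity check on a block pointer looks like in miniature.  The struct
and field names are invented for this sketch; they are not e2fsck's
actual data structures.)

    /* Toy illustration of checking that a block pointer is valid
     * before trusting it; the names are invented, not e2fsck's. */
    #include <stdio.h>

    struct toy_sb {
        unsigned long first_data_block;
        unsigned long blocks_count;
    };

    /* A pointer outside the filesystem would make the kernel seek to
     * a non-existent sector and get an I/O error (or worse). */
    static int block_ptr_valid(const struct toy_sb *sb, unsigned long blk)
    {
        return blk >= sb->first_data_block && blk < sb->blocks_count;
    }

    int main(void)
    {
        struct toy_sb sb = { .first_data_block = 1, .blocks_count = 1000 };

        printf("block 42:   %s\n", block_ptr_valid(&sb, 42) ? "ok" : "invalid");
        printf("block 5000: %s\n", block_ptr_valid(&sb, 5000) ? "ok" : "invalid");
        return 0;
    }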
For some filesystems, invalid, corrupt metadata can actually cause a
system panic or oops message, so it's not necessarily safe to mount a
filesystem with corrupt metadata read-only without risking the need
to reboot the machine in question.  More recently, folks have been
filing security bugs when they detect such cases, so there are fewer
examples of them now, but historically it was a good idea to run fsck
because otherwise the kernel might oops or panic when it tripped over
some particularly nasty metadata corruption.
> but if a fsck does not get run on a filesystem that has been damaged,
> what additional damage can be done?
Consider the case where there are data blocks in use by inodes,
containing precious data, but which are marked free in the
filesystem's allocation data structures (e.g., ext3's block bitmaps,
but this applies to pretty much any filesystem, whether it's xfs,
reiserfs, btrfs, etc.).  When you create a new file on that
filesystem, there's a chance that blocks which really contain data
belonging to other inodes (perhaps the aforementioned ten years'
worth of unbacked-up Ph.D. thesis research) will get overwritten by
the newly created file.
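(Here is a toy sketch of that hazard.  The bitmap and "disk" are just
arrays in memory, not ext3's real on-disk structures, but the
allocator logic shows why a block wrongly marked free gets handed out
and overwritten.)

    /* Toy model of a block that is really in use but marked free in
     * the bitmap. */
    #include <stdio.h>
    #include <string.h>

    #define NBLOCKS 8

    static unsigned char bitmap[NBLOCKS];   /* 1 = allocated, 0 = free */
    static char disk[NBLOCKS][16];          /* pretend data blocks */

    /* Naive allocator: hand out the first block the bitmap says is free. */
    static int alloc_block(void)
    {
        for (int i = 0; i < NBLOCKS; i++) {
            if (!bitmap[i]) {
                bitmap[i] = 1;
                return i;
            }
        }
        return -1;
    }

    int main(void)
    {
        strcpy(disk[3], "thesis data");  /* block 3 belongs to an inode... */
        bitmap[3] = 0;                   /* ...but corruption says "free"  */
        bitmap[0] = bitmap[1] = bitmap[2] = 1;

        int blk = alloc_block();         /* hands out block 3 */
        strcpy(disk[blk], "new file");   /* old contents overwritten */

        printf("allocated block %d, now contains: %s\n", blk, disk[blk]);
        return 0;
    }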
Another example is an inode which has multiple hard links, but whose
hard link count is too low.  Now when you delete one of the hard
links, the inode will be released, and the inode and its data blocks
returned to the free pool, despite the fact that it is still
accessible via another directory entry in the filesystem, and despite
the fact that the file contents should be saved.
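(Again as a toy sketch, with invented structures rather than the
kernel's real inode code, this is why an undercounted link count
frees an inode that is still reachable:)

    /* Toy model of an undercounted hard link count. */
    #include <stdio.h>

    struct toy_inode {
        int links_count;  /* should equal the number of directory entries */
        int freed;        /* 1 once the inode and its blocks are released */
    };

    /* Simplified unlink: drop one reference, free everything at zero. */
    static void toy_unlink(struct toy_inode *inode)
    {
        if (--inode->links_count <= 0)
            inode->freed = 1;   /* data blocks go back to the free pool */
    }

    int main(void)
    {
        /* Two directory entries point at this inode, but corruption
         * left the on-disk count at 1. */
        struct toy_inode inode = { .links_count = 1, .freed = 0 };

        toy_unlink(&inode);   /* the user deletes only ONE of the two names */

        printf("inode freed: %s (the other directory entry now dangles)\n",
               inode.freed ? "yes" : "no");
        return 0;
    }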
In the case where you have a block which is claimed by more than one
file, if one of those files is rewritten in place, it's possible that
the newly written file could have its data corrupted, so it's not
just a matter of potential corruption to existing files; newly
created files are at risk as well.
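(One more toy sketch, again with invented structures: two inodes
pointing at the same block means a rewrite through one silently
changes what the other reads back.)

    /* Toy model of one block claimed by two inodes. */
    #include <stdio.h>
    #include <string.h>

    static char disk[4][32];   /* pretend data blocks */

    struct toy_inode {
        int block;             /* single direct block pointer */
    };

    int main(void)
    {
        /* Corruption left both inodes pointing at block 2. */
        struct toy_inode file_a = { .block = 2 };
        struct toy_inode file_b = { .block = 2 };

        strcpy(disk[file_a.block], "file A's original contents");

        /* Rewriting file B in place clobbers file A's data. */
        strcpy(disk[file_b.block], "file B's new contents");

        printf("reading file A now returns: %s\n", disk[file_a.block]);
        return 0;
    }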
> can it overwrite data that could have been saved?
>
> can it cause new files that are created (or new data written to existing,
> but uncorrupted files) to be lost?
>
> or is it just a matter of not knowing about existing corruption?
So it's yes to all of the above: yes, you can overwrite existing data
files; yes, it can cause data blocks belonging to newly created files
to be lost; and no, you won't know about data loss caused by metadata
corruption.  (And again, you also won't know about data loss caused
by corruption to the data blocks themselves.)
- Ted