Message-ID: <20140623173151.GD14887@thunk.org>
Date: Mon, 23 Jun 2014 13:31:51 -0400
From: Theodore Ts'o <tytso@....edu>
To: Killian De Volder <killian.de.volder@...rlet.be>
Cc: linux-ext4@...r.kernel.org
Subject: Re: Recovery after mkfs.ext4 on a ext4
On Mon, Jun 23, 2014 at 06:37:20PM +0200, Killian De Volder wrote:
> On 23-06-14 14:37, Theodore Ts'o wrote:
> > On Mon, Jun 23, 2014 at 08:09:37AM +0200, Killian De Volder wrote:
> >> It's still checking due to the high amount of ram it's using.
> >> However, if I start a parallel check with -nf, will it find other errors that the one with the high memory usage hasn't found yet?
> > No, definitely not that! Running two e2fsck's in parallel will do far
> > more harm than good.
> "In parallel" is a big word: the check/repair run is SOOO slow that it might as well have been killed by the time the second (read-only) test is done.
> I once had an OOM because too much ZRAM was allocated; after I restarted e2fsck, it found more errors before going into massive RAM usage.
> So I was wondering what would happen if I restarted it.
> >
> >> Should I start a new one, or is this not advised ?
> >> As sometimes I think it's bad inodes causing artificial usage of memory.
> > What part of the e2fsck run are you in? If you are in passes
> > 1b/1c/1d, then one of the things you can do is to analyze the log
> Pass 1: Checking inodes, blocks, and sizes
> Nothing else below this except things like:
>
> Too many illegal blocks in inode 488.
> Clear inode<y>? yes
Does it stop after one of these messages without displaying anything
else? Or does it just continue emitting a large number of these
messages? And is the time between each one getting longer and longer?
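
If it helps to answer those questions, here is a rough sketch in Python (it is not part of e2fsprogs). It assumes you captured the e2fsck output with a timestamp in seconds prepended to each line by some external tool; that "<seconds> <message>" layout and all of the function names are assumptions made for illustration. It pulls out the "Too many illegal blocks" messages and checks whether the gaps between them are growing.

```python
import re

# Matches the message quoted above. The leading timestamp is an assumed
# prefix added by whatever captured the log, not by e2fsck itself.
BAD_INODE = re.compile(
    r"^(\d+(?:\.\d+)?)\s+Too many illegal blocks in inode (\d+)\."
)

def bad_inode_events(lines):
    """Return a list of (timestamp, inode_number), one per matching line."""
    events = []
    for line in lines:
        m = BAD_INODE.match(line)
        if m:
            events.append((float(m.group(1)), int(m.group(2))))
    return events

def gaps(events):
    """Seconds elapsed between consecutive bad-inode messages."""
    return [b[0] - a[0] for a, b in zip(events, events[1:])]

def slowing_down(events):
    """True if the later half of the gaps is longer on average than the earlier half."""
    g = gaps(events)
    if len(g) < 4:
        return False  # too few samples to say anything
    half = len(g) // 2
    early, late = g[:half], g[half:]
    return sum(late) / len(late) > sum(early) / len(early)
```

The number of events tells you whether you are looking at a handful or gazillions of bad inodes, and slowing_down() is only a crude signal for "longer and longer".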
We do actually keep a linked list of these inode numbers so we can try
to report a directory name so you know which file has been trashed.
This happens in pass #2, so the inodes which are invalid are stored in
pass #1 and only removed in pass #2.
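
To make the memory behaviour concrete, here is a toy model in Python. It is not the actual e2fsck code or its data structures; the function names and the dictionary-based "directories" are made up for illustration. The point is only that every bad inode found in pass 1 stays in memory until pass 2 gets a chance to attach a name to it.

```python
def pass1(inodes, is_corrupt):
    """Scan every inode and remember the corrupt ones for later."""
    bad = []                      # stand-in for the list described above
    for ino in inodes:
        if is_corrupt(ino):
            bad.append(ino)       # held in memory until pass 2 runs
    return bad

def pass2(directories, bad):
    """Walk directory entries and report a path for each bad inode.

    directories maps a directory path to {entry_name: inode_number}.
    Returns (reported, pending): paths that were found, and bad inodes
    that no directory entry pointed at.
    """
    pending = set(bad)
    reported = {}
    for dirname, entries in directories.items():
        for name, ino in entries.items():
            if ino in pending:
                reported[ino] = dirname.rstrip("/") + "/" + name
                pending.discard(ino)   # only now is the entry dropped
    return reported, pending
```

In this model the list grows by one entry per bad inode and nothing shrinks it before pass 2 starts, so a huge number of bad inodes means a correspondingly large amount of memory held through the whole of pass 1.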
So if you are seeing gazillions of bad inodes, that could very easily
be what's going on. If so, I can imagine having some mode that we
enter after a hundred inodes where we just ask permission to blow away
all of the corrupted inodes in pass #1, without waiting until we can
give you a proper pathname.
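
Purely as a sketch of what such a mode might look like (nothing like this exists in e2fsck as far as this message goes; the threshold, the prompt text, and the ask() callback are all hypothetical), in Python:

```python
BULK_THRESHOLD = 100  # "after a hundred inodes"; the exact number is arbitrary

def handle_bad_inodes(bad_inodes, ask, threshold=BULK_THRESHOLD):
    """Split bad inodes into those cleared at once and those deferred.

    ask(prompt) is a hypothetical callback that returns True for "yes".
    Returns (cleared_now, deferred).
    """
    cleared_now, deferred = [], []
    bulk = None  # None = not asked yet; True/False = the user's answer
    for ino in bad_inodes:
        if bulk is None and len(deferred) >= threshold:
            bulk = ask("Many corrupted inodes found; "
                       "clear the rest without looking up names?")
        if bulk:
            cleared_now.append(ino)   # no pathname lookup, nothing kept
        else:
            deferred.append(ino)      # kept so a name can be reported later
    return cleared_now, deferred
```

In this version the first hundred inodes are still deferred so that they can get a proper pathname, and the question is asked only once. Whether the inodes already deferred should also be blown away at that point is a design choice this sketch leaves open.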
The other possibility is that a particular inode is so badly
corrupted that we're looping while trying to evaluate it.
That's why I'm asking whether e2fsck has just stopped and is not
printing any more messages, in what might be an apparent infinite loop.
- Ted