lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090928212838.GS12922@hexapodia.org>
Date:	Mon, 28 Sep 2009 14:28:38 -0700
From:	Andy Isaacson <adi@...apodia.org>
To:	Theodore Tso <tytso@....edu>, linux-kernel@...r.kernel.org,
	linux-ext4@...r.kernel.org
Subject: Re: hard lockup, followed by ext4_lookup: deleted inode referenced: 524788

On Mon, Sep 28, 2009 at 04:25:07PM -0400, Theodore Tso wrote:
> On Mon, Sep 28, 2009 at 12:16:44PM -0700, Andy Isaacson wrote:
> > After a hard lockup and reboot, my test box (running recent Linus git
> > 851b147) came up with:
> > 
> > [    5.016854] EXT4-fs (sda1): mounted filesystem with ordered data mode
> > [    8.809125] EXT4-fs (sda1): internal journal on sda1:8
> > [   10.165239] EXT4-fs error (device sda1): ext4_lookup: deleted inode referenced: 524788
> > [   10.165286] Aborting journal on device sda1:8.
> > [   10.168111] EXT4-fs error (device sda1): ext4_journal_start_sb: Detected aborted journal
> > [   10.168169] EXT4-fs (sda1): Remounting filesystem read-only
> > [   10.171614] EXT4-fs (sda1): Remounting filesystem read-only
> 
> It would be useful to see what pathname is associated with inode 524788.

debugfs:  ncheck 524788
Inode   Pathname
524788  /etc/rcS.d/S90mountdebugfs
debugfs:  cd /etc/rcS.d
debugfs:  ls
 532482  (12) .    16386  (12) ..    526699  (16) README
 524349  (28) S06keyboard-setup    524350  (24) S13pcmciautils
 524351  (20) S25brltty    524788  (52) S90mountdebugfs
 524354  (36) S37apparmor    524356  (24) S49console-setup
 524357  (20) S55urandom    524358  (28) S70screen-cleanup
 524359  (24) S70x11-common    524360  (3800) S75policykit
debugfs:  stat S90mountdebugfs
Inode: 524788   Type: regular    Mode:  0644   Flags: 0x80000
Generation: 1027808588    Version: 0x00000000:00000001
User:     0   Group:     0   Size: 0
File ACL: 0    Directory ACL: 0
Links: 0   Blockcount: 0
Fragment:  Address: 0    Number: 0    Size: 0
 ctime: 0x4ab94c5d:3dd78a78 -- Tue Sep 22 15:14:53 2009
 atime: 0x4ab949a2:58503f24 -- Tue Sep 22 15:03:14 2009
 mtime: 0x4ab94c5d:3dd78a78 -- Tue Sep 22 15:14:53 2009
crtime: 0x4ab7bf7b:2589fbd4 -- Mon Sep 21 11:01:31 2009
dtime: 0x4ab94c5d -- Tue Sep 22 15:14:53 2009
Size of extra inode fields: 28
EXTENTS:
(END)

> > 2. after a lockup the journal recovery should not fail.
> 
> I'm not sure it was a matter of the journal recovery failing.  All we
> know for certain is that filesystem was corrupted after the lockup and
> remounting the filesystem.  What caused the file system corruption is
> open to question at the moment; it could have been caused by the
> lockup; or it could have been a file that was deleted right about the
> time of the lockup; or it could have been some completely random
> filesystem corruption that.
> 
> It would be useful to know whether the inode was in question was
> supposed to have been deleted.  If it was, it would be useful to know
> if the dtime reported by debugfs's stat was around the time of the
> original lockup.  

/etc/init.d/mountdebugfs is also present in the directory but has dtime
set.  mountdebugfs is part of blktrace, which I installed ("apt-get
install blktrace") at least 20 minutes prior to the lockup; I'd
run blktrace, gotten incomprehensible errors, figured out I needed
CONFIG_FTRACE and the relevant block IO tracing options to use blktrace,
reconfigured the kernel, and was rebuilding when it hung.

Hmmm, /var/log/dpkg.log has mtime Sep 23 18:45:48 and is missing today's
activity (blktrace and ncurses5-dev).

/var/cache/apt/archive/blktrace_1.0.1-2_amd64.deb also has dtime set.

fsck.ext4 has a lot to say -- looks like something bad happened Sep 22
and there's been corruption lurking since.  I used the machine a fair
bit on 9/23 though (including installing some packages), and didn't see
any problems.

I've attached the complete output from "fsck -n /dev/sda1" and "stat
<%d>" on each inode reported to be deleted.

-andy

View attachment "fsck.out" of type "text/plain" (75660 bytes)

View attachment "debugfs-deleted-inodes.out" of type "text/plain" (63999 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ