lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20110607103252.GA15140@infradead.org>
Date:	Tue, 7 Jun 2011 06:32:52 -0400
From:	Christoph Hellwig <hch@...radead.org>
To:	Drunkard Zhang <gongfan193@...il.com>
Cc:	Alex Elder <aelder@....com>, xfs-masters@....sgi.com,
	xfs@....sgi.com, linux-kernel@...r.kernel.org
Subject: Re: bug in xfs: can't recovery metadata log

On Tue, Jun 07, 2011 at 01:20:23PM +0800, Drunkard Zhang wrote:
> The log recovery failure happened after a hard reboot, I did "mount
> /dev/lg/log /mnt/temp/" twice, but the similar dmesg error.
> 
> The xfs lives on LVM, with 4x2TB SATA II disk.
> 
> The first time:
> [ 1479.130446] XFS mounting filesystem dm-0
> [ 1479.226525] Starting XFS recovery on filesystem: dm-0 (logdev: internal)
> [ 1506.217842] BUG: unable to handle kernel NULL pointer dereference
> at 00000000000000f8

[...]

> [ 1506.220989] RIP: 0010:[<ffffffff81235f9c>]  [<ffffffff81235f9c>]
> xfs_cmn_err+0x6b/0x92


[...]

> [ 1506.226301]  [<ffffffff8122922b>] ? kmem_zone_zalloc+0x1f/0x30
> [ 1506.226549]  [<ffffffff812098b5>] xfs_error_report+0x39/0x40
> [ 1506.226805]  [<ffffffff811e8340>] ? xfs_free_extent+0x8e/0xae
> [ 1506.227056]  [<ffffffff811e75cf>] xfs_free_ag_extent+0x3e7/0x70b
> [ 1506.227306]  [<ffffffff811e8340>] xfs_free_extent+0x8e/0xae

It looks like you hit one of the XFS_WANT_CORRUPTED_GOTO checks in
xfs_error_report, and we hit something in there that isn't initialized
that early during the mount process.  My guess it's actually the
mp->m_fsname dereference in xfs_fs_vcmn_err.  It's fixed by the message
rework in 2.6.39+, but that will only prevent the crash, you'll still
get an error and the log recovery will be aborted.  If you can get a
more recent kernel on the box I'd be curious what the output form it is.

Did you run older kernels on this machine before?  Before 2.6.33 device
mapper support for barriers (aka cache flushes) was incomplete and
frequently led to free space corruption if people left the volatile
write caches on.  For MD underneath it event took a bit longer.

If you just want to continue using the filesystem you can nuke the
log using xfs_repair -L.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ