linux-kernel - Re: 2.6.38.8 kernel bug in XFS or megaraid driver with heavy I/O load

Message-ID: <20111011141338.GA11808@otto.nzcorp.net>
Date:	Tue, 11 Oct 2011 16:13:38 +0200
From:	Anders Ossowicki <aowi@...ozymes.com>
To:	Christoph Hellwig <hch@...radead.org>
CC:	<linux-kernel@...r.kernel.org>, <aradford@...il.com>,
	<xfs@....sgi.com>
Subject: Re: 2.6.38.8 kernel bug in XFS or megaraid driver with heavy I/O load

On Tue, Oct 11, 2011 at 03:34:48PM +0200, Christoph Hellwig wrote:
> This is core VM code, and operates purely on on-stack variables except
> for the page cache radix tree nodes / pages.  So this either could be a
> core VM bug that no one has noticed yet, or memory corruption.  Can you
> run memtest86 on the box?

Unfortunately not, as it is a production server. Pulling it out of service to
memtest 256G properly would take too long. But memory corruption seems unlikely
to me. The machine has been running with the same (ECC) memory for more than a
year, and neither the service processor nor the kernel (according to dmesg) has
caught anything before this. It would be a rare (though I admit not impossible)
coincidence if we got catastrophic, undetected memory corruption a week after
attaching a new RAID controller with a new disk array.
-- 
Anders Ossowicki
