Message-ID: <20081223003051.GA26813@one.firstfloor.org>
Date:	Tue, 23 Dec 2008 01:30:51 +0100
From:	Andi Kleen <andi@...stfloor.org>
To:	Vegard Nossum <vegard.nossum@...il.com>
Cc:	Brad Campbell <brad@...p.net.au>,
	Manfred Spraul <manfred@...orfullife.com>,
	Andi Kleen <andi@...stfloor.org>,
	lkml <linux-kernel@...r.kernel.org>
Subject: Re: BUG() in 2.6.28-rc8-git2 under heavy load

> 1. The CPU reported the wrong faulting instruction (seems highly

I remember spending quite some time on a report a few years ago
and in the end decided the CPU in that case was reporting incorrect
fault addresses too. iirc we blamed it on overheating or some
unspecified hardware damage.

> unlikely, since that means it wouldn't be able to resume properly in
> other situations),
> 2. We really were executing at a slightly strange (offset) EIP
> 
> I'm going for #2. But how could it happen? Did the caller supply a
> wrong address in its CALL? It seems unlikely. Why would it happen only
> for this function, four times in a row, at the exact same location?
> Was the caller's code corrupted?

There are a couple of possibilities: someone corrupted a pointer
on the stack or in a structure containing function pointers.

On x86-64 there's another trap: if you call a function that is
declared varargs (with "...") through a prototype that doesn't
contain the "...", it can also jump to random addresses due to
the way gcc handles varargs. Normally we have very few varargs
functions in the kernel so it's unlikely, but I've seen
the problem in userland.
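
A minimal sketch of the prototype mismatch described above, in userland C.
The names sum_ints, sum_fn_ok, sum_fn_bad and call_correctly are
hypothetical and only serve the illustration. The background, as far as I
understand it, is that the x86-64 SysV ABI has the caller pass the number
of vector registers used in %al for variadic calls, and older gcc
prologues used that value for a computed jump when saving the SSE argument
registers. A caller that sees a prototype without "..." never sets %al, so
the callee can end up jumping somewhere unintended. Only the correct call
is executed here; the mismatched type is shown but never used for a call,
since that would be undefined behaviour.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical variadic function: sums 'count' int arguments. */
int sum_ints(int count, ...)
{
	va_list ap;
	int total = 0;
	int i;

	assert(count >= 0);

	va_start(ap, count);
	for (i = 0; i < count; i++)
		total += va_arg(ap, int);
	va_end(ap);

	return total;
}

/*
 * Correct: the pointer type carries the "...", so the compiler emits
 * the variadic calling sequence (on x86-64 SysV that includes setting
 * %al before the call).
 */
typedef int (*sum_fn_ok)(int, ...);

/*
 * Wrong: same function, but the type has no "...". Calling sum_ints()
 * through a pointer of this type is undefined behaviour; it is declared
 * here only to show what the mismatch looks like and is never called.
 */
typedef int (*sum_fn_bad)(int, int, int, int);

/* Calls the variadic function through the correctly typed pointer. */
int call_correctly(void)
{
	sum_fn_ok fn = sum_ints;

	return fn(3, 1, 2, 3);
}
```

In real code the mismatch usually comes from a missing or stale
declaration in another translation unit rather than from an explicit
cast, which is why it is easy to overlook.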

If it's reproducible, one way to track it down would be to enable
LBR (last branch recording; I have some old patches for that which
could be adapted), but then that would only tell you the caller.

-Andi
-- 
ak@...ux.intel.com
