linux-kernel - Re: [OOPS s390] Unable to handle kernel pointer dereference at virtual kernel address (null)

Message-ID: <20110418115141.GA3157@osiris.boeblingen.de.ibm.com>
Date:	Mon, 18 Apr 2011 13:51:41 +0200
From:	Heiko Carstens <heiko.carstens@...ibm.com>
To:	Jan Glauber <jang@...ux.vnet.ibm.com>
Cc:	Jonathan Nieder <jrnieder@...il.com>, linux-s390@...r.kernel.org,
	Stephen Powell <zlinuxman@...way.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [OOPS s390] Unable to handle kernel pointer dereference at
 virtual kernel address (null)

On Mon, Apr 18, 2011 at 10:45:11AM +0200, Jan Glauber wrote:
> On Fri, Apr 15, 2011 at 08:48:40PM -0500, Jonathan Nieder wrote:
> > Hi,
> > 
> > Here's an oops that was reported to Debian[1].  It cannot be
> > reproduced on demand but it is reproducible with enough time.  It did
> > not appear on v2.6.32; it does appear on Debian 2.6.38-3 (which is
> > based on gregkh's v2.6.38.2) and pristine v2.6.39-rc3, so looks like
> > a regression.

It's probably easily reproducible if you put enough memory pressure on
the whole VM system, since this triggers a bug in the pfault code.

> > > [ 2698.053263] Kernel panic - not syncing: Fatal exception in interrupt
> > > [ 2698.053316] CPU: 0 Tainted: G      D      2.6.38-2-s390x #1
> > > [ 2698.053502] Process cron (pid: 1106, task: 000000001f962f78, ksp: 000000001fa0f9d0)
> > > [ 2698.053516]        0000000000000000 000000001ff7fa70 0000000000000002 0000000000000000
> > > [ 2698.053539]        000000001ff7fb10 000000001ff7fa88 000000001ff7fa88 0000000000397b9e
> > > [ 2698.053576]        0000000000000001 0000000000000000 000000001ff03280 0000000000000000
> > > [ 2698.053623]        0000000000000008 0000000000000000 000000000000000e 0000000000000078
> > > [ 2698.053674]        000000001ff7faf0 0000000000011b36 000000001ff7fa70 000000001ff7fab8
> > > [ 2698.053740] Call Trace:
> > > [ 2698.053762] ([<0000000000011a60>] show_trace+0x5c/0xa4)
> > > [ 2698.053801]  [<00000000003979de>] panic+0x9e/0x214
> > > [ 2698.054443]  [<0000000000012046>] die+0x15e/0x170
> > > [ 2698.054485]  [<000000000002c5d6>] do_no_context+0xd6/0xe0
> > > [ 2698.054529]  [<000000000002cd52>] do_protection_exception+0x46/0x2a0
> > > [ 2698.054577]  [<000000000039b208>] pgm_exit+0x0/0x4
> > > [ 2698.054627]  [<000000000002c03e>] pfault_interrupt+0xa2/0x138
> > > [ 2698.054679] ([<000000001f962f78>] 0x1f962f78)
> > > [ 2698.056408]  [<000000000001acda>] do_extint+0xf6/0x138
> > > [ 2698.056424]  [<000000000039b6ca>] ext_no_vtime+0x30/0x34
> > > [ 2698.056439]  [<000000007d706e04>] 0x7d706e04
> > > HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 0001DE26
> > [...]
> > 
> > > On Thu, 14 Apr 2011 21:48:56 -0400 (EDT), Stephen Powell wrote:
> > 
> > >> The problem appears to be fixed in the latest vanilla upstream kernel
> > >> source, which at the time of this writing is 2.6.39-rc3.
> > >> ...
> > >
> > > Oops!  I spoke too soon.  I checked the server before I went to bed
> > > last night, and it was still up at that time; but when I got up this
> > > morning I checked it again, and it had crashed during the night with
> > > the same protection exception at the same offset in the same function.
> > > That's the trouble with these kind of bugs.
> > 
> > Ideas?

That's a bug in the pfault interrupt code. A cleanup patch which
simplified lowcore accesses left behind a dereference which shouldn't
be there: the parameter passed to the handler already is the token
(the address of the task structure), not a pointer to it.
The patch below should fix it. The bug was introduced with 2.6.37-rc1.

diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index 9217e33..4cf85fe 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -558,9 +558,9 @@ static void pfault_interrupt(unsigned int ext_int_code,
 	 * Get the token (= address of the task structure of the affected task).
 	 */
 #ifdef CONFIG_64BIT
-	tsk = *(struct task_struct **) param64;
+	tsk = (struct task_struct *) param64;
 #else
-	tsk = *(struct task_struct **) param32;
+	tsk = (struct task_struct *) param32;
 #endif
 
 	if (subcode & 0x0080) {
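
For illustration, here is a minimal userspace sketch of why the extra
dereference is wrong. It is not kernel code: struct fake_task,
token_to_task(), token_to_task_buggy() and run_demo() are hypothetical
names, and the struct layout is an assumption made only for this
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct task_struct; not the kernel layout. */
struct fake_task {
    struct fake_task *first_word;  /* whatever the struct starts with */
    int pid;
};

/* Fixed form: the token already is the address of the task. */
static struct fake_task *token_to_task(uintptr_t token)
{
    return (struct fake_task *) (void *) token;
}

/*
 * Buggy form: treats the token as a pointer to a task pointer, so it
 * loads the first word of the task and returns that instead.
 */
static struct fake_task *token_to_task_buggy(uintptr_t token)
{
    return *(struct fake_task **) (void *) token;
}

/* Returns 0 if both forms behave as described above. */
static int run_demo(void)
{
    struct fake_task t = { .first_word = NULL, .pid = 1106 };
    uintptr_t token = (uintptr_t) (void *) &t;

    assert(token_to_task(token) == &t);
    assert(token_to_task_buggy(token) == t.first_word);
    return 0;
}
```

Calling run_demo() from a main() exercises both forms. In this sketch
the first word is zero, so the buggy form hands back a NULL task
pointer, and any later access through it would fault. What the first
word holds in a real kernel depends on the actual struct layout.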