linux-ext4 - Re: 2.6.25-git2: BUG: unable to handle kernel paging request at ffffffffffffffff

Message-ID: <alpine.LFD.1.10.0804211030450.2779@woody.linux-foundation.org>
Date:	Mon, 21 Apr 2008 10:48:08 -0700 (PDT)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Jiri Slaby <jirislaby@...il.com>
cc:	"Rafael J. Wysocki" <rjw@...k.pl>,
	LKML <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-ext4@...r.kernel.org,
	Herbert Xu <herbert@...dor.apana.org.au>,
	"Paul E. McKenney" <paulmck@...ibm.com>,
	"David S. Miller" <davem@...emloft.net>
Subject: Re: 2.6.25-git2: BUG: unable to handle kernel paging request at
 ffffffffffffffff

On Mon, 21 Apr 2008, Jiri Slaby wrote:
> 
> BTW, I haven't seen this without a suspend/resume cycle; have you, Rafael? That
> doesn't mean much, since it takes longer to trigger otherwise, but anyway, it
> might be a clue.

There's a separate (and very different-looking) bug-report about the atl1 
driver having problems when doing an "ifconfig down" on it. In fact, the 
problem report says:

> With this commit in tree, I can reproduce either
> a) kmalloc-2048 corruption after initscripts shutdown eth0
>         http://marc.info/?l=linux-kernel&m=120820360221261&w=2
> 
> b) or oopses at filp_close() first reported long ago
>         (sorry, can't find that email)

where that "or oopses at filp_close()" thing is somewhat interesting, 
since your original bug was about something that looked like file pointer 
corruption.

Now, I doubt you have an ATL chip, and I doubt the two are _really_ 
related in any way (the ATL bug was actually triggered by enabling 64-bit 
DMA), but the filp_close thing makes me go "hmm".

The two corrupted SLUB areas were the 2kB allocation (1560-byte 
ethernet packets plus skb_shared_info overhead, anyone?) and apparently 
the one that filps live in (perhaps a 20-byte TCP ACK packet or other 
"small" packet + the skb_shared_info overhead would be a common case that 
might land in that 200-byte range?)
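
To make the size-class guess above concrete, here is a minimal sketch of 
the arithmetic. It assumes a general-purpose slab layout of power-of-two 
size classes plus 96 and 192 bytes; the table, the helper names and the 
overhead values are hypothetical illustrations, not values taken from the 
kernel source, since the real skb_shared_info size depends on the kernel 
version and configuration.

```c
#include <stddef.h>

/* ASSUMPTION: illustrative size classes (powers of two plus 96 and 192).
 * The real set depends on kernel version, architecture and config. */
static const size_t size_classes[] = {
    8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096
};

/* Return the smallest assumed size class that can hold `bytes`,
 * or 0 if the request is larger than every class in the table. */
static size_t size_class_for(size_t bytes)
{
    size_t n = sizeof(size_classes) / sizeof(size_classes[0]);

    for (size_t i = 0; i < n; i++) {
        if (bytes <= size_classes[i])
            return size_classes[i];
    }
    return 0;
}

/* Size class for a packet buffer: payload plus a caller-supplied
 * overhead. The overhead is a HYPOTHETICAL stand-in for headroom,
 * alignment and skb_shared_info; measure it on the kernel in question. */
static size_t buffer_class(size_t payload, size_t overhead)
{
    return size_class_for(payload + overhead);
}
```

With a made-up overhead of 300 bytes, buffer_class(1560, 300) lands in the 
2048-byte class, and with a made-up overhead of 170 bytes, 
buffer_class(20, 170) lands in the 192-byte class. That is the kind of 
overlap with the two corrupted areas being speculated about here, nothing 
more.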

Maybe the ATL bug isn't ATL-specific at all, but somehow connected to 
NETIF_F_HIGHDMA. Do you have 4GB+ of RAM?

And one thing that suspend/resume does, which is not necessarily commonly 
done during normal operation, is that ifconfig down/up pattern. Maybe 
there is something broken in general there?

			Linus