linux-kernel - Re: [3.10] Oopses in kmem_cache_allocate() via prepare_creds()


Message-ID: <CA+55aFxmVteJApiC5e=d9NmYO6PSbNertq5ipLpunmVreCTYUQ@mail.gmail.com>
Date:	Tue, 26 Nov 2013 15:16:09 -0800
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	Simon Kirby <sim@...tway.ca>
Cc:	Ian Applegate <ia@...udflare.com>,
	Al Viro <viro@...iv.linux.org.uk>,
	Christoph Lameter <cl@...two.org>,
	Pekka Enberg <penberg@...nel.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Chris Mason <chris.mason@...ionio.com>
Subject: Re: [3.10] Oopses in kmem_cache_allocate() via prepare_creds()

On Mon, Nov 25, 2013 at 4:44 PM, Simon Kirby <sim@...tway.ca> wrote:
>
> I was hoping this or something else by 3.12 would have fixed it, so after
> testing we deployed this everywhere and turned off the rest of the debug
> options. I missed slub_debug on one server, though...and it just hit
> another case of overwritten poison.

Your thing is *very* consistent, it's once more four bytes into that
pipe-info. And it's once more that exact same "increment second word
in the allocation" pattern.
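
For illustration, a minimal userspace sketch of how a poison check can pin a stray increment to a particular word. It is not the kernel's slub code. The byte values 0x6b and 0xa5 match the kernel's POISON_FREE and POISON_END; the helper names (poison_object, first_corrupt_offset, stray_increment) are hypothetical and exist only for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POISON_FREE 0x6b  /* fill byte for freed objects */
#define POISON_END  0xa5  /* marker in the last byte of the object */

/* Fill a freed object with the poison pattern (hypothetical helper). */
void poison_object(unsigned char *obj, size_t size)
{
    if (size == 0)
        return;
    memset(obj, POISON_FREE, size - 1);
    obj[size - 1] = POISON_END;
}

/* Return the offset of the first byte that no longer matches the
 * poison pattern, or -1 if the object is intact. */
long first_corrupt_offset(const unsigned char *obj, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        unsigned char want = (i == size - 1) ? POISON_END : POISON_FREE;
        if (obj[i] != want)
            return (long)i;
    }
    return -1;
}

/* Model the reported pattern: something increments one 32-bit word
 * of an object that has already been freed and poisoned. */
void stray_increment(unsigned char *obj, size_t word_index)
{
    uint32_t w;
    memcpy(&w, obj + word_index * 4, sizeof(w));
    w++;
    memcpy(obj + word_index * 4, &w, sizeof(w));
}
```

Poisoning a 192-byte buffer and calling stray_increment(obj, 1) makes first_corrupt_offset() report an offset inside bytes 4..7, i.e. the second word; on a little-endian machine that is offset 4.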

But the fact that it goes away when you enable other debug options is
really really annoying. It's consistent only if you don't have the
options on that might help us debug it further. Damn.

> Is it true that with slub_debug, aliasing of equal-sized objects is
> turned off, and so they shouldn't be immediately side-by-side? In other
> words, would there be similar scrawling victim chances as allocating
> pipe_inode_info with pages instead of slabs? "slabinfo -a" is empty.

So the thing is, with slub debugging, slub shouldn't be merging
different slab caches.

HOWEVER.

The pipe-info structure isn't using its own slab cache, it's just
using "kmalloc()". So it by definition will merge with all other
kmalloc() allocations of the same size (or, to be exact, of "similar
enough size to hit the same size bucket"). In your case it's the
192-byte-sized bucket.
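
To make the size-bucket point concrete, here is a rough userspace model, not kernel code. It assumes the common kmalloc size classes (powers of two from 8 upward, plus the intermediate 96 and 192 classes); the real set depends on kernel version, architecture and config, and the function names are made up for this sketch.

```c
#include <stddef.h>

/* Hypothetical model of kmalloc size classes. Assumed layout: powers of
 * two from 8 upward, plus 96 and 192. Real kernels may differ. */
static const size_t size_classes[] = {
    8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192
};

/* Return the smallest modelled size class that fits `size`,
 * or 0 if the request is larger than every class in the table. */
size_t kmalloc_bucket_model(size_t size)
{
    size_t n = sizeof(size_classes) / sizeof(size_classes[0]);

    for (size_t i = 0; i < n; i++) {
        if (size <= size_classes[i])
            return size_classes[i];
    }
    return 0;
}

/* Two requests come from the same cache when they map to the same
 * class, regardless of what type the caller stores in the memory. */
int shares_bucket(size_t a, size_t b)
{
    size_t bucket = kmalloc_bucket_model(a);

    return bucket != 0 && bucket == kmalloc_bucket_model(b);
}
```

Under this model any request from 129 to 192 bytes lands in the 192-byte class, so shares_bucket(160, 192) is true while shares_bucket(128, 192) is false. That is the sense in which a kmalloc()ed object can sit next to unrelated allocations of a similar size.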

But all the debugging code talks purely about pipe_info allocations -
both the previous kmalloc/kfree _and_ the kmalloc() that actually sees
the slub debugging error. So if it's mixing with something else, I'm
not seeing what that would be. It would have to be an older allocation
(as in "it got re-allocated to a pipe in between") or another type
that was of a similar size.

Which doesn't look all that likely. Not when your problems are so
consistent, and seem to be *always* about that pipe_inode_info.

But dammit, that is such a simple set of allocations. I still don't
see how they could be to blame. And if it's some suspicious access to
the pipe mutex (that second word is still the "wait_lock" spinlock in
the pipe inode mutex) I really would have expected the mutex debugging
to have screamed loudly. Or the DEBUG_PAGEALLOC.

I'm really not very happy with the whole pipe locking logic (or the
refcounting we do, separately from the "struct inode"), and in that
sense I'm perfectly willing to blame that code for doing bad things.
But the fact that it all goes away with debugging makes me very very
unhappy.

                Linus