linux-kernel - Re: 3.6rc6 slab corruption.

Message-ID: <alpine.DEB.2.00.1209191340410.6273@chino.kir.corp.google.com>
Date:	Wed, 19 Sep 2012 14:27:37 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
cc:	Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>,
	Dave Jones <davej@...hat.com>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Srivatsa Vaddagiri <vatsa@...ux.vnet.ibm.com>,
	Suzuki Poulose <suzuki@...ibm.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
	Jeremy Fitzhardinge <jeremy@...p.org>,
	linux-kernel@...r.kernel.org
Subject: Re: 3.6rc6 slab corruption.

On Wed, 19 Sep 2012, Linus Torvalds wrote:

> > Create 350 processes reading /sys/kernel/debug/kvm/spinlocks/histo_blocked
> > file simultaneously in a while loop for more than 3 hours on my box.
> 
> You need to open the file a single time, and then after that single
> open (either threaded or with fork()) run multiple concurrent copies
> of something like
> 
>    for (;;) {
>       char buf[1024];
>       lseek(fd, 0, SEEK_SET);

These are non-seekable files, so this lseek() will always fail.  That 
makes the race much more difficult to trigger: two threads both need to 
call u32_array_read() and find *ppos == 0, and then race between the 
kfree() and the resetting of the file->private_data pointer.
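
To make the window concrete, here is a small userspace model of that 
interleaving.  It is only a sketch and not the kernel code: struct 
fake_buf, struct fake_file, fake_kfree() and simulate_two_readers() are 
made-up names, and the "free" just bumps a counter so the double free can 
be observed safely.

```c
#include <stddef.h>

/* Hypothetical stand-in for the buffer hanging off file->private_data. */
struct fake_buf {
	int free_count;		/* how many times this buffer was "freed" */
};

/* Hypothetical stand-in for struct file; only the field that matters. */
struct fake_file {
	struct fake_buf *private_data;
};

/* Counts the free instead of releasing memory, so a double free is visible. */
static void fake_kfree(struct fake_buf *buf)
{
	if (buf)
		buf->free_count++;
}

/*
 * Model two readers that both arrive with *ppos == 0 on the same file.
 * With racy != 0, both sample private_data before either replaces it.
 * Returns how many times the original buffer ended up being freed.
 */
int simulate_two_readers(int racy)
{
	struct fake_buf original = { 0 }, fresh_a = { 0 }, fresh_b = { 0 };
	struct fake_file file = { .private_data = &original };
	struct fake_buf *seen_a, *seen_b;

	if (racy) {
		/* Both readers pass the check and load the same old pointer. */
		seen_a = file.private_data;
		seen_b = file.private_data;
		fake_kfree(seen_a);
		file.private_data = &fresh_a;
		fake_kfree(seen_b);		/* second free of "original" */
		file.private_data = &fresh_b;
	} else {
		/* Serialized: reader B only ever sees reader A's new buffer. */
		seen_a = file.private_data;
		fake_kfree(seen_a);
		file.private_data = &fresh_a;
		seen_b = file.private_data;
		fake_kfree(seen_b);
		file.private_data = &fresh_b;
	}
	return original.free_count;
}
```

In the racy ordering the original buffer is freed twice; in the 
serialized ordering it is freed once.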

 [ I'm surprised that Dave was able to trigger this so often that he has 
   800MB of log. ]

Anyway, I instrumented the kernel to widen the race window by sleeping 
after checking *ppos == 0 and again immediately after the kfree().  With 
that I could reproduce the issue, but with an "Object already free" error 
rather than a redzoning error.

I assumed this was because Dave didn't have a certain slub debug option 
enabled, or because redzoning was checked before double-free, but it turns 
out the double free should always be caught first.  For some reason the 
freed object is not being found on the partial slab's freelist.

>       read(fd, buf, sizeof(buf));
>    }
> 
> or similar. But it's important that they all share the same struct file.
> 
> It's also likely to make it easier to trigger the race if you have a
> kernel with preemption enabled.
> 
> And you need to have SLAB debugging enabled to actually *see* the
> messages. Otherwise you'll have just (possibly silent) corruption or a
> memory leak.
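
For reference, the quoted loop could be fleshed out into a userspace 
harness roughly like the one below.  This is a sketch under assumptions, 
not a tested reproducer: hammer_shared_fd(), reader_loop() and struct 
reader_args are made-up names, it uses threads rather than fork(), and it 
runs a bounded number of iterations instead of for (;;).  The lseek() is 
kept from the quoted loop even though it fails on non-seekable files.

```c
#include <fcntl.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Per-thread state; every thread gets the same fd (same struct file). */
struct reader_args {
	int fd;
	int iterations;
	long reads_ok;		/* read() calls that did not return an error */
};

static void *reader_loop(void *arg)
{
	struct reader_args *a = arg;
	char buf[1024];

	for (int i = 0; i < a->iterations; i++) {
		/* Fails on non-seekable files; the result is ignored. */
		(void)lseek(a->fd, 0, SEEK_SET);
		if (read(a->fd, buf, sizeof(buf)) >= 0)
			a->reads_ok++;
	}
	return NULL;
}

/*
 * Open path once, then read it concurrently from nthreads threads.
 * Returns the total number of successful reads, or -1 on setup failure.
 */
long hammer_shared_fd(const char *path, int nthreads, int iterations)
{
	struct reader_args *args;
	pthread_t *tids;
	long total = 0;
	int started = 0;
	int fd;

	if (nthreads <= 0)
		return -1;

	fd = open(path, O_RDONLY);	/* the single shared open */
	if (fd < 0)
		return -1;

	tids = calloc(nthreads, sizeof(*tids));
	args = calloc(nthreads, sizeof(*args));
	if (!tids || !args) {
		free(tids);
		free(args);
		close(fd);
		return -1;
	}

	for (int i = 0; i < nthreads; i++) {
		args[i].fd = fd;
		args[i].iterations = iterations;
		if (pthread_create(&tids[i], NULL, reader_loop, &args[i]) != 0)
			break;
		started++;
	}

	/* Join only the threads that were actually started. */
	for (int i = 0; i < started; i++) {
		pthread_join(tids[i], NULL);
		total += args[i].reads_ok;
	}

	free(tids);
	free(args);
	close(fd);
	return total;
}
```

It would be called with the debugfs path from the original report, e.g. 
hammer_shared_fd("/sys/kernel/debug/kvm/spinlocks/histo_blocked", 8, 
100000), on a kernel with slab debugging enabled; the thread and 
iteration counts here are arbitrary.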