Date:	Fri, 13 Jan 2012 17:43:33 -0500
From:	Vince Weaver <vweaver1@...s.utk.edu>
To:	<linux-kernel@...r.kernel.org>
CC:	<mingo@...e.hu>, <a.p.zijlstra@...llo.nl>,
	Paul Mackerras <paulus@...ba.org>,
	Arnaldo Carvalho de Melo <acme@...stprotocols.net>,
	Stephane Eranian <eranian@...il.com>
Subject: Re: perf_event hard locks 3.1.x


On Fri, 16 Dec 2011, Vince Weaver wrote:

> I had a PAPI user report that perf_event usage (such as running
> the PAPI tests) would cause hard lockups on his 3.1.x kernel
> (from ARCH linux).
> 
> After some tedious bisection of the .config file, I found that the issue
> happens when 
>   CONFIG_SLUB=y
>   CONFIG_SLUB_DEBUG=y
> is enabled.  Having a kernel with that enabled and stressing the 
> perf_event subsystem will eventually cause lockups or hard crashes.

I spent a lot of time trying to track this down, though the problem does 
not appear with stock 3.2.

The problem is still there with 3.1.9, but since that might be the last 
3.1.x kernel it might not matter anymore.

Summary of what I found:
  you need to have CONFIG_SLUB=y
  you can cause the crash by running the PAPI ctests 1-3 times
    (the program that triggers the crash is different each time, so 
     there is no good workaround; it is probably a race condition).

I reverse-bisected the fix between 3.1 and 3.2 (that is to say, I 
bisected to find when the kernels stopped crashing) to this commit:

   commit a33caeb118198286309859f014c0662f3ed54ed4
   lockdep, kmemcheck: Annotate ->lock in lockdep_init_map()

but that commit is already in 3.1.9, and 3.1.9 still crashes.

It might also be the next commit 

   commit ddf6e0e50723b62ac76ed18eb53e9417c6eefba7
   ftrace: Fix hash record accounting bug

as the panics in that report look similar to what I see before the 
machine quickly and massively dies, but when I apply this on top of 
3.1.9 it doesn't avoid the crash.

I suspect maybe I was chasing two separate problems, which is why the 
bisect didn't give the proper results.
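
For reference, the reverse bisection described above amounts to a binary search for the first commit at which the crash no longer reproduces. The following is only a minimal sketch in Python, not the procedure actually used: it assumes a linear history, and `first_fixed`, `crashes`, and the commit names are hypothetical placeholders. It also assumes there is exactly one crash to no-crash transition in the range, which is the assumption that breaks down if two separate problems are involved.

```python
from typing import Callable, Sequence


def first_fixed(commits: Sequence[str], crashes: Callable[[str], bool]) -> str:
    """Return the first commit at which crashes() is False.

    Assumes commits are in chronological order, the oldest one crashes,
    the newest one does not, and there is a single transition in between.
    """
    if not crashes(commits[0]):
        raise ValueError("oldest commit does not crash; nothing to bisect")
    if crashes(commits[-1]):
        raise ValueError("newest commit still crashes; no fix in range")

    # Invariant: commits[lo] crashes, commits[hi] does not.
    lo, hi = 0, len(commits) - 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if crashes(commits[mid]):
            lo = mid   # fix must be later than mid
        else:
            hi = mid   # fix is at mid or earlier
    return commits[hi]


# Hypothetical linear history c0..c9 where the fix landed in c6.
history = [f"c{i}" for i in range(10)]
result = first_fixed(history, lambda c: int(c[1:]) < 6)
print(result)
```

Because the crash here only shows up after one to three runs of the tests, a real `crashes` predicate would need to repeat the test several times before declaring a commit good. A single false "no crash" result can send the search to the wrong commit.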

In any case, I'm abandoning the search for now unless anyone has some 
other ideas I could try.

Thanks,

Vince

