linux-kernel - Re: [GIT PULL] x86/cpu changes for v2.6.34

Message-ID: <1267472522.10871.14.camel@gandalf.stny.rr.com>
Date:	Mon, 01 Mar 2010 14:42:02 -0500
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Frederic Weisbecker <fweisbec@...il.com>,
	Ingo Molnar <mingo@...e.hu>,
	Thomas Gleixner <tglx@...utronix.de>,
	linux-kernel@...r.kernel.org, "H. Peter Anvin" <hpa@...or.com>,
	Borislav Petkov <borislav.petkov@....com>,
	Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [GIT PULL] x86/cpu changes for v2.6.34

On Mon, 2010-03-01 at 08:47 -0800, Linus Torvalds wrote:

> Both of you seemed to miss the fact that it's not cpu7 that is 
> particularly slow. See the original email from me in this thread: the jump 
> was at some random point:
> 
>         [    0.245179] CPU 1 MCA banks CMCI:2 CMCI:3 CMCI:5 SHD:6 SHD:8
>         [    0.265332]  #2
>         [    0.353185] CPU 2 MCA banks CMCI:2 CMCI:3 CMCI:5 SHD:6 SHD:8
>         [    0.373328]  #3
>         [    2.193277] CPU 3 MCA banks CMCI:2 CMCI:3 CMCI:5 SHD:6 SHD:8
>         [    2.213379]  #4
> 
> and the reason I grepped for "CPU 7" was that it's the _last_ CPU on this 
> machine, so what I was grepping for was basically "how long did it take to 
> bring up all CPU's".
> 
> So that particular really bad case apparently happened for CPU#3, but the 
> two other slow cases happened for CPU#4.
> 
> Also, it seems to happen only about every fifth boot or so. Suggestions 
> for something simple that can trace things like that?

As Frederic has said, you can use 'ftrace=function_graph' on the kernel
command line. It will be initialized in early_initcall (which I believe
is before the CPUs are set up). Then add a tracing_off() after the
troublesome code. You can make the trace buffers bigger with the kernel
command line option:

	trace_buf_size=10000000

The above will make the trace buffer 10Meg per CPU. Unlike the
"buffer_size_kb" file, this number is in bytes, even though it will
round to the nearest page. (I probably should make this into kb, and
rename it to trace_buf_size_kb, and deprecate trace_buf_size).
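
To make the bytes-versus-kb difference concrete, here is a rough sketch of
the arithmetic. It assumes 4096-byte pages and rounding to the nearest
page as described above; the kernel's exact rounding is not checked here,
and round_to_nearest_page() is a made-up helper, not a kernel function.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, typical on x86


def round_to_nearest_page(nbytes, page_size=PAGE_SIZE):
    """Round a byte count to the nearest whole page (hypothetical helper)."""
    pages = (nbytes + page_size // 2) // page_size
    # Never return less than one page.
    return max(pages, 1) * page_size


if __name__ == "__main__":
    requested = 10_000_000  # value passed as trace_buf_size=
    rounded = round_to_nearest_page(requested)
    # Show the per-CPU size in bytes and the equivalent in kb,
    # which is the unit the buffer_size_kb file uses.
    print(f"requested {requested} bytes -> {rounded} bytes ({rounded // 1024} kb)")
```

Under those assumptions the 10000000 bytes come out as 2441 pages, which
is the figure you would compare against buffer_size_kb.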

Then you can cat out /debug/tracing/trace, and search for large
latencies in the timestamps.
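
Searching for the jumps by eye gets tedious, so here is a minimal sketch
of a gap finder. It assumes each line starts with a bracketed timestamp in
seconds, like the dmesg lines quoted above; the ftrace output layout
depends on the tracer and its options, so the regex would need adapting.
The find_gaps() name and the 1 second threshold are just illustrative.

```python
import re

# Assumed line format: "[    0.245179] message", as in the dmesg excerpt.
TIMESTAMP = re.compile(r"^\s*\[\s*(\d+\.\d+)\]")

SAMPLE = [
    "[    0.245179] CPU 1 MCA banks CMCI:2 CMCI:3 CMCI:5 SHD:6 SHD:8",
    "[    0.265332]  #2",
    "[    0.353185] CPU 2 MCA banks CMCI:2 CMCI:3 CMCI:5 SHD:6 SHD:8",
    "[    0.373328]  #3",
    "[    2.193277] CPU 3 MCA banks CMCI:2 CMCI:3 CMCI:5 SHD:6 SHD:8",
    "[    2.213379]  #4",
]


def find_gaps(lines, threshold=1.0):
    """Return (gap_seconds, previous_line, current_line) for each jump
    between consecutive timestamped lines that is >= threshold."""
    gaps = []
    prev_time = None
    prev_line = None
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # skip lines without a recognizable timestamp
        now = float(match.group(1))
        if prev_time is not None and now - prev_time >= threshold:
            gaps.append((now - prev_time, prev_line.strip(), line.strip()))
        prev_time, prev_line = now, line
    return gaps


if __name__ == "__main__":
    for gap, before, after in find_gaps(SAMPLE):
        print(f"{gap:.6f}s between:\n  {before}\n  {after}")
```

On the sample above this flags the single jump between the "#3" line and
the "CPU 3 MCA banks" line. To run it on a saved log, pass the open file
object to find_gaps() in place of SAMPLE.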

-- Steve


