linux-kernel - Re: perf/ftrace lockup on 3.12-rc6 with trigger code

Message-ID: <alpine.DEB.2.10.1310251726001.31246@vincent-weaver-1.um.maine.edu>
Date:	Fri, 25 Oct 2013 17:36:15 -0400 (EDT)
From:	Vince Weaver <vincent.weaver@...ne.edu>
To:	linux-kernel@...r.kernel.org
cc:	Steven Rostedt <rostedt@...dmis.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...stprotocols.net>,
	Dave Jones <davej@...hat.com>,
	Frederic Weisbecker <fweisbec@...il.com>
Subject: Re: perf/ftrace lockup on 3.12-rc6 with trigger code

On Fri, 25 Oct 2013, Vince Weaver wrote:

> 
> I'm not sure how tracepoints work exactly, but the problem code is setting
> 	pe[5].type=PERF_TYPE_TRACEPOINT;
> 	pe[5].config=0x7fffffff00000001;
> 
> The config is being truncated to 32-bits by the perf/ftrace code so I 
> think this means the tracepoint being enabled is
> 
> 	tracing/events/ftrace/function/id:1
> 
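
The truncation described in the quoted text can be checked with a few lines of C. This is only a sketch: it models the truncation as a plain cast to a 32-bit integer, which is an assumption about what the perf/ftrace code effectively does, and truncate_config() is a hypothetical helper name, not a kernel function.

```c
#include <stdint.h>

/*
 * Hypothetical helper: model the 32-bit truncation of the perf
 * config value as a simple cast.  The real kernel path may differ;
 * this only illustrates the arithmetic.
 */
uint32_t truncate_config(uint64_t config)
{
	/* Keep only the low 32 bits of the 64-bit config. */
	return (uint32_t)config;
}
```

With config=0x7fffffff00000001 the low 32 bits are 0x00000001, which matches the tracepoint id 1 mentioned above.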

I've wasted much of the day playing with this and adding printks, etc.

The key things that cause the problem are:

  tracepoint event
  config is 1 (tracing/events/ftrace/function)
  PERF_SAMPLE_PERIOD set in the sample type
  no user mmap buffer
  period (not frequency) enabled
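
A minimal sketch of a perf_event_attr matching the list above, using the fields from <linux/perf_event.h>. The helper name setup_trigger_attr() is hypothetical, and the sample_period value of 1 is an assumption, since the actual period used is not given here. The sketch deliberately stops short of calling perf_event_open(), as opening this event without an mmap buffer is what provokes the lockup.

```c
#include <linux/perf_event.h>
#include <string.h>

/*
 * Hypothetical helper: fill in an attr with the combination of
 * settings listed above.  It only builds the structure; it does
 * not open the event.
 */
void setup_trigger_attr(struct perf_event_attr *pe)
{
	memset(pe, 0, sizeof(*pe));
	pe->size = sizeof(*pe);

	/* tracepoint event */
	pe->type = PERF_TYPE_TRACEPOINT;

	/* config is 1 (tracing/events/ftrace/function) */
	pe->config = 1;

	/* PERF_SAMPLE_PERIOD set in the sample type */
	pe->sample_type = PERF_SAMPLE_PERIOD;

	/* period (not frequency) enabled */
	pe->freq = 0;
	pe->sample_period = 1;	/* assumed value, not from the report */
}
```

The remaining condition, no user mmap buffer, is not part of the attr: it simply means the fd returned by perf_event_open() is never passed to mmap().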

What this means is that there is a pretty high number of events being
generated.  Every kernel function entry ends up calling the
perf_swevent_overflow() overflow handler, which calls perf_event_output(),
which attempts to dump the buffer, but it can't because no user buffer is
mmap'd.

This causes some sort of storm and eventually the system just stops 
responding and the watchdog kicks in, although the traces it gives back 
are different each time.  

It's possible the kernel is making forward progress (though very slowly)
and this is just some sort of throttling issue.

I don't know if there are any better ways to debug this than the printk
route, though that has its own problems, as the printks themselves
start showing up in the ftrace traces.

Vince
