Date:	Fri, 25 Oct 2013 17:36:15 -0400 (EDT)
From:	Vince Weaver <vincent.weaver@...ne.edu>
To:	linux-kernel@...r.kernel.org
cc:	Steven Rostedt <rostedt@...dmis.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...stprotocols.net>,
	Dave Jones <davej@...hat.com>,
	Frederic Weisbecker <fweisbec@...il.com>
Subject: Re: perf/ftrace lockup on 3.12-rc6 with trigger code

On Fri, 25 Oct 2013, Vince Weaver wrote:

> 
> I'm not sure how tracepoints work exactly, but the problem code is setting
> 	pe[5].type=PERF_TYPE_TRACEPOINT;
> 	pe[5].config=0x7fffffff00000001;
> 
> The config is being truncated to 32-bits by the perf/ftrace code so I 
> think this means the tracepoint being enabled is
> 
> 	tracing/events/ftrace/function/id:1
> 

I've wasted much of the day playing with this and adding printks, etc.

The key things that cause the problem are:

  tracepoint event
  config is 1 (tracing/events/ftrace/function)
  PERF_SAMPLE_PERIOD set in the sample type
  no user mmap buffer
  period (not frequency) enabled
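
As a rough sketch (not the actual trigger code), the settings above
correspond to an attr along these lines.  The helper names low32() and
fill_trigger_attr() and the sample_period value of 1 are made up for
illustration; the real period is not given here.  The block only fills in
the structure and deliberately does not call perf_event_open(), since
opening this event with no mmap buffer is what leads to the lockup.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <linux/perf_event.h>

/* Keep only the low 32 bits of a 64-bit config value, to show the
 * truncation described in the quoted message. */
static uint32_t low32(uint64_t config)
{
	return (uint32_t)config;
}

/* Fill in an attr matching the listed conditions.  Nothing is opened. */
static void fill_trigger_attr(struct perf_event_attr *pe)
{
	assert(pe != NULL);

	memset(pe, 0, sizeof(*pe));
	pe->size = sizeof(*pe);

	pe->type = PERF_TYPE_TRACEPOINT;              /* tracepoint event */
	pe->config = low32(0x7fffffff00000001ULL);    /* ends up as 1 */
	pe->sample_type = PERF_SAMPLE_PERIOD;         /* period in the sample type */
	pe->freq = 0;                                 /* period, not frequency */
	pe->sample_period = 1;                        /* assumed value */

	/* "no user mmap buffer" is not an attr field: it means mmap() is
	 * never called on the fd that perf_event_open() would return. */
}
```
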

What this means is there is a pretty high number of kernel calls happening.
Every kernel function entry ends up calling the
perf_swevent_overflow() overflow handler, which calls perf_event_output(),
which attempts to dump the buffer but can't because no user buffer is
mmap'd.

This causes some sort of storm and eventually the system just stops 
responding and the watchdog kicks in, although the traces it gives back 
are different each time.  

It's possible the kernel is making forward progress (though very slowly)
and this is just some sort of throttling issue.

I don't know if there are any better ways to try to debug things than the 
printk route.  Though that has its own problems as the printk's themselves 
start showing up in the ftrace traces.

Vince

