Message-ID: <1373999590.6458.34.camel@gandalf.local.home>
Date:	Tue, 16 Jul 2013 14:33:10 -0400
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Zhang Rui <rui.zhang@...el.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Tejun Heo <tj@...nel.org>
Subject: Re: [PATCH] Thermal: Fix lockup of cpu_down()

On Tue, 2013-07-16 at 11:19 -0700, Srinivas Pandruvada wrote:
> Thanks. How did you trigger this error condition? Is it a code review or 
> you have some way to reproduce?

No, my tests do a CPU hotplug stress run and the system would hang. I had
to bisect it to find the bug and it came to this code. What was weird is
that the module wasn't loaded. Then I ran the ftrace function tracer,
started from the kernel command line with the following:

 ftrace=function ftrace_filter=get_online_cpus,put_online_cpus

and after I booted up, I ran:

cat /debug/tracing/trace | perl -e '
my @stack;
while (<>) {
	if (/get_online/) {
		push @stack, $_;
	} elsif (/put_online/) {
		pop @stack;
	}
}
foreach my $line (@stack) {
	print $line;
}'

And it showed that get_online_cpus() was called twice without a matching
put_online_cpus(). The strange thing was that the calls had no parent
function. That is when I realized that the module had been loaded but
then failed to init and was unloaded, which explains why it didn't show
up in my lsmod.
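
For anyone who wants to reuse the pairing trick outside of a one-liner,
here is a rough Python sketch of the same idea. It is not what I ran; the
function name unmatched_gets and the regular expression are made up for
illustration, and the line layout it expects ("<timestamp>: <function>
<-<parent>") is an assumption about the function tracer output. Like the
perl above, it keeps one global stack rather than one per task, so it is
only a heuristic.

```python
import re
import sys

# Assumed line shape: "<task>-<pid> [cpu] <timestamp>: <function> <-<parent>".
# The parent part is optional so that calls without one are still counted.
CALL_RE = re.compile(r":\s+(get_online_cpus|put_online_cpus)\b(?:\s+<-(\S+))?")


def unmatched_gets(lines):
    """Return get_online_cpus trace lines with no later put_online_cpus."""
    stack = []
    for line in lines:
        m = CALL_RE.search(line)
        if m is None:
            continue  # header or unrelated line
        if m.group(1) == "get_online_cpus":
            stack.append(line.rstrip("\n"))
        elif stack:
            stack.pop()  # a put cancels the most recent outstanding get
    return stack


if __name__ == "__main__":
    # Read the trace from stdin and print whatever is left unbalanced.
    for leftover in unmatched_gets(sys.stdin):
        print(leftover)
```

Saved under some name such as unmatched.py (hypothetical), it would be fed
the trace on stdin, e.g. "python3 unmatched.py < /debug/tracing/trace".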

Then it was just a matter of looking at all the calls to
get_online_cpus() in the commit, and it was rather obvious what the
bug was.

With the patch applied, the lockup went away.

-- Steve



