Message-ID: <20140219202443.GK6835@laptop.programming.kicks-ass.net>
Date:	Wed, 19 Feb 2014 21:24:43 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Stephane Eranian <eranian@...gle.com>
Cc:	Will Deacon <will.deacon@....com>,
	Drew Richardson <drew.richardson@....com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Arnaldo <acme@...hat.com>, Pawel Moll <Pawel.Moll@....com>,
	Wade Cherry <Wade.Cherry@....com>
Subject: Re: Perf Oops on 3.14-rc2

On Wed, Feb 19, 2014 at 08:59:08PM +0100, Stephane Eranian wrote:
> On Wed, Feb 19, 2014 at 7:36 PM, Peter Zijlstra <peterz@...radead.org> wrote:
> > On Wed, Feb 19, 2014 at 07:03:13PM +0100, Stephane Eranian wrote:
> >> I am trying to understand the context here.
> >> Are you saying, we may call an offline CPU?
> >
> > Yes, that is what's happening.
> >
> >> I saw that sometimes you retry, sometimes you don't.
> >
> > I tried to do exactly what we do for the task case which is far more
> > likely to fail. Could be I messed up.
> >
> I am not sure why you need to retry. If the CPU is offline, it is offline.
> Or are you saying, you get an error, but you don't know the exact
> reason, thus you keep trying? But how do you get out of this if
> the CPU stays offline?

Ah, so take perf_remove_from_context() as before the patch; if the
cpu_function_call() fails because the CPU is offline, it doesn't call
list_del_event().

Now the CPU-offline path is supposed to take the events off the list, but
it doesn't actually do so when they're grouped.

This leaves a free()d event on the offline cpu's context list.

After that things quickly go downhill.

But before I got there I was led down a few too many rabbit holes trying
to figure out wtf happened.


We could probably fix it differently though. But by the time I more or
less understood things I was too tired to make something pretty.

Anyway, if you want to do something when cpu_function_call() fails, you
also have to check whether the CPU came back up since you tried, at which
point you've got the same pattern as we have for task_function_call().
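
To make the retry argument concrete, here is a minimal user-space sketch of
that pattern. It is not the kernel patch: every sim_* name, the structs, and
the comes_up_after counter are hypothetical stand-ins invented for
illustration, and the real code would do the re-check while holding the
context lock.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a perf event; only tracks list membership. */
struct sim_event {
	bool on_list;		/* still linked on the context list? */
};

/* Hypothetical stand-in for a CPU that may be hot-unplugged. */
struct sim_cpu {
	bool online;
	int comes_up_after;	/* failed calls until it comes back; <= 0: never */
};

/* Models list_del_event(): unlink the event from its context list. */
static void sim_list_del_event(struct sim_event *ev)
{
	ev->on_list = false;
}

/*
 * Models cpu_function_call(): run func on the target CPU, or fail if the
 * CPU is offline. A failed call may race with the CPU coming back up.
 */
static int sim_cpu_function_call(struct sim_cpu *cpu,
				 void (*func)(struct sim_event *),
				 struct sim_event *ev)
{
	if (!cpu->online) {
		if (cpu->comes_up_after > 0 && --cpu->comes_up_after == 0)
			cpu->online = true;
		return -1;
	}
	func(ev);
	return 0;
}

/*
 * Remove the event, retrying the cross-call if the CPU came back online
 * after a failed attempt. Returns the number of cross-call attempts.
 */
static int sim_remove_from_context(struct sim_cpu *cpu, struct sim_event *ev)
{
	int attempts = 0;

	for (;;) {
		attempts++;
		if (sim_cpu_function_call(cpu, sim_list_del_event, ev) == 0)
			break;		/* ran on the target CPU */

		/* The real code would take the context lock here. */
		if (cpu->online)
			continue;	/* came back up since we tried: retry */

		/* Still offline: nothing runs there, so unlink directly. */
		sim_list_del_event(ev);
		break;
	}

	assert(!ev->on_list);	/* the event must never be left on the list */
	return attempts;
}
```

Without the direct unlink in the still-offline branch, the event would stay
on the list, which is the stale-entry case described above. Without the
online re-check, the direct unlink could race with a CPU that has just come
back up.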