Message-ID: <20140902185807.GA7434@leverpostej>
Date: Tue, 2 Sep 2014 19:58:07 +0100
From: Mark Rutland <mark.rutland@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"Yan, Zheng" <zheng.z.yan@...el.com>,
Stephane Eranian <eranian@...gle.com>,
Ingo Molnar <mingo@...nel.org>
Subject: Re: Possible race between CPU hotplug and perf_pmu_migrate_context

On Mon, Sep 01, 2014 at 08:05:34PM +0100, Peter Zijlstra wrote:
> On Mon, Sep 01, 2014 at 07:18:08PM +0100, Mark Rutland wrote:
> > Hi all,
>
> > [ 66.780759] [<ffffffff8109dd33>] rcu_process_callbacks+0x1e3/0x540
>
> > Has anything seen anything like this before? Is this a known issue?
>
> I've not seen it reported.. sounds like 'fun' though.
>

This has been a tremendous source of 'fun' so far...

The rcu_process_callbacks line is a red herring. What seems to be
happening is:

A CPU goes down, and perf_pmu_migrate_context removes all events from
per_cpu_ptr(pmu->pmu_cpu_context, src_cpu)->ctx. The events are in a
state of limbo, with their ctx pointers pointing at the old context,
whose refcount is 1. The src_ctx->mutex is unlocked.

Concurrently on another CPU the fds are closed, and perf_event_release
removes each event from its event->ctx. We skip the double detach in
list_del_event and carry on to __free_event, where we put_ctx the old
context a second time for each event. The refcount goes to 0 and we
queue a kfree_rcu of the context, which is embedded in the PMU's percpu
perf_event_cpu_context (allocated with alloc_percpu).

We run the queued kfree_rcu, and explode trying to kfree something we
didn't k*alloc. I'm not sure exactly when we run the queued kfree_rcu
relative to everything else.
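
To make the refcounting concrete, here is a minimal user-space sketch of
the interleaving described above. It is not kernel code: struct ctx,
struct event, migrate_detach, release_event and simulate_race are
hypothetical stand-ins, only one event is modelled, and a flag takes the
place of the real kfree_rcu.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the context embedded in the percpu structure. */
struct ctx {
	int refcount;
	bool embedded;    /* part of a larger percpu object, never k*alloc'd */
	bool bogus_free;  /* set where the kernel would queue kfree_rcu() */
};

/* Hypothetical stand-in for an event that points at its context. */
struct event {
	struct ctx *ctx;
	bool attached;
};

static void get_ctx(struct ctx *c)
{
	c->refcount++;
}

static void put_ctx(struct ctx *c)
{
	assert(c->refcount > 0);
	if (--c->refcount == 0 && c->embedded)
		c->bogus_free = true;  /* freeing memory we never allocated */
}

/* Migration, first half: detach the event and drop its reference, but
 * leave event->ctx pointing at the source context. */
static void migrate_detach(struct event *e)
{
	e->attached = false;
	put_ctx(e->ctx);
}

/* Release path: the detach is skipped because the event is already off
 * the list, but the context reference is dropped regardless. */
static void release_event(struct event *e)
{
	if (e->attached)
		e->attached = false;
	put_ctx(e->ctx);
}

/* Returns true if the interleaving ends in freeing the embedded context. */
static bool simulate_race(void)
{
	struct ctx src = { .refcount = 1, .embedded = true, .bogus_free = false };
	struct event ev = { .ctx = &src, .attached = true };

	get_ctx(&src);        /* the event holds a reference: refcount == 2 */
	migrate_detach(&ev);  /* refcount == 1, ev.ctx still points at src */
	release_event(&ev);   /* second put for the same event: refcount == 0 */
	return src.bogus_free;
}
```

Calling simulate_race() from a small main() returns true, i.e. the second
put for the same event is what takes the embedded context's refcount to 0.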

So the problem here seems to be a race between
perf_pmu_migrate_context and something down the perf_event_release
callchain.

Any ideas?

Thanks,
Mark.