[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150129144749.GC24151@twins.programming.kicks-ass.net>
Date: Thu, 29 Jan 2015 15:47:49 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: mingo@...nel.org, linux-kernel@...r.kernel.org
Cc: vincent.weaver@...ne.edu, eranian@...il.com, jolsa@...hat.com,
mark.rutland@....com, torvalds@...ux-foundation.org,
tglx@...utronix.de
Subject: Re: [RFC][PATCH 2/3] perf: Add a bit of paranoia
On Mon, Jan 26, 2015 at 05:26:39PM +0100, Peter Zijlstra wrote:
> On Fri, Jan 23, 2015 at 01:52:01PM +0100, Peter Zijlstra wrote:
> Jiri reported triggering that WARN_ON_ONCE over the weekend:
>
> event_sched_out.isra.79+0x2b9/0x2d0
> group_sched_out+0x69/0xc0
> ctx_sched_out+0x106/0x130
> task_ctx_sched_out+0x37/0x70
> __perf_install_in_context+0x70/0x1a0
> remote_function+0x48/0x60
> generic_exec_single+0x15b/0x1d0
> smp_call_function_single+0x67/0xa0
> task_function_call+0x53/0x80
> perf_install_in_context+0x8b/0x110
>
>
> I think the below should cure this; if we install a group leader it will
> iterate the (still intact) group list and find its siblings and try and
> install those too -- even though those still have the old event->ctx --
> in the new ctx.
>
> Upon installing the first group sibling we'd try and schedule out the
> group and trigger the above warn.
>
> Fix this by installing the group leader last, installing siblings would
> have no effect, they're not reachable through the group lists and
> therefore we don't schedule them.
>
> Also delay resetting the state until we're absolutely sure the events
> are quiescent -- which raises the question; should perf_pmu_migrate()
> not also have perf_event__state_init() calls in?
So perf_pmu_migrate_context() does an explicit OFF->INACTIVE change on
the install side, which seems good enough.
That said, it does need to do that sibling first leaders later install
order too. So I've put the below on top.
---
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -7817,8 +7817,35 @@ void perf_pmu_migrate_context(struct pmu
list_add(&event->migrate_entry, &events);
}
+ /*
+ * Wait for the events to quiesce before re-instating them.
+ */
synchronize_rcu();
+ /*
+ * Re-instate events in 2 passes.
+ *
+ * Skip over group leaders and only install siblings on this first
+ * pass, siblings will not get enabled without a leader, however a
+ * leader will enable its siblings, even if those are still on the old
+ * context.
+ */
+ list_for_each_entry_safe(event, tmp, &events, migrate_entry) {
+ if (event->group_leader == event)
+ continue;
+
+ list_del(&event->migrate_entry);
+ if (event->state >= PERF_EVENT_STATE_OFF)
+ event->state = PERF_EVENT_STATE_INACTIVE;
+ account_event_cpu(event, dst_cpu);
+ perf_install_in_context(dst_ctx, event, dst_cpu);
+ get_ctx(dst_ctx);
+ }
+
+ /*
+ * Once all the siblings are setup properly, install the group leaders
+ * to make it go.
+ */
list_for_each_entry_safe(event, tmp, &events, migrate_entry) {
list_del(&event->migrate_entry);
if (event->state >= PERF_EVENT_STATE_OFF)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists