Message-ID: <1275557202.27810.35201.camel@twins>
Date: Thu, 03 Jun 2010 11:26:42 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Frederic Weisbecker <fweisbec@...il.com>
Cc: Ingo Molnar <mingo@...e.hu>, LKML <linux-kernel@...r.kernel.org>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
Paul Mackerras <paulus@...ba.org>,
Stephane Eranian <eranian@...gle.com>
Subject: Re: [GIT PULL] perf crash fix

On Thu, 2010-06-03 at 05:13 +0200, Frederic Weisbecker wrote:
> What happens here is a double pmu->disable() due to a race between
> two perf_adjust_period().
>
> We first overflow a page fault event and then re-adjust the period.
> When we reset the period_left, we stop the pmu by removing the
> perf event from the software event hlist. And just before we
> re-enable it, we are interrupted by a sched tick that also tries to
> re-adjust the period. There we eventually disable the event a second
> time, which leads to a double hlist_del_rcu() that ends up
> dereferencing LIST_POISON2.
>
> In fact, the goal of embracing the reset of the period_left with
> a pmu:stop() and pmu:start() is only relevant to hardware events. We
> want them to reprogram the next period interrupt.
>
> But this is useless for software events. They have their own way to
> handle the period left, and in a non-racy way. They don't need to
> be stopped here.
>
> So, use a new pair of perf_event_stop/start_hwevent that only stop
> and restart hardware events in this path.
>
> The race won't happen with hardware events as sched ticks can't
> happen during nmis.

I've queued the below.

---
Subject: perf: Fix crash in swevents
From: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Date: Thu Jun 03 11:21:20 CEST 2010

Frederic reported that because swevent handling no longer disables IRQs,
perf_adjust_period() can recurse: once from overflow handling and once
from the tick.

If both call ->disable, we get a double hlist_del_rcu() and trigger
a LIST_POISON2 dereference.

Since we don't actually need to stop/start a swevent to reprogram
the hardware (there is no hardware to program), simply nop out these
callbacks for the swevent pmu.

Reported-by: Frederic Weisbecker <fweisbec@...il.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@...llo.nl>
---
kernel/perf_event.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)
Index: linux-2.6/kernel/perf_event.c
===================================================================
--- linux-2.6.orig/kernel/perf_event.c
+++ linux-2.6/kernel/perf_event.c
@@ -4055,13 +4055,6 @@ static void perf_swevent_overflow(struct
 	}
 }
 
-static void perf_swevent_unthrottle(struct perf_event *event)
-{
-	/*
-	 * Nothing to do, we already reset hwc->interrupts.
-	 */
-}
-
 static void perf_swevent_add(struct perf_event *event, u64 nr,
 			       int nmi, struct perf_sample_data *data,
 			       struct pt_regs *regs)
@@ -4276,11 +4269,17 @@ static void perf_swevent_disable(struct
 	hlist_del_rcu(&event->hlist_entry);
 }
 
+static void perf_swevent_nop(struct perf_event *event)
+{
+}
+
 static const struct pmu perf_ops_generic = {
 	.enable		= perf_swevent_enable,
 	.disable	= perf_swevent_disable,
+	.start		= perf_swevent_nop,
+	.stop		= perf_swevent_nop,
 	.read		= perf_swevent_read,
-	.unthrottle	= perf_swevent_nop, /* hwc->interrupts already reset */
+	.unthrottle	= perf_swevent_nop, /* hwc->interrupts already reset */
 };
 
 /*
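
For anyone following along, here is a rough user-space model of the
failure and of what the nop callbacks change. It is only a sketch, not
kernel code: the list helpers mimic the shape of hlist_del_rcu() (unlink,
then poison pprev), POISON2 is a stand-in sentinel rather than the real
LIST_POISON2 value, and ev_stop()/ev_start() assume that a missing
->stop/->start falls back to ->disable/->enable, which is what the
changelog implies. The names adjust_period() and run_model() are
hypothetical. Where the kernel would write through the poison pointer,
the model just counts the second delete.

```c
#include <assert.h>
#include <stddef.h>

struct node { struct node *next, **pprev; };
struct head { struct node *first; };

/* Stand-in for LIST_POISON2: a unique address, not the kernel's value. */
static struct node *poison_slot;
#define POISON2 (&poison_slot)

static int double_del;	/* deletes attempted on an already-deleted node */

static void add_head(struct node *n, struct head *h)
{
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Same shape as hlist_del_rcu(): unlink, then poison pprev. */
static void del(struct node *n)
{
	if (n->pprev == POISON2) {
		/* The kernel has no such check; it would store through pprev. */
		double_del++;
		return;
	}
	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
	n->pprev = POISON2;
}

struct event;
struct ops {
	void (*enable)(struct event *);
	void (*disable)(struct event *);
	void (*start)(struct event *);	/* optional */
	void (*stop)(struct event *);	/* optional */
};
struct event { struct node entry; struct head *list; const struct ops *ops; };

static void ev_enable(struct event *e)  { add_head(&e->entry, e->list); }
static void ev_disable(struct event *e) { del(&e->entry); }
static void ev_nop(struct event *e)     { (void)e; }

/* Assumed fallback: no ->stop/->start means use ->disable/->enable. */
static void ev_stop(struct event *e)
{
	if (e->ops->stop)
		e->ops->stop(e);
	else
		e->ops->disable(e);
}

static void ev_start(struct event *e)
{
	if (e->ops->start)
		e->ops->start(e);
	else
		e->ops->enable(e);
}

/* Models stop / reset period_left / start; 'nested' plays the tick. */
static void adjust_period(struct event *e, int nested)
{
	ev_stop(e);
	if (nested)
		adjust_period(e, 0);	/* tick arrives between stop and start */
	ev_start(e);
}

/* Returns how many double deletes the nested adjustment caused. */
static int run_model(int with_nops)
{
	static const struct ops old_ops = { ev_enable, ev_disable, NULL, NULL };
	static const struct ops new_ops = { ev_enable, ev_disable, ev_nop, ev_nop };
	struct head h = { NULL };
	struct event e = { .list = &h, .ops = with_nops ? &new_ops : &old_ops };

	double_del = 0;
	ev_enable(&e);
	assert(h.first == &e.entry);
	adjust_period(&e, 1);
	return double_del;
}
```

In this model run_model(0) hits the double delete once, while
run_model(1) never touches the list during the period adjustment.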