Message-ID: <Yz78MMMJ74tBw0gu@hirez.programming.kicks-ass.net>
Date: Thu, 6 Oct 2022 18:02:56 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Marco Elver <elver@...gle.com>
Cc: Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
kasan-dev@...glegroups.com, Dmitry Vyukov <dvyukov@...gle.com>
Subject: Re: [PATCH] perf: Fix missing SIGTRAPs

On Thu, Oct 06, 2022 at 03:59:55PM +0200, Marco Elver wrote:
> That one I could fix up with:
>
> | diff --git a/kernel/events/core.c b/kernel/events/core.c
> | index 9319af6013f1..2f1d51b50be7 100644
> | --- a/kernel/events/core.c
> | +++ b/kernel/events/core.c
> | @@ -6563,6 +6563,7 @@ static void perf_pending_task(struct callback_head *head)
> |  	 * If we 'fail' here, that's OK, it means recursion is already disabled
> |  	 * and we won't recurse 'further'.
> |  	 */
> | +	preempt_disable_notrace();
> |  	rctx = perf_swevent_get_recursion_context();
> |
> |  	if (event->pending_work) {
> | @@ -6573,6 +6574,7 @@ static void perf_pending_task(struct callback_head *head)
> |
> |  	if (rctx >= 0)
> |  		perf_swevent_put_recursion_context(rctx);
> | +	preempt_enable_notrace();
> |  }
> |
> | #ifdef CONFIG_GUEST_PERF_EVENTS

Right, thanks! It appears I only have lockdep enabled, but not the
preempt warning :/

> But following that, I get:
>
> | WARNING: CPU: 3 PID: 13018 at kernel/events/core.c:2288 event_sched_out+0x3f2/0x410 kernel/events/core.c:2288

I take it this is (my line numbers are slightly different):

	WARN_ON_ONCE(event->pending_work);

> So something isn't quite right yet. Unfortunately I don't have a good
> reproducer. :-/

This can happen if we get two consecutive event_sched_out() calls and
both instances have pending_sigtrap set, which is possible when the
event that has sigtrap set also triggers in kernel space.

You then get task_work list corruption and *boom*.

I'm thinking the below might be the simplest solution; we can only send
a single signal after all.

--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2293,9 +2293,10 @@ event_sched_out(struct perf_event *event
 			 */
 			local_dec(&event->ctx->nr_pending);
 		} else {
-			WARN_ON_ONCE(event->pending_work);
-			event->pending_work = 1;
-			task_work_add(current, &event->pending_task, TWA_RESUME);
+			if (!event->pending_work) {
+				event->pending_work = 1;
+				task_work_add(current, &event->pending_task, TWA_RESUME);
+			}
 		}
 	}
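
To illustrate why queueing the same callback twice is fatal, here is a
minimal userspace sketch. It is not kernel code: struct model_event and
all model_* helpers are hypothetical names, and the push is only a rough
approximation of what task_work_add() does to a task's work list. Adding
the same node twice makes it point at itself, while the pending_work
check from the patch keeps it queued at most once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct callback_head. */
struct callback_head {
	struct callback_head *next;
};

/* Hypothetical model of an event; only the fields relevant here. */
struct model_event {
	struct callback_head pending_task;
	int pending_work;
};

/* Push onto a singly linked list (assumed approximation of task_work_add()). */
static void model_task_work_add(struct callback_head **list,
				struct callback_head *work)
{
	work->next = *list;
	*list = work;
}

/* Guarded queueing as in the patch: skip if the work is already pending. */
static bool model_queue_sigtrap(struct callback_head **list,
				struct model_event *event)
{
	if (event->pending_work)
		return false;

	event->pending_work = 1;
	model_task_work_add(list, &event->pending_task);
	return true;
}

/* True if the head node links to itself, i.e. the list became a cycle. */
static bool model_list_is_cyclic(const struct callback_head *head)
{
	return head && head->next == head;
}

/* Runs both scenarios; returns 0 if they behave as described. */
static int model_demo(void)
{
	struct callback_head *raw = NULL, *guarded = NULL;
	struct callback_head work = { NULL };
	struct model_event event = { { NULL }, 0 };

	/* Unguarded: the second add makes work.next point at work itself. */
	model_task_work_add(&raw, &work);
	model_task_work_add(&raw, &work);
	assert(model_list_is_cyclic(raw));

	/* Guarded: the second attempt is a no-op, the list stays sane. */
	assert(model_queue_sigtrap(&guarded, &event));
	assert(!model_queue_sigtrap(&guarded, &event));
	assert(!model_list_is_cyclic(guarded));
	return 0;
}
```

In this model the flag would be cleared again by whatever consumes the
queued work, which is what lets a later trigger queue it once more.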