Message-ID: <20250720000424.12572-1-thaumy.love@gmail.com>
Date: Sun, 20 Jul 2025 08:04:24 +0800
From: thaumy.love@...il.com
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org,
Thaumy Cheng <thaumy.love@...il.com>
Subject: [PATCH] perf/core: Fix missing read event generation on task exit
From: Thaumy Cheng <thaumy.love@...il.com>
For events with inherit_stat enabled, a "read" event is generated on
task exit to collect the per-task event counts.
The call chain is as follows:
  do_exit
    -> perf_event_exit_task
      -> perf_event_exit_task_context
        -> perf_event_exit_event
          -> perf_remove_from_context
            -> perf_child_detach
              -> sync_child_event
                -> perf_event_read_event
However, perf_event_exit_task_context detaches the child event context
from the task too early, before the events are exited. By the time
sync_child_event runs, child_event->ctx->task is always TASK_TOMBSTONE,
so the read event is never generated. Fix this by exiting the events
before the context is detached from the task.
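The ordering problem can be illustrated with a small user-space model.
This is only a sketch, not kernel code: every toy_* name is hypothetical,
and the guard in toy_sync_child_event is an assumption based on the
description above (a read event is emitted only while the context still
points at a live task).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; all names are hypothetical. */
struct toy_task { int pid; };
struct toy_ctx { struct toy_task *task; };
struct toy_event { struct toy_ctx *ctx; bool inherit_stat; };

/* Sentinel playing the role of TASK_TOMBSTONE in this model. */
static struct toy_task toy_tombstone;
#define TOY_TASK_TOMBSTONE (&toy_tombstone)

/* Returns true if a "read" event would be emitted for this event. */
static bool toy_sync_child_event(const struct toy_event *ev)
{
	const struct toy_task *task;

	if (!ev->inherit_stat)
		return false;

	task = ev->ctx->task;
	return task != NULL && task != TOY_TASK_TOMBSTONE;
}

/* Old order: detach the context first, then exit the events. */
static int toy_exit_detach_first(struct toy_ctx *ctx,
				 const struct toy_event *evs, size_t n)
{
	int reads = 0;

	assert(ctx != NULL);
	ctx->task = TOY_TASK_TOMBSTONE;
	for (size_t i = 0; i < n; i++)
		reads += toy_sync_child_event(&evs[i]);
	return reads;
}

/* New order: exit the events first, then detach the context. */
static int toy_exit_events_first(struct toy_ctx *ctx,
				 const struct toy_event *evs, size_t n)
{
	int reads = 0;

	assert(ctx != NULL);
	for (size_t i = 0; i < n; i++)
		reads += toy_sync_child_event(&evs[i]);
	ctx->task = TOY_TASK_TOMBSTONE;
	return reads;
}
```

In this model, the detach-first order counts no read events regardless
of inherit_stat, while the events-first order counts one per event with
inherit_stat set.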
This bug can be reproduced by running "perf record -s" and attaching to
any program that generates perf events in its child tasks. When the
result is checked with "perf report -T", the report ends with an empty
table that contains only the header "# PID TID", where the per-task
event counts are expected by design.
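A possible workload for the reproduction is sketched below. It is an
assumption about what a suitable target program could look like, not
part of the patch: burn_cpu and spawn_busy_children are hypothetical
helpers that fork short-lived children which do some work and exit.
Calling spawn_busy_children repeatedly from a long-running main() of
your own gives "perf record -s" a process to attach to.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Burn some CPU so the child task has something to count. */
static void burn_cpu(unsigned long iters)
{
	volatile unsigned long sink = 0;

	for (unsigned long i = 0; i < iters; i++)
		sink += i;
}

/*
 * Fork nchildren tasks one after another; each does some work and
 * exits. Returns the number of children that exited with status 0,
 * or -1 if fork() fails.
 */
static int spawn_busy_children(int nchildren, unsigned long iters)
{
	int clean = 0;

	for (int i = 0; i < nchildren; i++) {
		int status = 0;
		pid_t pid = fork();

		if (pid < 0)
			return -1;
		if (pid == 0) {
			burn_cpu(iters);
			_exit(0);
		}
		if (waitpid(pid, &status, 0) == pid &&
		    WIFEXITED(status) && WEXITSTATUS(status) == 0)
			clean++;
	}
	return clean;
}
```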
Signed-off-by: Thaumy Cheng <thaumy.love@...il.com>
---
kernel/events/core.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 22fdf0c187cd..266b9eabb342 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -14057,6 +14057,17 @@ static void perf_event_exit_task_context(struct task_struct *task, bool exit)
*/
mutex_lock(&ctx->mutex);
+ /*
+ * Report the task dead after unscheduling the events so that we
+ * won't get any samples after PERF_RECORD_EXIT. We can however still
+ * get a few PERF_RECORD_READ events.
+ */
+ if (exit)
+ perf_event_task(task, ctx, 0);
+
+ list_for_each_entry_safe(child_event, next, &ctx->event_list, event_entry)
+ perf_event_exit_event(child_event, ctx, false);
+
/*
* In a single ctx::lock section, de-schedule the events and detach the
* context from the task such that we cannot ever get it scheduled back
@@ -14081,17 +14092,6 @@ static void perf_event_exit_task_context(struct task_struct *task, bool exit)
if (clone_ctx)
put_ctx(clone_ctx);
- /*
- * Report the task dead after unscheduling the events so that we
- * won't get any samples after PERF_RECORD_EXIT. We can however still
- * get a few PERF_RECORD_READ events.
- */
- if (exit)
- perf_event_task(task, ctx, 0);
-
- list_for_each_entry_safe(child_event, next, &ctx->event_list, event_entry)
- perf_event_exit_event(child_event, ctx, false);
-
mutex_unlock(&ctx->mutex);
if (!exit) {
--
2.49.0