Message-ID: <20250214202438.GB2198@noisy.programming.kicks-ass.net>
Date: Fri, 14 Feb 2025 21:24:38 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Ravi Bangoria <ravi.bangoria@....com>
Cc: mingo@...nel.org, lucas.demarchi@...el.com,
linux-kernel@...r.kernel.org, willy@...radead.org, acme@...nel.org,
namhyung@...nel.org, mark.rutland@....com,
alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
irogers@...gle.com, adrian.hunter@...el.com,
kan.liang@...ux.intel.com
Subject: Re: [PATCH v2 24/24] perf: Make perf_pmu_unregister() useable
On Thu, Feb 13, 2025 at 01:22:55PM +0530, Ravi Bangoria wrote:
> Apparently not, it ends up with:
>
> ------------[ cut here ]------------
> WARNING: CPU: 145 PID: 5459 at kernel/events/core.c:281 event_function+0xd2/0xf0
> WARNING: CPU: 145 PID: 5459 at kernel/events/core.c:286 event_function+0xd6/0xf0
> remote_function+0x4f/0x70
> generic_exec_single+0x7f/0x160
> smp_call_function_single+0x110/0x160
> event_function_call+0x98/0x1d0
> _perf_event_disable+0x41/0x70
> perf_event_for_each_child+0x40/0x90
> _perf_ioctl+0xac/0xb00
> perf_ioctl+0x45/0x80
Took me a long while trying to blame this on the 'event->parent =
NULL;', but AFAICT this is a new, unrelated issue.
What I think happens is a perf_ioctl(DISABLE) vs pmu_detach_events()
race, where the crux is that the perf_ioctl() path does not take
ctx2->mutex, which allows the following interleave:

               event1 <---> ctx1
                 |  ^
      child_list |  | parent
                 v  |
               event2 <---> ctx2

  perf_ioctl()
    perf_event_ctx_lock(event1)
      get_ctx(ctx1)
      mutex_lock(ctx1->mutex)
    _perf_ioctl()
      perf_event_for_each_child()
        mutex_lock(event1->child_mutex)
        _perf_event_disable(event1)
        _perf_event_disable(event2)
          raw_spin_lock_irq(ctx2->lock)
          raw_spin_unlock_irq()
          event_function_call(event2, __perf_event_disable)
            task_function_call()
              <IPI __perf_event_disable>

                                pmu_detach_events()
                                  event2 = pmu_get_event() <-- inc(event2->refcount);
                                  pmu_detach_event(event2)
                                    perf_event_ctx_lock(event2)
                                      get_ctx(ctx2)
                                      mutex_lock(ctx2->mutex)
                                    __pmu_detach_event()
                                      perf_event_exit_event()
                                        mutex_lock(event1->child_mutex)
                                        perf_remove_from_context(event2, EXIT|GROUP|CHILD|REVOKE)
                                          lockdep_assert_held(ctx2->mutex)
                                          raw_spin_lock_irq(ctx2->lock)
                                          raw_spin_unlock_irq(ctx2->lock)
                                          event_function_call(event2, __perf_remove_from_context)
                                            task_function_call(event_function)
                                              <IPI __perf_remove_from_context>
                                                remote_function()
                                                  event_function()
                                                    perf_ctx_lock(cpuctx, ctx2)
                                                      raw_spin_lock(ctx2->lock)
                                                    __perf_remove_from_context(event2)
                                                      event_sched_out()
                                                      perf_group_detach()
                                                      perf_child_detach()
                                                      list_del_event()
                                                      event->state = REVOKED;
                                                      cpc->task_epc = NULL; // event2 is last
                                                      ctx->is_active = 0;                        <--.
                                                      cpuctx->task_ctx = NULL;                      |
                                                                                                    |
                                                                                                    |
              <IPI __perf_event_disable>                                                            |
                remote_function()                                                                   |
                  event_function()                                                                  |
                    perf_ctx_lock(cpuctx, ctx2)                                                     |
                      raw_spin_lock(ctx2->lock)                                                     |
                                                                                                    |
                    WARN_ON_ONCE(!ctx2->is_active)                                                 -'
                    WARN_ON_ONCE(cpuctx->task_ctx != ctx2)

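
For anyone who wants to poke at the ordering without a kernel, the
interleave above can be replayed with a small userspace model. This is
only a sketch, not kernel code: struct ctx, struct cpu_ctx, warn_on(),
ipi_remove_from_context(), ipi_event_disable() and replay_race() are
all made-up stand-ins, the two "IPIs" simply run one after the other on
a single thread, and only the two fields checked by the WARN_ON_ONCE()s
are modelled.

```c
/* Userspace model of the interleave; NOT kernel code. */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the one perf_event_context field that matters here. */
struct ctx {
	bool is_active;
};

/* Hypothetical stand-in for the per-CPU context's task_ctx pointer. */
struct cpu_ctx {
	struct ctx *task_ctx;
};

/* Stand-in for WARN_ON_ONCE(): report a violated expectation, return 1 if it fired. */
static int warn_on(bool cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond ? 1 : 0;
}

/*
 * Models the tail of the __perf_remove_from_context(event2) IPI:
 * event2 is the last event in ctx2, so the context goes inactive.
 */
static void ipi_remove_from_context(struct cpu_ctx *cpuctx, struct ctx *ctx)
{
	ctx->is_active = false;
	if (cpuctx->task_ctx == ctx)
		cpuctx->task_ctx = NULL;
}

/*
 * Models the two sanity checks made on the __perf_event_disable IPI path.
 * Returns how many of them fired.
 */
static int ipi_event_disable(struct cpu_ctx *cpuctx, struct ctx *ctx)
{
	int warnings = 0;

	warnings += warn_on(!ctx->is_active, "!ctx2->is_active");
	warnings += warn_on(cpuctx->task_ctx != ctx, "cpuctx->task_ctx != ctx2");
	return warnings;
}

/*
 * Replays the two IPIs in a fixed order on one thread.
 * remove_first == true is the order from the interleave above.
 */
int replay_race(bool remove_first)
{
	struct ctx ctx2 = { .is_active = true };
	struct cpu_ctx cpuctx = { .task_ctx = &ctx2 };
	int warnings;

	if (remove_first) {
		ipi_remove_from_context(&cpuctx, &ctx2);
		return ipi_event_disable(&cpuctx, &ctx2);
	}

	warnings = ipi_event_disable(&cpuctx, &ctx2);
	ipi_remove_from_context(&cpuctx, &ctx2);
	return warnings;
}
```

Calling replay_race(true) from a main() runs the removal first and
trips both checks; replay_race(false) runs the disable first and trips
neither. The model says nothing about which lock should close the
window, it only shows that the late disable IPI sees state the removal
has already torn down.
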
Still trying to work out how best to avoid this.