Message-ID: <YzM/BUsBnX18NoOG@hirez.programming.kicks-ass.net>
Date: Tue, 27 Sep 2022 20:20:53 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Marco Elver <elver@...gle.com>
Cc: Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
kasan-dev@...glegroups.com, Dmitry Vyukov <dvyukov@...gle.com>
Subject: Re: [PATCH] perf: Fix missing SIGTRAPs due to pending_disable abuse

On Tue, Sep 27, 2022 at 02:13:22PM +0200, Marco Elver wrote:
> Due to the implementation of how SIGTRAP are delivered if
> perf_event_attr::sigtrap is set, we've noticed 3 issues:
>
> 1. Missing SIGTRAP due to a race with event_sched_out() (more
> details below).
>
> 2. Hardware PMU events being disabled due to returning 1 from
> perf_event_overflow(). The only way to re-enable the event is
> for user space to first "properly" disable the event and then
> re-enable it.
>
> 3. The inability to automatically disable an event after a
> specified number of overflows via PERF_EVENT_IOC_REFRESH.
>
> The worst of the 3 issues is problem (1), which occurs when a
> pending_disable is "consumed" by a racing event_sched_out(), observed as
> follows:
>
> CPU0                            | CPU1
> --------------------------------+---------------------------
> __perf_event_overflow()         |
>  perf_event_disable_inatomic()  |
>   pending_disable = CPU0        | ...
>                                 |  _perf_event_enable()
>                                 |   event_function_call()
>                                 |    task_function_call()
>                                 |     /* sends IPI to CPU0 */
> <IPI>                           | ...
>  __perf_event_enable()          +---------------------------
>   ctx_resched()
>    task_ctx_sched_out()
>     ctx_sched_out()
>      group_sched_out()
>       event_sched_out()
>        pending_disable = -1
> </IPI>
> <IRQ-work>
>  perf_pending_event()
>   perf_pending_event_disable()
>    /* Fails to send SIGTRAP because no pending_disable! */
> </IRQ-work>
>
> In the above case, not only is that particular SIGTRAP missed, but also
> all future SIGTRAPs because 'event_limit' is not reset back to 1.
>
> To fix, rework pending delivery of SIGTRAP via IRQ-work by introduction
> of a separate 'pending_sigtrap', no longer using 'event_limit' and
> 'pending_disable' for its delivery.
>
> During testing, this also revealed several more possible races between
> reschedules and pending IRQ work; see code comments for details.

Perhaps use task_work_add() for this case? That runs on the
return-to-user path, so then it doesn't matter how many reschedules
happen in between.

The only concern is that task_work_add() uses kasan_record_aux_stack()
which obviously isn't NMI clean, so that would need to get removed or
made conditional.
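
To illustrate why deferring to the return-to-user path sidesteps the
race, here is a small userspace model. It is only a sketch under loud
assumptions: every model_* name is hypothetical and none of this is
kernel code. It mimics the idea behind task_work_add() (queue a
callback on the task, run it when the task returns to user space) and
contrasts it with a flag that a reschedule can consume. It does not
model the NMI-safety concern at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_work {
	struct model_work *next;
	void (*func)(struct model_work *work);
};

struct model_task {
	struct model_work *works;	/* callbacks queued for return-to-user */
	int sigtraps_delivered;
	int pending_disable;		/* CPU number, or -1 if nothing pending */
};

struct model_event {
	struct model_work work;
	struct model_task *task;
};

/* Queue a callback on the task (the idea behind task_work_add()). */
static void model_task_work_add(struct model_task *t, struct model_work *w)
{
	w->next = t->works;
	t->works = w;
}

/* A reschedule: the sched-out path resets pending_disable, queued work stays. */
static void model_resched(struct model_task *t)
{
	t->pending_disable = -1;
}

/* Flag-based path: deliver only if the flag survived until IRQ-work runs. */
static void model_irq_work_old(struct model_task *t, int cpu)
{
	if (t->pending_disable != cpu)
		return;			/* flag was consumed: SIGTRAP is lost */
	t->pending_disable = -1;
	t->sigtraps_delivered++;
}

/* Return-to-user path: run every queued callback exactly once. */
static void model_return_to_user(struct model_task *t)
{
	struct model_work *w = t->works;

	t->works = NULL;
	while (w) {
		struct model_work *next = w->next;

		assert(w->func != NULL);
		w->func(w);
		w = next;
	}
}

static void model_deliver_sigtrap(struct model_work *w)
{
	struct model_event *ev = (struct model_event *)
		((char *)w - offsetof(struct model_event, work));

	ev->task->sigtraps_delivered++;
}

/* Overflow on CPU0, then 'rescheds' reschedules, then IRQ-work. */
int model_old_path(int rescheds)
{
	struct model_task t = { NULL, 0, -1 };

	t.pending_disable = 0;
	for (int i = 0; i < rescheds; i++)
		model_resched(&t);
	model_irq_work_old(&t, 0);
	return t.sigtraps_delivered;
}

/* Overflow queues task work, then 'rescheds' reschedules, then return to user. */
int model_task_work_path(int rescheds)
{
	struct model_task t = { NULL, 0, -1 };
	struct model_event ev = { { NULL, model_deliver_sigtrap }, &t };

	model_task_work_add(&t, &ev.work);
	for (int i = 0; i < rescheds; i++)
		model_resched(&t);
	model_return_to_user(&t);
	return t.sigtraps_delivered;
}
```

In this model, model_old_path(1) delivers nothing because the
reschedule consumed the flag, while model_task_work_path() delivers
one SIGTRAP no matter how many reschedules happen in between.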