Message-ID: <CAP-5=fX9+XouUy=KeFbHw_GEMKjHteg_0OPvvohhe6hxgVP6UQ@mail.gmail.com>
Date: Wed, 4 Jun 2025 13:08:57 -0700
From: Ian Rogers <irogers@...gle.com>
To: kan.liang@...ux.intel.com
Cc: peterz@...radead.org, mingo@...hat.com, namhyung@...nel.org,
mark.rutland@....com, linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org, eranian@...gle.com, ctshao@...gle.com,
tmricht@...ux.ibm.com, Leo Yan <leo.yan@....com>,
Aishwarya TCV <aishwarya.tcv@....com>, Alexei Starovoitov <alexei.starovoitov@...il.com>,
Venkat Rao Bagalkote <venkat88@...ux.ibm.com>
Subject: Re: [PATCH V3] perf: Fix the throttle error of some clock events
On Wed, Jun 4, 2025 at 10:16 AM <kan.liang@...ux.intel.com> wrote:
>
> From: Kan Liang <kan.liang@...ux.intel.com>
>
> Both ARM and IBM CI report RCU stalls, which can be reproduced with the
> perf command below.
> perf record -a -e cpu-clock -- sleep 2
>
> The issue is introduced by the generic throttle patch set, which
> unconditionally invokes event_stop() when the throttle is triggered.
>
> cpu-clock and task-clock are two special SW events that rely on an
> hrtimer. For them, the throttle is invoked from the hrtimer handler
> itself, so event_stop()->hrtimer_cancel() waits for the very handler it
> is running in to finish, which is a deadlock. Instead of invoking
> stop(), the handler should return HRTIMER_NORESTART to stop the timer.
>
> There are two possible ways to fix it.
> - Introduce a PMU flag to track the case. Avoid the event_stop in
>   perf_event_throttle() if the flag is detected.
>   This was implemented in
>   https://lore.kernel.org/lkml/20250528175832.2999139-1-kan.liang@linux.intel.com/
>   The new flag was considered overkill for the issue.
> - Add a check in the event_stop. Return immediately if the throttle is
>   invoked in the hrtimer handler. Rely on the existing HRTIMER_NORESTART
>   method to stop the timer.
>
> The latter is implemented here.
>
> Move event->hw.interrupts = MAX_INTERRUPTS before the stop(). This makes
> the order the same as in perf_event_unthrottle(). Apart from this patch,
> nothing checks hw.interrupts in stop(), so the order change has no
> other impact.
>
> Reported-by: Leo Yan <leo.yan@....com>
> Reported-by: Aishwarya TCV <aishwarya.tcv@....com>
> Closes: https://lore.kernel.org/lkml/20250527161656.GJ2566836@e132581.arm.com/
> Reported-by: Alexei Starovoitov <alexei.starovoitov@...il.com>
> Closes: https://lore.kernel.org/lkml/djxlh5fx326gcenwrr52ry3pk4wxmugu4jccdjysza7tlc5fef@ktp4rffawgcw/
> Reported-by: Venkat Rao Bagalkote <venkat88@...ux.ibm.com>
> Closes: https://lore.kernel.org/lkml/8e8f51d8-af64-4d9e-934b-c0ee9f131293@linux.ibm.com/
> Signed-off-by: Kan Liang <kan.liang@...ux.intel.com>

Reviewed-by: Ian Rogers <irogers@...gle.com>

Thanks,
Ian

> ---
>
> Changes since V2:
> - Apply a different fix for the issue.
>   Remove all Tested-by tags since the fix has changed.
> - Update the change log
> - Add more Reported-by tags
>
> kernel/events/core.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index f34c99f8ce8f..46441c23475d 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2656,8 +2656,8 @@ static void perf_event_unthrottle(struct perf_event *event, bool start)
>
>  static void perf_event_throttle(struct perf_event *event)
>  {
> -	event->pmu->stop(event, 0);
>  	event->hw.interrupts = MAX_INTERRUPTS;
> +	event->pmu->stop(event, 0);
>  	if (event == event->group_leader)
>  		perf_log_throttle(event, 0);
>  }
> @@ -11749,7 +11749,12 @@ static void perf_swevent_cancel_hrtimer(struct perf_event *event)
>  {
>  	struct hw_perf_event *hwc = &event->hw;
>
> -	if (is_sampling_event(event)) {
> +	/*
> +	 * The throttle can be triggered in the hrtimer handler.
> +	 * The HRTIMER_NORESTART should be used to stop the timer,
> +	 * rather than hrtimer_cancel(). See perf_swevent_hrtimer()
> +	 */
> +	if (is_sampling_event(event) && (hwc->interrupts != MAX_INTERRUPTS)) {
>  		ktime_t remaining = hrtimer_get_remaining(&hwc->hrtimer);
>  		local64_set(&hwc->period_left, ktime_to_ns(remaining));
>
> --
> 2.38.1
>
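
For readers following the control flow, here is a rough user-space sketch of
the problem and of the check added by the patch. It is not kernel code: every
toy_* name and TOY_MAX_INTERRUPTS are hypothetical stand-ins, and the cancel
function only counts a "wait on myself" instead of actually spinning as a
cancel called from inside its own timer callback would.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
#define TOY_MAX_INTERRUPTS (~0ULL)

enum toy_restart { TOY_NORESTART, TOY_RESTART };

struct toy_event {
	unsigned long long interrupts; /* models event->hw.interrupts */
	bool handler_running;          /* timer callback currently executing */
	bool timer_armed;              /* timer will fire again */
	int self_waits;                /* cancels issued from inside the callback */
};

/*
 * Models a cancel that waits for a running callback to finish. Called from
 * that same callback, the wait could never end, so the toy version only
 * counts the occurrence instead of spinning.
 */
static void toy_timer_cancel(struct toy_event *ev)
{
	assert(ev != NULL);
	if (ev->handler_running)
		ev->self_waits++;
	ev->timer_armed = false;
}

/* Models the stop path; 'patched' enables the interrupts check. */
static void toy_stop(struct toy_event *ev, bool patched)
{
	if (patched && ev->interrupts == TOY_MAX_INTERRUPTS)
		return; /* throttled: leave the timer to the callback's return value */
	toy_timer_cancel(ev);
}

/* Models the throttle path with the assignment placed before stop. */
static void toy_throttle(struct toy_event *ev, bool patched)
{
	ev->interrupts = TOY_MAX_INTERRUPTS;
	toy_stop(ev, patched);
}

/* Models a timer callback that hits the throttle on this tick. */
static enum toy_restart toy_timer_callback(struct toy_event *ev, bool patched)
{
	ev->handler_running = true;
	toy_throttle(ev, patched);
	ev->handler_running = false;
	return TOY_NORESTART; /* throttled, so do not re-arm */
}

/* Fires the timer once and returns the resulting state. */
static struct toy_event toy_fire(bool patched)
{
	struct toy_event ev = { .timer_armed = true };

	if (toy_timer_callback(&ev, patched) == TOY_NORESTART)
		ev.timer_armed = false;
	return ev;
}
```

In this model, toy_fire(false) records one self-wait (the deadlock case),
while toy_fire(true) records none and the timer still ends up disarmed
through the callback's return value.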