[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250528161722.GL2566836@e132581.arm.com>
Date: Wed, 28 May 2025 17:17:22 +0100
From: Leo Yan <leo.yan@....com>
To: kan.liang@...ux.intel.com
Cc: peterz@...radead.org, mingo@...hat.com, namhyung@...nel.org,
irogers@...gle.com, mark.rutland@....com,
linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
eranian@...gle.com, ctshao@...gle.com, tmricht@...ux.ibm.com,
Aishwarya TCV <aishwarya.tcv@....com>
Subject: Re: [PATCH] perf: Fix the throttle error of some clock events
On Wed, May 28, 2025 at 07:48:23AM -0700, kan.liang@...ux.intel.com wrote:
> From: Kan Liang <kan.liang@...ux.intel.com>
>
> The Arm CI reports RCU stall, which can be reproduced by the below perf
> command.
> perf record -a -e cpu-clock -- sleep 2
>
> The cpu-clock and task_clock are two special SW events, which rely on
> the hrtimer. Instead of invoking the stop(), the HRTIMER_NORESTART is
> returned to stop the timer. Because the hrtimer interrupt handler cannot
> cancel itself, which causes infinite loop.
>
> There may be two ways to fix it.
> - Add a check of MAX_INTERRUPTS in the event_stop. Return immediately if
> the stop is invoked by the throttle.
> - Introduce a PMU flag to track the case. Avoid the event_stop in
> perf_event_throttle() if the flag is detected.
>
> The latter looks more generic. It may be used if there are more other
> cases that want to avoid the stop later. The latter is implemented.
>
> Reported-by: Leo Yan <leo.yan@....com>
> Reported-by: Aishwarya TCV <aishwarya.tcv@....com>
Thanks for adding me and my colleague's name!
> Closes: https://lore.kernel.org/lkml/20250527161656.GJ2566836@e132581.arm.com/
> Signed-off-by: Kan Liang <kan.liang@...ux.intel.com>
> ---
> include/linux/perf_event.h | 1 +
> kernel/events/core.c | 23 ++++++++++++++++++++---
> 2 files changed, 21 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 947ad12dfdbe..66f02f46595c 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -303,6 +303,7 @@ struct perf_event_pmu_context;
> #define PERF_PMU_CAP_AUX_OUTPUT 0x0080
> #define PERF_PMU_CAP_EXTENDED_HW_TYPE 0x0100
> #define PERF_PMU_CAP_AUX_PAUSE 0x0200
> +#define PERF_PMU_CAP_NO_THROTTLE_STOP 0x0400
When applying the patch on the top of tip/sched/core, I found it is
conflict with a new added PERF_PMU_CAP_AUX_PREFER_LARGE. So need to
change the PERF_PMU_CAP_NO_THROTTLE_STOP to 0x0800.
With the change, I tested on Arm64 board with two commands:
perf record -a -e cpu-clock -- sleep 2
perf test "perftool-testsuite_report"
Tested-by: Leo Yan <leo.yan@....com>
Thanks for quick fixing.
> /**
> * pmu::scope
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 8327ab0ee641..4df274705038 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2655,7 +2655,22 @@ static void perf_event_unthrottle(struct perf_event *event, bool start)
>
> static void perf_event_throttle(struct perf_event *event)
> {
> - event->pmu->stop(event, 0);
> + /*
> + * Some PMUs, e.g., cpu-clock and task_clock, may rely on
> + * a special mechanism (hrtimer) to manipulate counters.
> + * The regular stop doesn't work, since the hrtimer interrupt
> + * handler cannot cancel itself.
> + *
> + * The stop should be avoided for such cases. Let the
> + * driver-specific code handle it.
> + *
> + * The counters will eventually be disabled in the driver-specific
> + * code. In unthrottle, they still need to be re-enabled.
> + * There is no handling for PERF_PMU_CAP_NO_THROTTLE_STOP in
> + * the perf_event_unthrottle().
> + */
> + if (!(event->pmu->capabilities & PERF_PMU_CAP_NO_THROTTLE_STOP))
> + event->pmu->stop(event, 0);
> event->hw.interrupts = MAX_INTERRUPTS;
> perf_log_throttle(event, 0);
> }
> @@ -11846,7 +11861,8 @@ static int cpu_clock_event_init(struct perf_event *event)
> static struct pmu perf_cpu_clock = {
> .task_ctx_nr = perf_sw_context,
>
> - .capabilities = PERF_PMU_CAP_NO_NMI,
> + .capabilities = PERF_PMU_CAP_NO_NMI |
> + PERF_PMU_CAP_NO_THROTTLE_STOP,
> .dev = PMU_NULL_DEV,
>
> .event_init = cpu_clock_event_init,
> @@ -11928,7 +11944,8 @@ static int task_clock_event_init(struct perf_event *event)
> static struct pmu perf_task_clock = {
> .task_ctx_nr = perf_sw_context,
>
> - .capabilities = PERF_PMU_CAP_NO_NMI,
> + .capabilities = PERF_PMU_CAP_NO_NMI |
> + PERF_PMU_CAP_NO_THROTTLE_STOP,
> .dev = PMU_NULL_DEV,
>
> .event_init = task_clock_event_init,
> --
> 2.38.1
>
Powered by blists - more mailing lists