Message-ID: <20250527161656.GJ2566836@e132581.arm.com>
Date: Tue, 27 May 2025 17:16:56 +0100
From: Leo Yan <leo.yan@....com>
To: kan.liang@...ux.intel.com
Cc: peterz@...radead.org, mingo@...hat.com, namhyung@...nel.org,
irogers@...gle.com, mark.rutland@....com,
linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
eranian@...gle.com, ctshao@...gle.com, tmricht@...ux.ibm.com,
Aishwarya.TCV@....com
Subject: Re: [PATCH V4 01/16] perf: Fix the throttle logic for a group
Hi Kan,
[ + Aishwarya ]
On Tue, May 20, 2025 at 11:16:29AM -0700, kan.liang@...ux.intel.com wrote:
[...]
> @@ -10331,8 +10358,7 @@ __perf_event_account_interrupt(struct perf_event *event, int throttle)
> if (unlikely(throttle && hwc->interrupts >= max_samples_per_tick)) {
> __this_cpu_inc(perf_throttled_count);
> tick_dep_set_cpu(smp_processor_id(), TICK_DEP_BIT_PERF_EVENTS);
> - hwc->interrupts = MAX_INTERRUPTS;
> - perf_log_throttle(event, 0);
> + perf_event_throttle_group(event);
> ret = 1;
> }
Our (Arm) CI reports an RCU stall caused by this patch. A simple
command using the cpu-clock event is enough to hang the system:
  perf record -a -e cpu-clock -- sleep 2
I confirmed that the issue goes away if the throttling code is removed
for the cpu-clock event. From reading the code, the flow is as follows:
  hrtimer interrupt:
    `> __perf_event_account_interrupt()
         `> perf_event_throttle_group()
              `> perf_event_throttle()
                   `> cpu_clock_event_stop()
                        `> perf_swevent_cancel_hrtimer()
                             `> hrtimer_cancel()  -> Infinite loop.
The hrtimer interrupt handler tries to cancel its own timer.
hrtimer_cancel() waits for the running callback to finish, but the
caller is that callback, so it loops forever. Please consider fixing
the issue.
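
To make the self-cancel problem concrete, here is a rough user-space model of it. This is only a sketch, not kernel code: every toy_* name is made up for illustration, and the retry bound (max_tries) exists only so the demo terminates. The real hrtimer_cancel() has no such bound. The one thing it borrows from the kernel is the return convention of hrtimer_try_to_cancel(): 1 if the timer was queued and got removed, 0 if it was not queued, -1 if its callback is currently running.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct hrtimer: only the state the demo needs. */
struct toy_timer {
	bool queued;           /* armed and waiting to fire */
	bool callback_running; /* its handler is executing right now */
};

/*
 * Same return convention as hrtimer_try_to_cancel():
 *   1  timer was queued and has been removed
 *   0  timer was not queued
 *  -1  the handler is running, cannot cancel now
 */
static int toy_try_to_cancel(struct toy_timer *t)
{
	if (t->callback_running)
		return -1;
	if (t->queued) {
		t->queued = false;
		return 1;
	}
	return 0;
}

/*
 * Models the retry loop in hrtimer_cancel(): keep trying until the
 * handler is no longer running. max_tries is a demo-only bound;
 * -1 is returned when the loop gives up.
 */
static int toy_cancel(struct toy_timer *t, int max_tries)
{
	assert(max_tries > 0);
	for (int i = 0; i < max_tries; i++) {
		int ret = toy_try_to_cancel(t);

		if (ret >= 0)
			return ret;
	}
	return -1;
}

/* A handler that cancels its own timer, as the throttle path does. */
static int toy_handler_cancels_itself(struct toy_timer *t, int max_tries)
{
	int ret;

	t->callback_running = true;
	ret = toy_cancel(t, max_tries); /* can never see ret >= 0 here */
	t->callback_running = false;
	return ret;
}
```

Calling toy_handler_cancels_itself() exhausts every retry and returns -1 however large max_tries is, because callback_running is only cleared after the cancel returns. Calling toy_cancel() from outside the handler succeeds on the first try.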
Thanks,
Leo