[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20251125142036.GE724103@e132581.arm.com>
Date: Tue, 25 Nov 2025 14:20:36 +0000
From: Leo Yan <leo.yan@....com>
To: Will Deacon <will@...nel.org>
Cc: Mark Rutland <mark.rutland@....com>,
Alexandru Elisei <alexandru.elisei@....com>,
James Clark <james.clark@...aro.org>,
linux-arm-kernel@...ts.infradead.org,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/2] perf: arm_spe: Correct setting the
PERF_HES_STOPPED flag
On Mon, Nov 24, 2025 at 07:02:06PM +0000, Will Deacon wrote:
[...]
> > > If __arm_spe_pmu_next_off() fails, it will call perf_aux_output_end()
> > > with the TRUNCATED flag set, which should then disable the event
> > > via arm_spe_pmu_del() and update the state there.
> > >
> > > Is that not happening?
> >
> > Correct. However, this patch is not for the flow you mentioned.
>
> How is it not for this flow? You're talking about:
>
> arm_spe_pmu_start
> => arm_spe_perf_aux_output_begin
> => arm_spe_pmu_next_off // Returns error
>
> The only way arm_spe_pmu_next_off() returns an error is if
> __arm_spe_pmu_next_off() fails, and that's the flow I'm talking about.
My bad. Because you mentioned the TRUNCATED flag, I incorrectly assumed
it had to be used in interrupt handler with the disable irq work.
> > If an error is returned from arm_spe_pmu_next_off(), because hw.state
> > is not set to PERF_HES_STOPPED, the caller arm_spe_pmu_start() cannot
> > detect error properly:
>
> But why isn't PERF_HES_STOPPED set by the sequence I described?
Fair point. I can confirm after settting the TRUNCATED flag,
arm_spe_pmu_del() will be invoked to disable the trace unit and state
will be updated to PERF_HES_STOPPED.
> I have a feeling you're right, but I can't piece it together from the
> information here.
Let me explain in another way:
The issue is a mismatch between the state machine and the hardware
state. When arm_spe_perf_aux_output_begin() detects an error and does
not set PMBLIMITR_EL1_E, the trace unit is effectively stopped, but
the state machine is not updated to PERF_HES_STOPPED. This causes
callers to handle errors incorrectly [1][2].
It is arguable that the disable IRQ work will eventually disable the
trace unit and update hw.state, but the state should be updated in the
first place by the PMU driver to notify even core layer.
Thanks,
Leo
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/perf/arm_spe_pmu.c?//h=v6.18-rc7#n855
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/events/core.c#n2742
Powered by blists - more mailing lists