linux-kernel - Re: [PATCH v3] perf: arm_spe: Properly set hw.state on failures


Message-ID: <20260122163010.GA40455@e132581.arm.com>
Date: Thu, 22 Jan 2026 16:30:10 +0000
From: Leo Yan <leo.yan@....com>
To: Will Deacon <will@...nel.org>
Cc: Mark Rutland <mark.rutland@....com>,
	Alexandru Elisei <alexandru.elisei@....com>,
	James Clark <james.clark@...aro.org>,
	linux-arm-kernel@...ts.infradead.org,
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] perf: arm_spe: Properly set hw.state on failures

On Thu, Jan 22, 2026 at 01:38:24PM +0000, Will Deacon wrote:
> On Wed, Jan 21, 2026 at 11:33:21AM +0000, Leo Yan wrote:

[...]

> > +static void arm_spe_pmu_stop(struct perf_event *event, int flags);
> 
> This is fine, but I'm also happy if you want to move the functions around
> to avoid the forward declaration.

I also dislike the forward declaration, but the current code is well
organized (the PMU start/stop/add/del helpers are grouped together).

Moving arm_spe_pmu_stop() elsewhere might hurt readability.

> > @@ -642,6 +643,7 @@ static void arm_spe_perf_aux_output_begin(struct perf_output_handle *handle,
> >  
> >  out_write_limit:
> >  	write_sysreg_s(limit, SYS_PMBLIMITR_EL1);
> > +	return (limit & PMBLIMITR_EL1_E) ? 0 : -EAGAIN;
> 
> I'd probably go with -EIO here. -EAGAIN implies that if the caller
> retries the operation then it might succeed, which probably isn't the
> case for these failures.

Will do.
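
For illustration only, the agreed change can be modeled in user space as
below.  This is a sketch, not the driver code: model_output_begin_status()
and MODEL_PMBLIMITR_E are hypothetical names, and placing the enable bit
at bit 0 is an assumption made for the model.

```c
#include <errno.h>
#include <stdint.h>

/* Assumption for this model: the buffer enable bit is bit 0. */
#define MODEL_PMBLIMITR_E (UINT64_C(1) << 0)

/*
 * Hypothetical stand-in for the tail of arm_spe_perf_aux_output_begin():
 * report success only if the limit value leaves the buffer enabled.
 * -EIO is returned because retrying is not expected to help.
 */
static int model_output_begin_status(uint64_t limit)
{
	return (limit & MODEL_PMBLIMITR_E) ? 0 : -EIO;
}
```

A caller would treat any non-zero return as "profiling did not start"
rather than as a hint to retry.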

> >  static void arm_spe_perf_aux_output_end(struct perf_output_handle *handle)
> > @@ -781,7 +783,10 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev)
> >  		 * when we get to it.
> >  		 */
> >  		if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) {
> > -			arm_spe_perf_aux_output_begin(handle, event);
> > +			if (arm_spe_perf_aux_output_begin(handle, event)) {
> > +				arm_spe_pmu_stop(event, PERF_EF_UPDATE);
> 
> Why do you need to pass PERF_EF_UPDATE in this case?

The main purpose is to read PMSICR_EL1 and save the remaining count in
"hwc->period_left", which may be used on the next enable.

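A simplified user-space model of what the update flag buys us is shown
below.  It is only a sketch under stated assumptions: the MODEL_* names,
struct model_hw_event and model_interval_counter are hypothetical, and
the global variable merely stands in for reading PMSICR_EL1.

```c
#include <stdint.h>

/* Hypothetical flag and state bits used only by this model. */
#define MODEL_EF_UPDATE    0x04
#define MODEL_HES_STOPPED  0x01
#define MODEL_HES_UPTODATE 0x02

struct model_hw_event {
	unsigned int state;
	uint64_t period_left;
};

/* Stand-in for the value that would be read from PMSICR_EL1. */
static uint64_t model_interval_counter;

/* Modeled stop: save the remaining count only when an update is requested. */
static void model_stop(struct model_hw_event *hwc, int flags)
{
	if (hwc->state & MODEL_HES_STOPPED)
		return;		/* already stopped, nothing to do */

	if (flags & MODEL_EF_UPDATE) {
		hwc->period_left = model_interval_counter;
		hwc->state |= MODEL_HES_UPTODATE;
	}

	hwc->state |= MODEL_HES_STOPPED;
}
```

In this model, stopping with flags == 0 leaves period_left untouched, so
a later restart would not see the count remaining at the time of the
stop.
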
> It looks to me
> like we're going to get into a mess with PMBSR_EL1, as that will get
> re-read by arm_spe_pmu_buf_get_fault_act() in arm_spe_pmu_stop()
> before we've cleared it here in the irq handler.

arm_spe_perf_aux_output_begin() fails either because
perf_aux_output_begin() fails to start a new session, or because an
invalid limit is detected and perf_aux_output_end(handle, 0) is
called.

In either case, perf_get_aux(handle) returns NULL after the failure,
and arm_spe_pmu_stop() has no chance to run into the path that
re-reads PMBSR_EL1.
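
The argument can be sketched with a small user-space model.  All names
here are hypothetical: struct model_handle stands in for the perf output
handle, its aux pointer for what perf_get_aux() returns, and the counter
for a read of PMBSR_EL1.  It is a simplification, not the driver code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the output handle; aux == NULL means no session. */
struct model_handle {
	void *aux;
};

/* Counts how often the modeled stop path reads the buffer status. */
static int model_status_reads;

/* Modeled failure of the begin step: no session is left behind. */
static int model_output_begin_fail(struct model_handle *h)
{
	h->aux = NULL;
	return -EIO;
}

/* Modeled stop with the update flag: status is read only with a live session. */
static void model_stop_update(struct model_handle *h)
{
	if (h->aux) {
		model_status_reads++;	/* stands in for reading PMBSR_EL1 */
		h->aux = NULL;		/* the session ends here */
	}
}
```

After a failed begin the handle has no session, so the modeled stop path
skips the status read entirely.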

> I was expecting that we would always pass 0 for the flags when handling
> the case where we get an error back from arm_spe_perf_aux_output_begin().

We can look at this another way.  If we do not call arm_spe_pmu_stop()
in the interrupt handler and instead defer stopping the trace to
arm_spe_pmu_del(), the PERF_EF_UPDATE flag is still passed.  This patch
simply keeps the same behavior while stopping the trace earlier, in the
interrupt handler.  Does that make sense?

Thanks,
Leo