[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251209155722.GX724103@e132581.arm.com>
Date: Tue, 9 Dec 2025 15:57:22 +0000
From: Leo Yan <leo.yan@....com>
To: Anshuman Khandual <anshuman.khandual@....com>
Cc: Suzuki K Poulose <suzuki.poulose@....com>,
Mike Leach <mike.leach@...aro.org>,
James Clark <james.clark@...aro.org>,
Yeoreum Yun <yeoreum.yun@....com>, Will Deacon <will@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Tamas Petz <tamas.petz@....com>,
Tamas Zsoldos <tamas.zsoldos@....com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Namhyung Kim <namhyung@...nel.org>, Jiri Olsa <jolsa@...nel.org>,
Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>, coresight@...ts.linaro.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org
Subject: Re: [PATCH 06/19] coresight: trbe: Refactor syndrome decoding
On Fri, Dec 05, 2025 at 09:40:33AM +0530, Anshuman Khandual wrote:
> On 01/12/25 4:51 PM, Leo Yan wrote:
> > It gives priority to TRBSR_EL1.EA (external abort); an external abort
> > will immediately bail out and return an error.
>
> This is an abrupt starting for a commit message without giving context.
> The rationale for the change needs to be explained rather than details
> of implementation.
Agreed. I will refine the commit log to state that we need to diagnose
the error code based on the hierarchy information (EC and BSC).
> > Next, the syndrome decoding is refactored based on two levels of
> > information: the EC (Event Class) bits and the BSC (Trace Buffer Status
> > Code) bits.
> >
> > If TRBSR_EL1.EC==0b000000, the driver continues parsing TRBSR_EL1.BSC to
> > identify the specific trace buffer event. Otherwise, any non-zero
> > TRBSR_EL1.EC is treated as an error.
> >
> > For error cases, the driver prints an error string and dumps registers
> > for debugging.
> >
> > No additional checks are required for wrap mode beyond verifying the
> > TRBSR_EL1.WRAP bit, even on units with overwrite errata, as this bit
> > reliably indicates a buffer wrap.
>
> Should the errata related changes be done in a separate patch instead ?
Good point. I will consider extracting the change; this would make
the review clearer.
[...]
> > static enum trbe_fault_action trbe_get_fault_act(struct perf_output_handle *handle,
> > u64 trbsr)
> > {
> > + const char *err_str;
> > int ec = get_trbe_ec(trbsr);
> > int bsc = get_trbe_bsc(trbsr);
> > - struct trbe_buf *buf = etm_perf_sink_config(handle);
> > - struct trbe_cpudata *cpudata = buf->cpudata;
> >
> > WARN_ON(is_trbe_running(trbsr));
> > - if (is_trbe_trg(trbsr) || is_trbe_abort(trbsr))
> > - return TRBE_FAULT_ACT_FATAL;
> >
> > - if ((ec == TRBE_EC_STAGE1_ABORT) || (ec == TRBE_EC_STAGE2_ABORT))
> > - return TRBE_FAULT_ACT_FATAL;
> > + if (is_trbe_abort(trbsr)) {
> > + err_str = "External abort";
> > + goto out_fatal;
> > + }
> >
> > - /*
> > - * If the trbe is affected by TRBE_WORKAROUND_OVERWRITE_FILL_MODE,
> > - * it might write data after a WRAP event in the fill mode.
> > - * Thus the check TRBPTR == TRBBASER will not be honored.
> > - */
> > - if ((is_trbe_wrap(trbsr) && (ec == TRBE_EC_OTHERS) && (bsc == TRBE_BSC_FILLED)) &&
> > - (trbe_may_overwrite_in_fill_mode(cpudata) ||
> > - get_trbe_write_pointer() == get_trbe_base_pointer()))
> > + switch (ec) {
> > + case TRBE_EC_OTHERS:
> > + break;
>
> No message for this ?
No. EC_OTHERS means "this is a normal maintenance interrupt".
> > + case TRBE_EC_BUF_MGMT_IMPL:
> > + err_str = "Unexpected implemented management";
> > + goto out_fatal;
> > + case TRBE_EC_GP_CHECK_FAULT:
> > + err_str = "Granule Protection Check fault";
> > + goto out_fatal;
> > + case TRBE_EC_STAGE1_ABORT:
> > + err_str = "Stage 1 data abort";
> > + goto out_fatal;
> > + case TRBE_EC_STAGE2_ABORT:
> > + err_str = "Stage 2 data abort";
> > + goto out_fatal;
> > + default:
> > + err_str = "Unknown error code";
> > + goto out_fatal;
> > + }
> > +
> > + switch (bsc) {
> > + case TRBE_BSC_NOT_STOPPED:
> > + break;
> > + case TRBE_BSC_FILLED:
> > + break;
> > + case TRBE_BSC_TRIGGERED:
> > + err_str = "Unexpected trigger status";
> > + goto out_fatal;
> > + default:
> > + err_str = "Unexpected buffer status code";
> > + goto out_fatal;
> > + }
>
> Just wondering if it would be cleaner to add a const char * based
> static array mapping these EC/BSC codes with above error messages.
Now we have two level's (EC/BSC) error codes, it would be complex to
maintain a string array and map to two level's error codes.
> > + if (is_trbe_wrap(trbsr))
>
> But what about TRBE affected with trbe_may_overwrite_in_fill_mode()
> is that still being taken care of some how ?
The TRBSR_EL1.WRAP bit reliably indicates that the pointer has
wrapped. I don't think we need any special handling for the overwrite
erratum here; we can simply return TRBE_FAULT_ACT_WRAP.
Afterward, trbe_get_trace_size() consumes the wrap flag and will
compute the trace size appropriately for the overwrite erratum when
the wrap flag is set.
[...]
> > -#define TRBE_EC_OTHERS 0
> > -#define TRBE_EC_STAGE1_ABORT 36
> > -#define TRBE_EC_STAGE2_ABORT 37
> > +#define TRBE_EC_OTHERS 0x0
> > +#define TRBE_EC_GP_CHECK_FAULT 0X1e
> > +#define TRBE_EC_BUF_MGMT_IMPL 0x1f
> > +#define TRBE_EC_STAGE1_ABORT 0x24
> > +#define TRBE_EC_STAGE2_ABORT 0x25
>
> Please do mention the document source for these new EC codes in the
> commit message.
As James suggested in another reply, I will consider to add sysreg
enum.
Thanks,
Leo
Powered by blists - more mailing lists