[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJ9a7VjTJ4E7=qdPgfTTVjoW01rCzd-f1mFYHjtA3BHk25saMA@mail.gmail.com>
Date: Fri, 11 Mar 2022 15:56:00 +0000
From: Mike Leach <mike.leach@...aro.org>
To: James Clark <James.Clark@....com>
Cc: Suzuki K Poulose <suzuki.poulose@....com>,
mathieu.poirier@...aro.org, coresight@...ts.linaro.org,
leo.yan@...aro.com, Leo Yan <leo.yan@...aro.org>,
John Garry <john.garry@...wei.com>,
Will Deacon <will@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Namhyung Kim <namhyung@...nel.org>,
linux-arm-kernel@...ts.infradead.org, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org
Subject: Re: [PATCH v2 1/6] coresight: Add config flag to enable branch broadcast
Hi James,
On Fri, 11 Mar 2022 at 14:58, James Clark <james.clark@....com> wrote:
>
>
>
> On 02/02/2022 20:25, Mike Leach wrote:
> > Hi James, Suzuki
> >
> > On Fri, 28 Jan 2022 at 11:19, Suzuki K Poulose <suzuki.poulose@....com> wrote:
> >>
> >> On 13/01/2022 09:10, James Clark wrote:
> >>> When enabled, all taken branch addresses are output, even if the branch
> >>> was because of a direct branch instruction. This enables reconstruction
> >>> of the program flow without having access to the memory image of the
> >>> code being executed.
> >>>
> >>> Use bit 8 for the config option which would be the correct bit for
> >>> programming ETMv3. Although branch broadcast can't be enabled on ETMv3
> >>> because it's not in the define ETM3X_SUPPORTED_OPTIONS, using the
> >>> correct bit might help prevent future collisions or allow it to be
> >>> enabled if needed.
> >>>
> >>> Signed-off-by: James Clark <james.clark@....com>
> >>> ---
> >>> drivers/hwtracing/coresight/coresight-etm-perf.c | 2 ++
> >>> drivers/hwtracing/coresight/coresight-etm4x-core.c | 10 ++++++++++
> >>> include/linux/coresight-pmu.h | 2 ++
> >>> 3 files changed, 14 insertions(+)
> >>>
> >>> diff --git a/drivers/hwtracing/coresight/coresight-etm-perf.c b/drivers/hwtracing/coresight/coresight-etm-perf.c
> >>> index c039b6ae206f..43bbd5dc3d3b 100644
> >>> --- a/drivers/hwtracing/coresight/coresight-etm-perf.c
> >>> +++ b/drivers/hwtracing/coresight/coresight-etm-perf.c
> >>> @@ -52,6 +52,7 @@ static DEFINE_PER_CPU(struct coresight_device *, csdev_src);
> >>> * The PMU formats were orignally for ETMv3.5/PTM's ETMCR 'config';
> >>> * now take them as general formats and apply on all ETMs.
> >>> */
> >>> +PMU_FORMAT_ATTR(branch_broadcast, "config:"__stringify(ETM_OPT_BRANCH_BROADCAST));
> >>> PMU_FORMAT_ATTR(cycacc, "config:" __stringify(ETM_OPT_CYCACC));
> >>> /* contextid1 enables tracing CONTEXTIDR_EL1 for ETMv4 */
> >>> PMU_FORMAT_ATTR(contextid1, "config:" __stringify(ETM_OPT_CTXTID));
> >>> @@ -97,6 +98,7 @@ static struct attribute *etm_config_formats_attr[] = {
> >>> &format_attr_sinkid.attr,
> >>> &format_attr_preset.attr,
> >>> &format_attr_configid.attr,
> >>> + &format_attr_branch_broadcast.attr,
> >>
> >> Does it make sense to hide the option if the bb is not supported ? I
> >> guess it will be tricky as we don't track the common feature set. So,
> >> that said...
> >>
> >>> NULL,
> >>> };
> >>>
> >>> diff --git a/drivers/hwtracing/coresight/coresight-etm4x-core.c b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> >>> index bf18128cf5de..04669ecc0efa 100644
> >>> --- a/drivers/hwtracing/coresight/coresight-etm4x-core.c
> >>> +++ b/drivers/hwtracing/coresight/coresight-etm4x-core.c
> >>> @@ -692,6 +692,16 @@ static int etm4_parse_event_config(struct coresight_device *csdev,
> >>> ret = cscfg_csdev_enable_active_config(csdev, cfg_hash, preset);
> >>> }
> >>>
> >>> + /* branch broadcast - enable if selected and supported */
> >>> + if (attr->config & BIT(ETM_OPT_BRANCH_BROADCAST)) {
> >>> + if (!drvdata->trcbb) {
> >>> + ret = -EINVAL;
> >>
> >> Should we fail here ? We could simply ignore this and generate the trace
> >> normally. This would work on a big.LITTLE system with one set missing
> >> the branch broadcast, while the others support.
> >>
> >> Mike,
> >>
> >> Does this affect the trace decoding ? As such the OpenCSD should be able
> >> to decode the packets as they appear in the stream. Correct ?
> >>
> >
> > Depends on what you mean by affect the trace decoding!
> > From the simplest perspective - no - there is no issue as the packets
> > will be decode as seen. THE decoder does not know that BB exists - nor
> > if it is enabled.
> >
> > However, the reason that a user may engage BB is that the code is in
> > some way self modifying - so that following the code static image and
> > calculating addresses will result in a different path taken.
> >
> > e.g. imagine an opcode:-
> >
> > B <address0>
> >
> > Without BB, this will trace as a single E atom, the decoder will
> > calculate address0 from the opcode in the static image and continue
> > from there as the next trace address.
> >
> > Now look at the case where this is changed on the fly to
> >
> > B <address1>
> >
> > With BB, This will trace to
> > E
> > TGT_ADDR<address1>
> >
> > The decoder will initially extract address0 from the static image,
> > but the immediately following target address packet will alter the
> > next address traced to address1
> > This is why we have BB.
> >
> > So if the user has a reason to engage BB - we should really fail if
> > it is not present - as the outcome of the trace can be affected.
>
> Hi Mike,
>
> Now I'm starting to wonder if it's best to have some kind of non-binary
> image mode for BB where you'd just get a list of addresses output by
> perf script and it doesn't attempt to do any lookups.
Not at all sure what you mean by this
Mike
> Although I think
> that would require a change in OpenCSD if it's not aware of the mode?
>
> I also need to go back to my JVM profiling test and see what's
> going on again. But I think I understand your points a bit more now.
>
> Thanks
> James
>
> >
> >> Suzuki
> >>
> >>
> >>> + goto out;
> >>> + } else {
> >>> + config->cfg |= BIT(ETM4_CFG_BIT_BB);
> >>> + }
> >>> + }
> >>> +
> >>> out:
> >>> return ret;
> >>> }
> >>> diff --git a/include/linux/coresight-pmu.h b/include/linux/coresight-pmu.h
> >>> index 4ac5c081af93..6c2fd6cc5a98 100644
> >>> --- a/include/linux/coresight-pmu.h
> >>> +++ b/include/linux/coresight-pmu.h
> >>> @@ -18,6 +18,7 @@
> >>> * ETMv3.5/PTM doesn't define ETMCR config bits with prefix "ETM3_" and
> >>> * directly use below macros as config bits.
> >>> */
> >>> +#define ETM_OPT_BRANCH_BROADCAST 8
> >>> #define ETM_OPT_CYCACC 12
> >>> #define ETM_OPT_CTXTID 14
> >>> #define ETM_OPT_CTXTID2 15
> >>> @@ -25,6 +26,7 @@
> >>> #define ETM_OPT_RETSTK 29
> >>>
> >>> /* ETMv4 CONFIGR programming bits for the ETM OPTs */
> >>> +#define ETM4_CFG_BIT_BB 3
> >>> #define ETM4_CFG_BIT_CYCACC 4
> >>> #define ETM4_CFG_BIT_CTXTID 6
> >>> #define ETM4_CFG_BIT_VMID 7
> >>
> >
> >
--
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK
Powered by blists - more mailing lists