Message-ID: <20191104142654.GA24609@willie-the-truck>
Date: Mon, 4 Nov 2019 14:26:54 +0000
From: Will Deacon <will@...nel.org>
To: Shaokun Zhang <zhangshaokun@...ilicon.com>
Cc: linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
Jiri Olsa <jolsa@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>, liuqi115@...ilicon.com,
huangdaode@...ilicon.com, john.garry@...wei.com,
Jonathan Cameron <Jonathan.Cameron@...wei.com>
Subject: Re: [RFC] About perf-mem command support on arm64 platform
On Mon, Nov 04, 2019 at 05:18:00PM +0800, Shaokun Zhang wrote:
> perf-mem is used to profile memory accesses and has been implemented on the
> x86 platform. It needs the mem-stores event and mem-loads/load-latency.
> The mem-stores event is MEM_INST_RETIRED_ALL_STORES, whose raw number
> is r82d0, and mem-loads/load-latency comes from PEBS, if I follow its code.
>
> Now, some arm64 cores, like HiSilicon's tsv110 and ARM's Neoverse N1,
> support SPE (the Statistical Profiling Extension), so is it possible
> for perf-mem to be supported on arm64?
> https://developer.arm.com/ip-products/processors/neoverse/neoverse-n1
I don't understand the relationship you're trying to draw between mem-stores
and SPE. How does perf-mem work and what does it actually require from the
CPU?
One thing that may be worth noting is that SPE isn't generally able to
capture information about all instructions being executed by the CPU:
instead, instructions (most likely micro-ops) are sampled based on
some user-specified period. The CPU advertises a minimum recommended
period which we expose under /sys and enforce when programming events.
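
To make the minimum-period point concrete, here is a minimal sketch of reading the advertised minimum sampling interval and clamping a requested period to it. The sysfs path and the "arm_spe_0" instance name are assumptions that vary by system, and the helper names are hypothetical; the clamp only illustrates the idea of enforcing a floor, not the driver's exact logic.

```python
from pathlib import Path

# Assumed sysfs location; the PMU instance name ("arm_spe_0") varies by system.
SPE_MIN_INTERVAL = Path(
    "/sys/bus/event_source/devices/arm_spe_0/caps/min_interval"
)


def read_min_interval(path=SPE_MIN_INTERVAL):
    """Return the advertised minimum sampling interval, or None if unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # No SPE PMU on this machine, or the file content was not an integer.
        return None


def clamp_period(requested, min_interval):
    """Raise the requested period to the minimum interval when it is too small."""
    if min_interval is None:
        return requested
    return max(requested, min_interval)


if __name__ == "__main__":
    floor = read_min_interval()
    print("min_interval:", floor)
    print("effective period for a request of 100:", clamp_period(100, floor))
```

On a machine without SPE, read_min_interval() returns None and the requested period is passed through unchanged.
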
> The arm64 PMU has the 'st_retired' event, whose event number is 0x0007
> and which is equivalent to mem-stores on x86. If we want to support perf-mem,
> it seems that 'st_retired' would have to be replaced by 'mem-stores'
> in arch/arm64/kernel/perf_event.c. Of course, the CPU core must
> support the st_retired event. I'm not sure Will/Mark are happy with this. ;-)
>
> For mem-loads/load-latency, we can derive them from SPE sample data, which
> the SPE driver supports via load_filter and min_latency, and we may need to
> do some work in tools/perf/builtin-mem.c.
I don't see how you could reconcile the sampling nature of SPE with a
CPU PMU counter, particularly as filtering in SPE happens /after/ sampling.
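
A toy model of that mismatch, as a sketch only: it assumes a fixed sampling period with no jitter and a made-up instruction stream, and every name in it is hypothetical. The period counts all operations, and the load filter is applied only to operations that were already sampled, so the number of records is not tied to the number of loads in the way a counter value would be.

```python
def count_events(ops, kind):
    """What a PMU-style counter sees: every matching operation is counted."""
    return sum(1 for op in ops if op == kind)


def sample_then_filter(ops, period, kind):
    """SPE-like model: pick every period-th operation, then filter the picks."""
    sampled = ops[::period]  # sampling happens first, over all operations
    return [op for op in sampled if op == kind]  # filtering happens afterwards


if __name__ == "__main__":
    # Hypothetical stream: one load in every four operations.
    ops = ["load", "alu", "alu", "store"] * 250
    period = 3

    exact_loads = count_events(ops, "load")
    records = sample_then_filter(ops, period, "load")

    print("loads counted exactly:", exact_loads)
    print("load records after sample-then-filter:", len(records))
    print("scaled estimate:", len(records) * period)
```

In this model the scaled estimate is only an approximation of the exact count, and how close it lands depends on how the period interacts with the instruction mix.
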
> From the above conditions, it seems that we may have the opportunity to
> support the perf-mem command on arm64.
> I'm not very sure about it, so I'm sending this RFC; any comments are welcome.
I don't think there's enough information here to comment meaningfully beyond
saying that SPE != PEBS.
Will