Message-ID: <20210124000526.GE138414@krava>
Date: Sun, 24 Jan 2021 01:05:26 +0100
From: Jiri Olsa <jolsa@...hat.com>
To: Alexandre Truong <alexandre.truong@....com>
Cc: linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
John Garry <john.garry@...wei.com>,
Will Deacon <will@...nel.org>,
Mathieu Poirier <mathieu.poirier@...aro.org>,
Leo Yan <leo.yan@...aro.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Namhyung Kim <namhyung@...nel.org>,
Kemeng Shi <shikemeng@...wei.com>,
Ian Rogers <irogers@...gle.com>,
Andi Kleen <ak@...ux.intel.com>,
Kan Liang <kan.liang@...ux.intel.com>,
Jin Yao <yao.jin@...ux.intel.com>,
Adrian Hunter <adrian.hunter@...el.com>,
Suzuki K Poulose <suzuki.poulose@....com>,
Al Grant <al.grant@....com>, James Clark <james.clark@....com>,
Wilco Dijkstra <wilco.dijkstra@....com>
Subject: Re: [PATCH 4/4] perf tools: determine if LR is the return address
On Fri, Jan 22, 2021 at 04:18:54PM +0000, Alexandre Truong wrote:
> On arm64 in frame pointer mode (e.g. perf record --call-graph fp),
> use DWARF unwind info to check whether the link register holds the
> return address, so that it can be injected into the frame pointer stack.
>
> Write the following application:
>
> int a = 10;
>
> void f2(void)
> {
> 	for (int i = 0; i < 1000000; i++)
> 		a *= a;
> }
>
> void f1()
> {
> 	f2();
> }
>
> int main (void)
> {
> 	f1();
> 	return 0;
> }
>
> with the following compilation flags:
> gcc -g -fno-omit-frame-pointer -fno-inline -O1
>
> The compiler omits the frame pointer for f2 on arm. This is a problem
> with any leaf call: for example, an application with many different
> calls to malloc() would always omit the calling frame, even if it
> can be determined.
>
> ./perf record --call-graph fp ./a.out
> ./perf report
>
> currently gives the following stack:
>
> 0xffffea52f361
> _start
> __libc_start_main
> main
> f2
reproduced on x86 as well
> +static bool get_leaf_frame_caller_enabled(struct perf_sample *sample)
> +{
> +	return callchain_param.record_mode != CALLCHAIN_FP || !sample->user_regs.regs
> +		|| sample->user_regs.mask != PERF_REGS_MASK;
> +}
> +
> +static int add_entry(struct unwind_entry *entry, void *arg)
> +{
> +	struct entries *entries = arg;
> +
> +	entries->stack[entries->i++] = entry->ip;
> +	return 0;
> +}
> +
> +u64 get_leaf_frame_caller_aarch64(struct perf_sample *sample, struct thread *thread)
> +{
> +	u64 leaf_frame;
> +	struct entries entries = {{0, 0}, 0};
> +
> +	if (get_leaf_frame_caller_enabled(sample))
the name suggests you'd want to continue if it's true
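For illustration, a minimal sketch of the same predicate with the sense flipped, so that the name matches what it returns. The types and constants below are simplified stand-ins (hypothetical, not the real perf definitions), and the record mode is passed in as an argument instead of being read from callchain_param:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the perf types; hypothetical, illustration only. */
enum { CALLCHAIN_FP = 1, CALLCHAIN_DWARF = 2 };
#define SKETCH_REGS_MASK 0x1ffffffffULL	/* placeholder for PERF_REGS_MASK */

struct sketch_regs {
	uint64_t mask;
	const uint64_t *regs;
};

struct sketch_sample {
	struct sketch_regs user_regs;
};

/* Returns true when the leaf frame caller lookup can actually run. */
static bool leaf_frame_caller_enabled(int record_mode,
				      const struct sketch_sample *sample)
{
	return record_mode == CALLCHAIN_FP &&
	       sample->user_regs.regs != NULL &&
	       sample->user_regs.mask == SKETCH_REGS_MASK;
}
```

The caller would then bail out with "if (!leaf_frame_caller_enabled(...)) return 0;".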
> +		return 0;
> +
> +	unwind__get_entries(add_entry, &entries, thread, sample, 2);
I'm scratching my head over how this unwinds anything: you enabled just
the registers, not the stack, right? so the unwind code would do just
the IP -> LR + 1 shift?
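To make the question concrete, here is a sketch of what the final comparison in the patch boils down to. It assumes the unwinder reports caller frames as the return address minus one, which would explain the "+ 1" in the patch. The helper name is hypothetical, and the unwound entry is passed in as a plain value rather than obtained from unwind__get_entries():

```c
#include <stdint.h>

/*
 * Hypothetical helper: given the caller entry reported by the unwinder
 * and the sampled link register, return LR if it is the return address
 * of the leaf frame, otherwise 0.
 *
 * Assumption: the unwinder reports caller frames as (return address - 1),
 * hence the comparison against unwound_caller_ip + 1.
 */
static uint64_t leaf_caller_from_lr(uint64_t unwound_caller_ip, uint64_t lr)
{
	if (unwound_caller_ip + 1 == lr)
		return lr;
	return 0;
}
```

If that reading is right, the check only fails when the unwind info places the return address somewhere other than LR, or when the unwind cannot complete without stack data.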
thanks,
jirka
> +	leaf_frame = callchain_param.order == ORDER_CALLER ?
> +		entries.stack[0] : entries.stack[1];
> +
> +	if (leaf_frame + 1 == sample->user_regs.regs[PERF_REG_ARM64_LR])
> +		return sample->user_regs.regs[PERF_REG_ARM64_LR];
> +	return 0;
> +}
SNIP