[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CACPcB9cuvNt-+gw8_LzqDLZ-nZ+G_zLcSv6nftNONrYxVRx=wg@mail.gmail.com>
Date: Wed, 17 Apr 2019 22:44:33 +0800
From: Kairui Song <kasong@...hat.com>
To: Josh Poimboeuf <jpoimboe@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Namhyung Kim <namhyung@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Borislav Petkov <bp@...en8.de>, Dave Young <dyoung@...hat.com>
Subject: Re: [RFC PATCH v2] perf/x86: make perf callchain work without CONFIG_FRAME_POINTER
On Wed, Apr 17, 2019 at 4:16 AM Josh Poimboeuf <jpoimboe@...hat.com> wrote:
>
> On Wed, Apr 17, 2019 at 01:39:19AM +0800, Kairui Song wrote:
> > On Tue, Apr 16, 2019 at 7:30 PM Kairui Song <kasong@...hat.com> wrote:
> > >
> > > On Tue, Apr 16, 2019 at 12:59 AM Josh Poimboeuf <jpoimboe@...hat.com> wrote:
> > > >
> > > > On Mon, Apr 15, 2019 at 05:36:22PM +0200, Peter Zijlstra wrote:
> > > > >
> > > > > I'll mostly defer to Josh on unwinding, but a few comments below.
> > > > >
> > > > > On Tue, Apr 09, 2019 at 12:59:42AM +0800, Kairui Song wrote:
> > > > > > diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> > > > > > index e2b1447192a8..6075a4f94376 100644
> > > > > > --- a/arch/x86/events/core.c
> > > > > > +++ b/arch/x86/events/core.c
> > > > > > @@ -2355,6 +2355,12 @@ void arch_perf_update_userpage(struct perf_event *event,
> > > > > > cyc2ns_read_end();
> > > > > > }
> > > > > >
> > > > > > +static inline int
> > > > > > +valid_perf_registers(struct pt_regs *regs)
> > > > > > +{
> > > > > > + return (regs->ip && regs->bp && regs->sp);
> > > > > > +}
> > > > >
> > > > > I'm unconvinced by this, with both guess and orc having !bp is perfectly
> > > > > valid.
> > > > >
> > > > > > void
> > > > > > perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *regs)
> > > > > > {
> > > > > > @@ -2366,11 +2372,17 @@ perf_callchain_kernel(struct perf_callchain_entry_ctx *entry, struct pt_regs *re
> > > > > > return;
> > > > > > }
> > > > > >
> > > > > > - if (perf_callchain_store(entry, regs->ip))
> > > > > > + if (valid_perf_registers(regs)) {
> > > > > > + if (perf_callchain_store(entry, regs->ip))
> > > > > > + return;
> > > > > > + unwind_start(&state, current, regs, NULL);
> > > > > > + } else if (regs->sp) {
> > > > > > + unwind_start(&state, current, NULL, (unsigned long *)regs->sp);
> > > > > > + } else {
> > > > > > return;
> > > > > > + }
> > > > >
> > > > > AFAICT if we, by pure accident, end up with !bp for ORC, then we
> > > > > initialize the unwind wrong.
> > > > >
> > > > > Note that @regs is mostly trivially correct, except for that tracepoint
> > > > > case. So I don't think we should magic here.
> > > >
> > > > Ah, I didn't quite understand this code before, and I still don't
> > > > really, but I guess the issue is that @regs can be either real or fake.
> > > >
> > > > In the real @regs case, we just want to always unwind starting from
> > > > regs->sp.
> > > >
> > > > But in the fake @regs case, we should instead unwind from the current
> > > > frame, skipping all frames until we hit the fake regs->sp. Because
> > > > starting from fake/incomplete regs is most likely going to cause
> > > > problems with ORC (or DWARF for other arches).
> > > >
> > > > The idea of a fake regs is fragile and confusing. Is it possible to
> > > > just pass in the "skip" stack pointer directly instead? That should
> > > > work for both FP and non-FP. And I _think_ there's no need to ever
> > > > capture regs->bp anyway -- the stack pointer should be sufficient.
> > >
> > > Hi, that will break some other usage, if perf_callchain_kernel is
> > > called but it won't unwind to the callsite (could be produced by
> > > attach an ebpf call to kprobe), things will also go wrong. It should
> > > start with given registers when the register is valid.
> > > And it's true with omit frame pointer BP value could be anything, so 0
> > > is also valid, I think I need to find a better way to tell if we could
> > > start with the registers value or direct start unwinding and skip
> > > until got the stack.
> > >
> >
> > Hi, sorry I might have some misunderstanding. Adding an extra argument
> > (eg. skip_sp) to indicate if it should just unwind from the current
> > frame, and use SP as the "skip mark", should work well.
> >
> > And I also think the "fake"/"real" reg is fragile, could we abuse
> > another eflag (just like PERF_EFLAGS_EXACT) to indicate the regs are
> > partially dumped fake registers? So perf_callchain_kernel just check
> > if it's a "partial registers", and in such case it can start unwinding
> > and skip until it get to SP. This make it easier to tell if the
> > registers are "fake".
>
> If you do the regs->eflags thing to mark the regs as fake in
> (perf_arch_fetch_caller_regs()), then I don't think skip_sp would be
> needed, because regs->sp could probably mark the skip point.
>
> Instead I was actually hoping we could get rid of fake regs and
> perf_arch_fetch_caller_regs() altogether, because it's a nasty hack.
> But I don't know what else those fake regs are used for.
>
Despite it's hacky, it seems not necessary to dump every register. And
is there a straight way to get caller's regs in the trace point? It
seems more trouble some. Or if we just use the regs inside the
tracepoint, but it would need even more workaround (eg. unwind one
frame before do anything).
--
Best Regards,
Kairui Song
Powered by blists - more mailing lists