[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20200326105805.0723cd10325ad301de061743@kernel.org>
Date: Thu, 26 Mar 2020 10:58:05 +0900
From: Masami Hiramatsu <mhiramat@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Adrian Hunter <adrian.hunter@...el.com>,
Ingo Molnar <mingo@...hat.com>,
Masami Hiramatsu <mhiramat@...nel.org>,
Steven Rostedt <rostedt@...dmis.org>,
Borislav Petkov <bp@...en8.de>,
"H . Peter Anvin" <hpa@...or.com>, x86@...nel.org,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Mathieu Poirier <mathieu.poirier@...aro.org>,
Leo Yan <leo.yan@...aro.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Jiri Olsa <jolsa@...hat.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH V4 05/13] perf/x86: Add perf text poke events for
kprobes
On Tue, 24 Mar 2020 13:21:50 +0100
Peter Zijlstra <peterz@...radead.org> wrote:
> We optimize only after having already installed a regular probe, that
> is, what we're actually doing here is replacing INT3 with a JMP.d32. But
> the above will make it appear as if we're replacing the original text
> with a JMP.d32. Which doesn't make sense, since we've already poked an
> INT3 there and that poke will have had a corresponding
> perf_event_text_poke(), right? (except you didn't, see below)
>
> At this point we'll already have constructed the optprobe trampoline,
> which contains however much of the original instruction (in whole) as
> will be overwritten by our 5 byte JMP.d32. And IIUC, we'll have a
> perf_event_text_poke() event for the whole of that already -- except I
> can't find that in the patches (again, see below).
Thanks Peter to point it out.
>
> > @@ -454,9 +463,16 @@ void arch_optimize_kprobes(struct list_head *oplist)
> > */
> > void arch_unoptimize_kprobe(struct optimized_kprobe *op)
> > {
> > + u8 old[POKE_MAX_OPCODE_SIZE];
> > + u8 new[POKE_MAX_OPCODE_SIZE] = { op->kp.opcode, };
> > + size_t len = INT3_INSN_SIZE + DISP32_SIZE;
> > +
> > + memcpy(old, op->kp.addr, len);
> > arch_arm_kprobe(&op->kp);
> > text_poke(op->kp.addr + INT3_INSN_SIZE,
> > op->optinsn.copied_insn, DISP32_SIZE);
> > + memcpy(new + INT3_INSN_SIZE, op->optinsn.copied_insn, DISP32_SIZE);
>
> And then this is 'wrong' too. You've not written the original
> instruction, you've just written an INT3.
>
> > + perf_event_text_poke(op->kp.addr, old, len, new, len);
> > text_poke_sync();
> > }
>
>
> So how about something like the below, with it you'll get 6 text_poke
> events:
>
> 1: old0 -> INT3
>
> // kprobe active
>
> 2: NULL -> optprobe_trampoline
> 3: INT3,old1,old2,old3,old4 -> JMP32
>
> // optprobe active
>
> 4: JMP32 -> INT3,old1,old2,old3,old4
> 5: optprobe_trampoline -> NULL
>
> // kprobe active
>
> 6: INT3 -> old0
>
>
>
> Masami, did I get this all right?
Yes, you understand correctly. And there is also boosted kprobe
which runs probe.ainsn.insn directly and jump back to old place.
I guess it will also disturb Intel PT.
0: NULL -> probe.ainsn.insn (if ainsn.boostable && !kp.post_handler)
> 1: old0 -> INT3
>
// boosted kprobe active
>
> 2: NULL -> optprobe_trampoline
> 3: INT3,old1,old2,old3,old4 -> JMP32
>
> // optprobe active
>
> 4: JMP32 -> INT3,old1,old2,old3,old4
// optprobe disabled and kprobe active (this sometimes goes back to 3)
> 5: optprobe_trampoline -> NULL
>
// boosted kprobe active
>
> 6: INT3 -> old0
7: probe.ainsn.insn -> NULL (if ainsn.boostable && !kp.post_handler)
So you'll get 8 events in max.
Adrian, would you also need to trace the buffer which is used for
single stepping? If so, as you did, we need to trace p->ainsn.insn
always.
Thank you,
--
Masami Hiramatsu <mhiramat@...nel.org>
Powered by blists - more mailing lists