Message-Id: <20170108132242.9fe3e09e97de0a29c06178b5@kernel.org>
Date: Sun, 8 Jan 2017 13:22:42 +0900
From: Masami Hiramatsu <mhiramat@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>,
Josh Poimboeuf <jpoimboe@...hat.com>,
linux-kernel@...r.kernel.org,
Ananth N Mavinakayanahalli <ananth@...ux.vnet.ibm.com>,
Thomas Gleixner <tglx@...utronix.de>,
"H . Peter Anvin" <hpa@...or.com>,
Andrey Konovalov <andreyknvl@...gle.com>,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: [PATCH tip/master v3] kprobes: extable: Identify kprobes'
insn-slots as kernel text area
On Wed, 4 Jan 2017 11:01:02 +0100
Peter Zijlstra <peterz@...radead.org> wrote:
> On Wed, Jan 04, 2017 at 02:06:04PM +0900, Masami Hiramatsu wrote:
> > On Tue, 3 Jan 2017 11:54:02 +0100
> > Peter Zijlstra <peterz@...radead.org> wrote:
>
> > > How many entries should one expect on that list? I spend quite a bit of
> > > time reducing the cost of is_module_text_address() a while back and see
> > > that both ftrace (which actually needs this to be fast) and now
> > > kprobes have linear list walks in here.
> >
> > It depends on how many probes are used and optimized. However, in most
> > cases, there should be one entry (unless user defines optimized probes
> > over 32 on x86, from my experience, it is very rare case. :) )
>
> OK, that's good :-)
OK, I'll add the above comment to the patch.
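
To illustrate why the list length matters, here is a minimal userspace sketch of a linear lookup over a list of slot pages. It is not the kernel code: `slot_page`, `slot_cache`, the fixed-size storage and the 4096-byte page size are all assumptions made up for this example. The cost of each lookup is proportional to the number of pages on the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; names and sizes are illustrative only. */
#define SLOT_PAGE_SIZE 4096u
#define MAX_SLOT_PAGES 8u

struct slot_page {
	uintptr_t base;          /* start address of one slot page */
	struct slot_page *next;  /* next page on the singly linked list */
};

struct slot_cache {
	struct slot_page *head;
	struct slot_page storage[MAX_SLOT_PAGES];
	size_t used;
};

/* Put a new page at the head of the list. */
static void slot_cache_add(struct slot_cache *c, uintptr_t base)
{
	assert(c->used < MAX_SLOT_PAGES);
	struct slot_page *p = &c->storage[c->used++];

	p->base = base;
	p->next = c->head;
	c->head = p;
}

/* Linear walk: one range check per page, so O(number of pages). */
static bool slot_cache_contains(const struct slot_cache *c, uintptr_t addr)
{
	for (const struct slot_page *p = c->head; p != NULL; p = p->next) {
		if (addr >= p->base && addr - p->base < SLOT_PAGE_SIZE)
			return true;
	}
	return false;
}
```

With one page on the list, the walk is a single range check; the cost only becomes a concern if the list grows long.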
>
> > > I'm assuming the ftrace thing to be mostly empty, since I never saw it
> > > on my benchmarks back then, but it is something Steve should look at I
> > > suppose.
> > >
> > > Similarly, the changelog here should include some talk about worst case
> > > costs.
> >
> > Would you have any good benchmark to measure it?
>
> Not trivially so; what I did was cobble together a debugfs file that
> measures the average of the PMI time in perf_sample_event_took(), and a
> module that has a 10 deep callchain around a while(1) loop. Then perf
> record with callchains for a few seconds.
>
> Generating the callchain does the unwinder thing and ends up calling
> is_kernel_address() lots.
>
> The case I worked on was 0 modules vs 100+ modules in a distro build,
> which was fairly obviously painful back then, since
> is_module_text_address() used a linear lookup.
>
> I'm not sure I still have all those bits, but I can dig around a bit if
> you're interested.
Hmm, I tried to do a similar thing on KVM (a test module with a loop that
makes a 10-deep recursive call and saves a stack trace), but got very
unstable results.
Maybe it needs to run on a bare-metal machine.
Thanks,
--
Masami Hiramatsu <mhiramat@...nel.org>