Message-ID: <20100512174840.GA32496@Krystal>
Date: Wed, 12 May 2010 13:48:40 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Masami Hiramatsu <mhiramat@...hat.com>
Cc: Ingo Molnar <mingo@...e.hu>, lkml <linux-kernel@...r.kernel.org>,
systemtap <systemtap@...rces.redhat.com>,
DLE <dle-develop@...ts.sourceforge.net>,
Ananth N Mavinakayanahalli <ananth@...ibm.com>,
Jim Keniston <jkenisto@...ibm.com>,
Jason Baron <jbaron@...hat.com>
Subject: Re: [PATCH -tip 4/5] kprobes/x86: Use text_poke_smp_batch
* Masami Hiramatsu (mhiramat@...hat.com) wrote:
> Mathieu Desnoyers wrote:
> > * Masami Hiramatsu (mhiramat@...hat.com) wrote:
> >> Mathieu Desnoyers wrote:
> >>> * Masami Hiramatsu (mhiramat@...hat.com) wrote:
> >>>> Use text_poke_smp_batch() in optimization path for reducing
> >>>> the number of stop_machine() issues.
> >>>>
> >>>> Signed-off-by: Masami Hiramatsu <mhiramat@...hat.com>
> >>>> Cc: Ananth N Mavinakayanahalli <ananth@...ibm.com>
> >>>> Cc: Ingo Molnar <mingo@...e.hu>
> >>>> Cc: Jim Keniston <jkenisto@...ibm.com>
> >>>> Cc: Jason Baron <jbaron@...hat.com>
> >>>> Cc: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
> >>>> ---
> >>>>
> >>>> arch/x86/kernel/kprobes.c | 37 ++++++++++++++++++++++++++++++-------
> >>>> include/linux/kprobes.h | 2 +-
> >>>> kernel/kprobes.c | 13 +------------
> >>>> 3 files changed, 32 insertions(+), 20 deletions(-)
> >>>>
> >>>> diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
> >>>> index 345a4b1..63a5c24 100644
> >>>> --- a/arch/x86/kernel/kprobes.c
> >>>> +++ b/arch/x86/kernel/kprobes.c
> >>>> @@ -1385,10 +1385,14 @@ int __kprobes arch_prepare_optimized_kprobe(struct optimized_kprobe *op)
> >>>> return 0;
> >>>> }
> >>>>
> >>>> -/* Replace a breakpoint (int3) with a relative jump. */
> >>>> -int __kprobes arch_optimize_kprobe(struct optimized_kprobe *op)
> >>>> +#define MAX_OPTIMIZE_PROBES 256
> >>>
> >>> So what kind of interrupt latency does a 256-probe batch generate on the
> >>> system ? Are we talking about a few milliseconds, a few seconds ?
> >>
> >> From my experiment on kvm/4cpu, it took about 3 seconds on average.
> >
> > That's 3 seconds for multiple calls to stop_machine(). So we can expect
> > latencies in the area of a few microseconds for each call, right ?
>
> Theoretically, yes.
> But if we register more than 1000 probes at once, the system can hardly do
> anything except optimizing for a while (more than 10 sec), because
> it stops the machine so frequently.
>
> >> With this patch, it went down to 30ms. (x100 faster :))
> >
> > This is beefing up the latency from a few microseconds to 30ms. It sounds
> > like a regression rather than a gain to me.
>
> If that is not acceptable, I can add a knob to control how many probes are
> optimized/unoptimized at once. In any case, the latency is predictable (it
> occurs right after registering/unregistering probes) and it will be small if
> we put only a few probes (30ms is the worst case).
> And if you want, it can be disabled by sysctl.
I think we are starting to see that the stop_machine() approach is really
limiting our ability to do even relatively small amounts of work without
hurting responsiveness significantly.

What's the current showstopper with the breakpoint-bypass-IPI approach that
solves this issue properly and makes this batching approach unnecessary ?
Thanks,
Mathieu
>
> Thank you,
>
> --
> Masami Hiramatsu
> e-mail: mhiramat@...hat.com
--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com