Message-ID: <4BEC4DE5.1020101@redhat.com>
Date: Thu, 13 May 2010 15:07:17 -0400
From: Masami Hiramatsu <mhiramat@...hat.com>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
CC: Ingo Molnar <mingo@...e.hu>, lkml <linux-kernel@...r.kernel.org>,
systemtap <systemtap@...rces.redhat.com>,
DLE <dle-develop@...ts.sourceforge.net>,
Ananth N Mavinakayanahalli <ananth@...ibm.com>,
Jim Keniston <jkenisto@...ibm.com>,
Jason Baron <jbaron@...hat.com>
Subject: Re: [PATCH -tip 4/5] kprobes/x86: Use text_poke_smp_batch

Mathieu Desnoyers wrote:
> * Masami Hiramatsu (mhiramat@...hat.com) wrote:
>> Mathieu Desnoyers wrote:
>>> * Masami Hiramatsu (mhiramat@...hat.com) wrote:
>>>> Use text_poke_smp_batch() in optimization path for reducing
>>>> the number of stop_machine() issues.
>>>>
>>>> Signed-off-by: Masami Hiramatsu <mhiramat@...hat.com>
>>>> Cc: Ananth N Mavinakayanahalli <ananth@...ibm.com>
>>>> Cc: Ingo Molnar <mingo@...e.hu>
>>>> Cc: Jim Keniston <jkenisto@...ibm.com>
>>>> Cc: Jason Baron <jbaron@...hat.com>
>>>> Cc: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
>>>> ---
>>>>
>>>> arch/x86/kernel/kprobes.c | 37 ++++++++++++++++++++++++++++++-------
>>>> include/linux/kprobes.h | 2 +-
>>>> kernel/kprobes.c | 13 +------------
>>>> 3 files changed, 32 insertions(+), 20 deletions(-)
>>>>
>>>> diff --git a/arch/x86/kernel/kprobes.c b/arch/x86/kernel/kprobes.c
>>>> index 345a4b1..63a5c24 100644
>>>> --- a/arch/x86/kernel/kprobes.c
>>>> +++ b/arch/x86/kernel/kprobes.c
>>>> @@ -1385,10 +1385,14 @@ int __kprobes arch_prepare_optimized_kprobe(struct optimized_kprobe *op)
>>>> return 0;
>>>> }
>>>>
>>>> -/* Replace a breakpoint (int3) with a relative jump. */
>>>> -int __kprobes arch_optimize_kprobe(struct optimized_kprobe *op)
>>>> +#define MAX_OPTIMIZE_PROBES 256
>>>
>>> So what kind of interrupt latency does a 256-probes batch generate on the
>>> system ? Are we talking about a few milliseconds, a few seconds ?
>>
>> From my experiment on kvm/4cpu, it took about 3 seconds in average.
>
> That's 3 seconds for multiple calls to stop_machine(). So we can expect
> latencies in the area of few microseconds for each call, right ?
Sorry, my bad. An untuned kvm guest is just that slow...
I measured it again on a *bare-metal machine* (4-core Xeon 2.33GHz, 4 cpus).
Even without this patch, optimizing 256 probes took 770us on
average (min 150us, max 3.3ms).
With this patch, it went down to 90us on average (min 14us, max 324us!).
Isn't that low enough latency? :)
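
To illustrate why batching helps, here is a rough user-space sketch, not the
kernel code. It only counts how many global rendezvous are needed when sites
are patched one at a time versus in batches. Everything in it (struct
poke_req, fake_stop_machine(), poke_one_by_one(), poke_batched()) is a
hypothetical name made up for this example; only the batch limit of 256
comes from MAX_OPTIMIZE_PROBES in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one patch site: where to write, and what. */
struct poke_req {
	unsigned char *addr;
	unsigned char opcode;
};

/* Number of simulated stop_machine()-style rendezvous so far. */
static unsigned long rendezvous_count;

/* Simulated rendezvous: apply n pokes while "all CPUs are stopped". */
static void fake_stop_machine(struct poke_req *reqs, size_t n)
{
	rendezvous_count++;
	for (size_t i = 0; i < n; i++)
		*reqs[i].addr = reqs[i].opcode;
}

/* Unbatched path: one rendezvous per patch site. */
static void poke_one_by_one(struct poke_req *reqs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		fake_stop_machine(&reqs[i], 1);
}

/* Batched path: one rendezvous per chunk of at most max_batch sites. */
static void poke_batched(struct poke_req *reqs, size_t n, size_t max_batch)
{
	assert(max_batch > 0);
	while (n > 0) {
		size_t chunk = n < max_batch ? n : max_batch;

		fake_stop_machine(reqs, chunk);
		reqs += chunk;
		n -= chunk;
	}
}
```

With 256 sites, poke_one_by_one() pays for 256 rendezvous, while
poke_batched() with max_batch = 256 pays for one. The sketch says nothing
about how long each rendezvous takes; the timings above are measurements,
not something this model predicts.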
>> With this patch, it went down to 30ms. (x100 faster :))
>
> This is beefing up the latency from few microseconds to 30ms. It sounds like a
> regression rather than a gain to me.
So on bare metal it takes just 90us, not 30ms. I hope that is acceptable.
Thank you,
--
Masami Hiramatsu
e-mail: mhiramat@...hat.com