Message-ID: <alpine.DEB.2.00.0902270859100.25593@gandalf.stny.rr.com>
Date: Fri, 27 Feb 2009 09:43:18 -0500 (EST)
From: Steven Rostedt <rostedt@...dmis.org>
To: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
cc: Dominique Toupin <dominique.toupin@...csson.com>,
Mathieu Desnoyers <compudj@...stal.dyndns.org>,
Frederic Weisbecker <fweisbec@...il.com>,
Jason Baron <jbaron@...hat.com>,
Masami Hiramatsu <mhiramat@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
"Frank Ch. Eigler" <fche@...hat.com>, Ingo Molnar <mingo@...e.hu>,
LKML <linux-kernel@...r.kernel.org>,
Arnaldo Carvalho de Melo <acme@...stprotocols.net>
Subject: Re: [PATCH] new irq tracer
On Fri, 27 Feb 2009, KOSAKI Motohiro wrote:
> >
> > On Fri, 27 Feb 2009, KOSAKI Motohiro wrote:
> > > >
> > > > In many cases we don't use real-time Linux; we have many systems that
> > > > are soft-real-time, and non-real-time Linux is good enough.
> > > >
> > >
> > > Agreed, the rt-patch seems off topic; we are discussing the mainline kernel.
> >
> > The reason I brought up the rt-patch is that Mathieu is concerned about
> > the small latency that happens when we run stop_machine to switch the nops
> > to pointers to trace functions. This latency can be a couple of millisecs,
> > but I can easily make a normal kernel suffer a couple of millisec latency
> > by simply running hackbench (as a normal user).
> >
> > The point being, I think Mathieu's point about the latency of enabling the
> > function tracer is frivolous. It is a one-time hit that happens when the
> > sysadmin purposely enables function tracing. If this is unacceptable,
> > then the system had better be running the rt patch because it will be
> > hitting this unacceptable latency in normal operation.
>
> OK, I missed that earlier thread. Thanks.
>
> I think you are talking about the "ftrace, x86: make kernel text writable only
> for conversions" thread, right?
> Hmm, it seems like a very good discussion.
Yes, and Mathieu and I finally convinced each other that there are
two different issues there (we were both trying to convince the other
of the same point; sorta like two blind men describing what an
elephant looks like ;-)
  text_poke vs touching kernel mappings
  break_points vs stop_machine
The real issue came down to the latter.
>
> My concern is that both your point and Mathieu's are fairly reasonable and
> point in a good direction.
>
> However, I would bring up another viewpoint:
>
> - can the latency issue happen on an actual production server?
>
>
> Mathieu and Dominique described an ACTUAL production server problem,
> but I believe Dominique's server never runs hackbench.
>
> In other words, I agree the mainline kernel has many latency corner-case
> problems, but I don't agree that the existence of corner cases means latency
> improvements are not needed. Generally, actual field experience shows it is
> valuable to hear their requirements.
Then the production system had better not be under any load at all.
hackbench is just a trivial way to reproduce the problem, and I used it only
as an example that anyone can run. But any time you have a large number of
requests coming in and waking up many threads, you run the risk of taking a
3 or 4 ms interrupts-off latency. I've measured this while running database
applications.
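You do not need any tracing infrastructure to see this for yourself. Below is
a minimal sketch (not the database measurement above, just an illustration):
a 1 ms periodic wakeup loop that reports its worst-case overshoot. Run
hackbench next to it on a non-RT kernel and watch the number climb into the
millisecond range.

/*
 * Minimal sketch: a 1 ms periodic wakeup loop that reports its
 * worst-case overshoot.  Build (older glibc needs -lrt):
 *   gcc -O2 -o wake-lat wake-lat.c -lrt
 */
#include <stdio.h>
#include <time.h>

#define NSEC_PER_SEC	1000000000LL
#define PERIOD_NS	1000000LL	/* wake up every 1 ms */
#define LOOPS		10000

static long long ts_ns(const struct timespec *ts)
{
	return (long long)ts->tv_sec * NSEC_PER_SEC + ts->tv_nsec;
}

int main(void)
{
	struct timespec next, now;
	long long worst = 0;
	long long late;
	int i;

	clock_gettime(CLOCK_MONOTONIC, &next);
	for (i = 0; i < LOOPS; i++) {
		/* ask to be woken exactly one period from the last target */
		next.tv_nsec += PERIOD_NS;
		if (next.tv_nsec >= NSEC_PER_SEC) {
			next.tv_nsec -= NSEC_PER_SEC;
			next.tv_sec++;
		}
		clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
		clock_gettime(CLOCK_MONOTONIC, &now);

		/* how late were we actually woken up? */
		late = ts_ns(&now) - ts_ns(&next);
		if (late > worst)
			worst = late;
	}
	printf("worst wakeup latency: %lld us\n", worst / 1000);
	return 0;
}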
Using 2.6.28, I did some measurements of the function tracer's use of
stop_machine, simply by using the irqsoff tracer:
# wc -l available_filter_functions
19043
That is 19 thousand available trace points in the kernel. All of them are
nops when tracing is disabled. But activating all of them via
stop_machine can take a bit of time. It will also add a bit of overhead to
the system when tracing is active simply because you have 19 thousand
trace points!
# cd /debug/tracing
# echo irqsoff > current_tracer
# sysctl kernel.ftrace_enabled
kernel.ftrace_enabled = 1
# echo 0 > tracing_max_latency
# sysctl kernel.ftrace_enabled=0
kernel.ftrace_enabled = 0
# cat latency_trace > full-disable.txt
# echo 0 > tracing_max_latency
# sysctl kernel.ftrace_enabled=1
kernel.ftrace_enabled = 1
# cat latency_trace > full-enable.txt
# echo do_fork > set_ftrace_filter
# echo 0 > tracing_max_latency
# sysctl kernel.ftrace_enabled=0
kernel.ftrace_enabled = 0
# cat latency_trace > one-func-disable.txt
# echo 0 > tracing_max_latency
# sysctl kernel.ftrace_enabled=1
kernel.ftrace_enabled = 1
# cat latency_trace > one-func-enable.txt
# head available_filter_functions > set_ftrace_filter
# echo 0 > tracing_max_latency
# sysctl kernel.ftrace_enabled=0
kernel.ftrace_enabled = 0
# cat latency_trace > ten-func-disable.txt
# echo 0 > tracing_max_latency
# sysctl kernel.ftrace_enabled=1
kernel.ftrace_enabled = 1
# cat latency_trace > ten-func-enable.txt
# head -100 available_filter_functions > set_ftrace_filter
# echo 0 > tracing_max_latency
# sysctl kernel.ftrace_enabled=0
kernel.ftrace_enabled = 0
# cat latency_trace > hundred-func-disable.txt
# echo 0 > tracing_max_latency
# sysctl kernel.ftrace_enabled=1
kernel.ftrace_enabled = 1
# cat latency_trace > hundred-func-enable.txt
# sysctl kernel.ftrace_enabled=0
kernel.ftrace_enabled = 0
# echo 0 > tracing_max_latency
# /usr/local/bin/hackbench 50
Time: 18.693
# cat latency_trace > hackbench.txt
The above output traces and the available_filter_functions can be found
at http://rostedt.homelinux.com/traces/
The actual measured latencies are up in the top left corner of each trace.
The trace itself might be a little off because it keeps recording after the
latency has been captured, so you might see a few extra microsecs in the
trace output.
Here are the latencies:
one-disable: 564us
one-enable: 567us
ten-disable: 547us
ten-enable: 549us
Funny how enabling 10 took less time than enabling just one? Perhaps cache effects.
hundred-disable: 580us
hundred-enable: 581us
full-disable: (19043) 9352us
full-enable: (19043) 19081us
hackbench: 3859us
For some reason, full enabling always seems to take longer than full
disabling. Not sure why.
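Rough arithmetic from the numbers above (nothing I measured separately): the
one, ten and hundred site runs suggest a fixed stop_machine cost of roughly
550us, and the full enable then works out to about (19081 - 550) / 19043,
i.e. roughly 1us of irqs-off time per patched site. The small-count runs are
dominated by the fixed cost, which is why 1, 10 and 100 sites all land around
half a millisec.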
Now let's look at this data. You can enable 100 different call sites in
just over half a millisec. The latency that Mathieu is complaining
about is just half a millisec. I can guarantee you that a production
system doing any kind of work will hit that latency with any non-RT
Linux kernel.
Yes, you could say, "19ms! That is a lot!" Well, then do not enable all
19 thousand trace points at once. Just having them all active will add a
noticeable bit of overhead anyway. Mathieu pointed out that I said I would
not have this done on an active production system. He is right. Although if
something is going wrong with the production system, it may be worth taking
this hit to analyze the situation. Or even make it a periodic thing during
off hours, with a notice to customers that analysis of the system is being
performed from 2am to 3am EST and that some degradation in performance is to
be expected. You can do this. It is a controlled environment.
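And if you would rather script that kind of controlled run than type it at a
shell, the measurement cycle above can be driven from a small C program. A
rough sketch; the paths are assumptions (debugfs mounted at /debug as in my
transcript, newer setups usually have it under /sys/kernel/debug):

/*
 * Rough sketch of one measurement cycle from the transcript above,
 * driven from C instead of the shell.  Run as root; adjust TRACE_DIR
 * if debugfs is mounted somewhere else.
 */
#include <stdio.h>
#include <stdlib.h>

#define TRACE_DIR "/debug/tracing"

static void write_file(const char *path, const char *val)
{
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		exit(1);
	}
	fputs(val, f);
	fclose(f);
}

int main(void)
{
	char buf[64];
	FILE *f;

	/* trace max irqs-off sections, filtered to a single function */
	write_file(TRACE_DIR "/current_tracer", "irqsoff\n");
	write_file(TRACE_DIR "/set_ftrace_filter", "do_fork\n");
	write_file(TRACE_DIR "/tracing_max_latency", "0\n");

	/* toggling ftrace_enabled is what triggers the code patching */
	write_file("/proc/sys/kernel/ftrace_enabled", "0\n");
	write_file("/proc/sys/kernel/ftrace_enabled", "1\n");

	/* worst irqs-off latency (in usecs) seen since the reset */
	f = fopen(TRACE_DIR "/tracing_max_latency", "r");
	if (f && fgets(buf, sizeof(buf), f))
		printf("tracing_max_latency: %s", buf);
	if (f)
		fclose(f);
	return 0;
}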
>
> HOWEVER, I also still agree with your dislike of the messiness.
> So I would suggest we go on to the next step.
>
> I mean, make a patch implementing Mathieu's plan and measure it. Then
> we can compare numbers against numbers. By numbers I mean:
>
> (1) how many lines it adds and how much messiness it introduces
> (2) how much it improves latency
>
> That's a fair comparison and typical lkml discussion style, imho.
>
> What do you think?
The problem with the above is that to replace stop_machine, I would need
to implement the breakpoint method. That is by no means a trivial change.
It may take weeks to get right. And for what? To get rid of a half-millisec
latency that can happen at any time during normal operation anyway?
Note, the stop_machine method can be ported to other archs trivially.
David Miller ported it to Sparc64 in 20 minutes. A breakpoint scheme
would take a lot more development by each arch maintainer, and risks adding
many more bugs.
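For the curious, the breakpoint method goes roughly like this: put an int3 at
the first byte of the call site so any CPU that wanders through it traps (and
the trap handler just skips over the site), rewrite the remaining bytes, then
install the real first byte, with all CPUs forced to serialize between each
step. A userspace illustration of just that byte ordering, with a plain
buffer standing in for kernel text and the cross-CPU syncs stubbed out:

/*
 * Userspace illustration of the breakpoint-patching order only; the real
 * thing works on live kernel text, needs an int3 handler that skips the
 * site, and IPIs/core syncs between the steps (stubbed out here).
 */
#include <stdio.h>
#include <string.h>

#define INT3 0xcc

/* stand-in for a 5-byte call site that is currently a nop */
static unsigned char site[5] = { 0x90, 0x90, 0x90, 0x90, 0x90 };

static void sync_all_cpus(void)
{
	/* placeholder: every CPU must serialize before the next step */
}

static void patch_site(unsigned char *addr, const unsigned char *newcode, int len)
{
	/* 1) trap anyone executing the site while it is inconsistent */
	addr[0] = INT3;
	sync_all_cpus();

	/* 2) rewrite everything after the first byte */
	memcpy(addr + 1, newcode + 1, len - 1);
	sync_all_cpus();

	/* 3) finally install the real first byte of the new instruction */
	addr[0] = newcode[0];
	sync_all_cpus();
}

int main(void)
{
	/* e8 xx xx xx xx: relative call to the tracer (offset faked) */
	const unsigned char call[5] = { 0xe8, 0x12, 0x34, 0x56, 0x78 };
	int i;

	patch_site(site, call, sizeof(site));
	for (i = 0; i < (int)sizeof(site); i++)
		printf("%02x ", site[i]);
	printf("\n");
	return 0;
}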
-- Steve