[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.00.0902231236340.18221@gandalf.stny.rr.com>
Date: Mon, 23 Feb 2009 13:17:43 -0500 (EST)
From: Steven Rostedt <rostedt@...dmis.org>
To: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
cc: Andi Kleen <andi@...stfloor.org>, linux-kernel@...r.kernel.org,
Ingo Molnar <mingo@...e.hu>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
Frederic Weisbecker <fweisbec@...il.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Arjan van de Ven <arjan@...radead.org>,
Rusty Russell <rusty@...tcorp.com.au>,
"H. Peter Anvin" <hpa@...or.com>,
Steven Rostedt <srostedt@...hat.com>
Subject: Re: [PATCH 4/6] ftrace, x86: make kernel text writable only for
conversions
On Mon, 23 Feb 2009, Mathieu Desnoyers wrote:
> >
> > Hmm, lets see. I simply set a bit in the PTE mappings. There's not many,
> > since a lot are 2M pages, for x86_64. Call stop_machine, and now I can
> > modify 1 or 20,000 locations. Set the PTE bit back. Note, the changing of
> > the bits are only done when CONFIG_DEBUG_RODATA is set.
> >
> > text_poke requires allocating a page. Map the page into memory. Set up a
> > break point.
>
> text_poke does not _require_ a break point. text_poke can work with
> stop_machine.
It can? Doesn't text_poke require allocating pages? The code called by
stop_machine is all atomic. vmap does not give an option to allocate with
GFP_ATOMIC.
> There are two different problems here :
I agree that they are two different problems. The reason I relate them is
because text_poke can not be called from a stop_machine call.
>
> - How you deal with concurrency
> - you use stop machine
> - I use breakpoints
> - How you deal with RO page mappings
> - you change the kernel page flags
> - i use text_poke
>
> Please don't mix those separate concerns.
So you have two different concerns. One is that I use stop_machine,
instead of break points, the other is that I modify all kernel text to
make the change.
Lets look at them separately.
The stop_machine vs. break points.
breakpoints is a cool trick, but is not implemented on all the archs that
dynamic ftrace is.
break points are performed on a running system. This may be lower in
latency tracing when the tracer is started, but can create a large number
of variables that can not all be understood.
stop_machine is quite simple. No need to take traps, no need to handle
what to do when another process runs the code being changed.
When making the hooks, stop_machine can add a bit of a interrupt latency.
But this is only when the hooks are added or removed. Why is this such a
big deal? It is much easier to add the hooks with tracing disabled (via
a simple toggle bit). Then start and stop your tracing by using the toggle
bit. After you are all done, then remove the hooks. Or just keep them
on since they are low overhead anyway (only a few hooks right?)
CONFIG_DEBUG_RODATA (only an x86 issue at the moment)
text_poke vs changing all pages:
You said this is a separate issue than stop_machine. But that is not the
case. text_poke can not be done in an atomic section. This removes it from
being used by stop_machine.
As you said, text_poke only handles the RO/RW issue, not the modifying of
code on the fly. Thus, keeping stop_machine around, we must also not use
text_poke.
I guess this takes the text_poke vs changing all pages out of the
question. While stop_machine is still being used, we can not use
text_poke (without rewriting it).
Also when we want to trace all functions, is it really necessary to vmap
each one at a time? Andi suggested that we could optimise by mapping
larger pages, and finding the ones that share the page. This too would
require a rewrite of text_poke.
> > >
> > > If, in the end, your argument is "the function tracer works as-is now,
> > > and I have no time to change it given it represents too much work" or "I
> > > don't care about your use-cases", I'm OK with that. But please then don't
> > > argue that it's because it's the best technical solution when it isn't.
> >
> > No, I have yet to hear a valuable argument against stop_machine. You are
> > pushing the burden of proof on me, when we have something that does work,
> > on several archs. You want me to redesign the system to be x86 only, and
> > then say, hey, my original code works better.
> >
>
> stop_machine involves high interrupt latency. This is the argument I've
> been repeating for 1-2 emails already. And I have to disagree with you :
> we can do this code generically given the right abstractions
> (BREAKPOINT_INSN* macros I proposed earlier). Is having something that
> "works" your only argument to stop improving it ?
The high interrupt latency only happens at the time we need to hook the
functions. This does not mean it is the time to start the tracing. That
can be done separately.
Your only concern is the stop_machine latency? Then you might as well also
prevent modules, since that uses stop machine too. Again, this happens
only when the tracer hooks are added or removed. This is done at a time
the sys-admin will activate it. It is not a random latency that is
occurred by some timer or other asynchronous event.
>
> > I do not see text_poke being theoretically better. The only reason you
> > given me to use it is because you dislike stop_machine.
> >
>
> There is absolutely no link between stop_machine and text_poke. I argue
> against stop_machine saying that the breakpoint approach is less
> intrusive because it does not involve disabling interrupts for so long,
> and I argue against modifying the kernel page flags because that
> modifies the access rights of the core kernel and modules to RO
> mappings, which is IMO a side-effect that we should eliminate _if we
> can_. Please keep those two concerns separate.
text_poke can not be executed from stop_machine. There's the link. The two
concerns are not separate.
Your concern with stop_machine is that it will cause an interrupt latency
when the sysadmin enables or disables the functions. There exists other
interrupt latencies that can be worst that are asynchronous. Run hackbench
with the irqs off tracer and see for yourself.
-- Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists