Date:	Tue, 19 Sep 2006 10:17:53 -0700
From:	Martin Bligh <mbligh@...gle.com>
To:	prasanna@...ibm.com
CC:	Andrew Morton <akpm@...l.org>,
	"Frank Ch. Eigler" <fche@...hat.com>, Ingo Molnar <mingo@...e.hu>,
	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>,
	Paul Mundt <lethal@...ux-sh.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Jes Sorensen <jes@....com>, Tom Zanussi <zanussi@...ibm.com>,
	Richard J Moore <richardj_moore@...ibm.com>,
	Michel Dagenais <michel.dagenais@...ymtl.ca>,
	Christoph Hellwig <hch@...radead.org>,
	Greg Kroah-Hartman <gregkh@...e.de>,
	Thomas Gleixner <tglx@...utronix.de>,
	William Cohen <wcohen@...hat.com>, ltt-dev@...fik.org,
	systemtap@...rces.redhat.com, Alan Cox <alan@...rguk.ukuu.org.uk>
Subject: Re: [PATCH] Linux Kernel Markers

>>>>It seems like all we'd need to do
>>>>is "list all references to function, freeze kernel, update all
>>>>references, continue"
>>>
>>>
>>>"overwrite first 5 bytes of old function with `jmp new_function'".
>>
>>Yes, that's simple, but slower, as you have a double jump. Probably
>>a damned sight faster than int3 though.
> 
> 
> The advantage of using int3 over jmp to launch the instrumented
> module is that int3 (or breakpoint in most architectures) is an
> atomic operation to insert.

Ah, good point. Though ... how much do we care what the speed of
insertion/removal actually is? If we can tolerate it being slow,
then just sync everyone up in an IPI to freeze them out whilst
doing the insert.

> I am getting some more ideas...
>
> 1. Copy the original functions, instrument them and insert them as
> a part of kernel module with different name prefix.
> 2. Insert breakpoint only on those routines at runtime.
> 3. When the breakpoint gets hit, change the instruction pointer to
> the instrumented routine.  No need to single step at all.
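
For concreteness, step 3 of the quoted proposal could look roughly like the sketch below. This is a user-space model only, not kernel code: `struct fake_regs`, `struct redirect` and `breakpoint_redirect()` are hypothetical names made up for illustration. The one architectural fact it relies on is that on x86 the saved instruction pointer after an int3 trap points one byte past the breakpoint.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the saved register state seen by a trap handler. */
struct fake_regs {
    uintptr_t ip;
};

/* Hypothetical table entry: original routine -> instrumented copy. */
struct redirect {
    uintptr_t probe_addr;        /* address where the int3 was planted */
    uintptr_t instrumented_addr; /* entry point of the instrumented copy */
};

/*
 * Model of the breakpoint handler: if the trap came from one of our
 * probes, point the saved ip at the instrumented routine and report
 * success. No single-stepping is involved.
 */
static int breakpoint_redirect(struct fake_regs *regs,
                               const struct redirect *tbl, size_t n)
{
    /* int3 is one byte long and the saved ip points just past it. */
    uintptr_t hit = regs->ip - 1;
    size_t i;

    for (i = 0; i < n; i++) {
        if (tbl[i].probe_addr == hit) {
            regs->ip = tbl[i].instrumented_addr;
            return 1;
        }
    }
    return 0; /* not ours; leave the registers alone */
}
```

The linear table scan is only there to keep the sketch short; the trap itself is still taken on every call, which is the cost the reply below asks about.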

Surely this still carries the overhead of doing the breakpoint,
which was part of what we were trying to get away from? I suppose
we get more flexibility this way. Or does the slowness not actually
come from the int3, but only the single-stepping?

How about we combine all three ideas together ...

1. Load modified copy of the function in question.
2. Overwrite the first instruction of the routine with an int3 that
does what you say (atomically).
3. Then overwrite the second instruction with a jump, which is faster.
4. Now atomically overwrite the int3 with a nop, and let the jump
take over.
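
A rough model of steps 2 to 4, operating on a plain byte buffer rather than live kernel text. Assumptions that are not in the proposal itself: the first instruction is a single byte (e.g. `push %ebp`), so replacing the int3 with a one-byte nop leaves no stray operand bytes; the jump is a 5-byte `jmp rel32` placed at offset 1; addresses are 32-bit; and no CPU is already executing inside the bytes being rewritten. `encode_jmp_rel32()` and `staged_redirect()` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    OP_INT3 = 0xCC,      /* one-byte breakpoint */
    OP_NOP = 0x90,       /* one-byte nop */
    OP_JMP_REL32 = 0xE9, /* jmp with 32-bit relative displacement */
    JMP_LEN = 5          /* opcode + 4 displacement bytes */
};

/* Encode "jmp target" as it would appear at address insn_addr. */
static void encode_jmp_rel32(uint8_t *dst, uint32_t insn_addr, uint32_t target)
{
    /* Displacement is relative to the end of the jmp instruction. */
    uint32_t rel = target - (insn_addr + JMP_LEN);

    dst[0] = OP_JMP_REL32;
    dst[1] = (uint8_t)(rel & 0xff);         /* little-endian, as on x86 */
    dst[2] = (uint8_t)((rel >> 8) & 0xff);
    dst[3] = (uint8_t)((rel >> 16) & 0xff);
    dst[4] = (uint8_t)((rel >> 24) & 0xff);
}

/*
 * Staged patch of the function whose code sits in text[] and is
 * assumed to live at address text_addr. Plain stores are used here
 * only to model the ordering; real code would also need the proper
 * barriers and cross-CPU synchronisation.
 */
static void staged_redirect(uint8_t *text, size_t len,
                            uint32_t text_addr, uint32_t new_func)
{
    assert(len >= (size_t)(1 + JMP_LEN));

    /* Step 2: single-byte store; callers now trap at the entry. */
    text[0] = OP_INT3;

    /* Step 3: nothing executes past byte 0 any more, so the
     * multi-byte jump can be written behind the breakpoint. */
    encode_jmp_rel32(text + 1, text_addr + 1, new_func);

    /* Step 4: single-byte store again; callers fall through the nop
     * into the jump. */
    text[0] = OP_NOP;
}
```

With the function at 0x1000 and the modified copy at 0x2000, the patched entry reads `90 e9 fa 0f 00 00`: a nop followed by a jump with displacement 0x2000 - 0x1006 = 0xffa.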

> Adv:
> Can be enabled/disabled dynamically by inserting/removing
> breakpoints.  No overhead of single stepping.
> No restriction of running the handler in interrupt context.
> You can have pre-compiled instrumented routines.
> This mechanism can be used for pre-defined set of routines and for
> arbitrary probe points, you can use kprobes/jprobes/systemtap.
> No need to be super-user for predefined breakpoints.
>
> Dis:
> Maintenance of the code, since the code base needs to be
> duplicated and instrumented.

CONFIG_FOO_BAR ... turn it on or off to control the instrumentation.
Compiled out by default; compiled in when building the tracing functions.
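
Something along these lines, as a minimal sketch. CONFIG_FOO_BAR is just the placeholder name from above; the `trace_count()` macro, the `pages_scanned` counter and `shrink_list_sketch()` are hypothetical and only stand in for a real function with a counter in it.

```c
/* CONFIG_FOO_BAR is a placeholder option name, not a real Kconfig symbol. */
#ifdef CONFIG_FOO_BAR
/* Instrumented build: the counter update is compiled in. */
#define trace_count(counter) ((counter)++)
#else
/* Default build: the statement compiles away to nothing. */
#define trace_count(counter) do { } while (0)
#endif

/* Hypothetical counter; only ever updated in the instrumented build. */
unsigned long pages_scanned;

/* Hypothetical stand-in for a routine that walks a list of pages. */
static int shrink_list_sketch(int nr_pages)
{
    int reclaimed = 0;
    int i;

    for (i = 0; i < nr_pages; i++) {
        trace_count(pages_scanned); /* direct access to locals and globals */
        reclaimed++;
    }
    return reclaimed;
}
```

The same source then builds twice: once without the option for the normal kernel, and once with it to produce the instrumented copy, so nothing has to be duplicated by hand.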

> The above idea is similar to runtime or dynamic patching, but here we
> use int3 (breakpoint) rather than jump instruction.

Depends what we're trying to fix. I was trying to fix two things:

1. Flexibility - kprobes seem unable to easily access all local
variables etc., or to go anywhere inside the function. Plus keeping
overhead low for things like maintaining counters in a function (see
the previous example I mentioned for counting pages in shrink_list).

2. Overhead of the int3, which was allegedly 1000 cycles or so. It got
faster after Ingo had played with it, but it's still significant.

M.