Message-ID: <alpine.DEB.1.10.0810301447490.15853@gandalf.stny.rr.com>
Date:	Thu, 30 Oct 2008 15:17:45 -0400 (EDT)
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Ingo Molnar <mingo@...e.hu>
cc:	Frédéric Weisbecker <fweisbec@...il.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC][PATCH] Make ftrace able to trace function return


On Thu, 30 Oct 2008, Ingo Molnar wrote:
> 
> OTOH, dyn-ftrace changes the picture dramatically, with its NOP 
> insertion and opt-in tricks. Still, one more 5-byte NOP in every 
> function is still something not to be done lightly.

I originally wanted to use -finstrument-functions, but looking into it 
turned up some problems.

> In that sense your mcount enhancement is better, as it does not 
> increase the default (single NOP) cost. It can also be 
> enabled/disabled dynamically in addition to the 'half-way profiling' 
> mcount solution we have today. So i like it at first sight - if it can 
> be made stable ;-)

Let's take a simple C file called traceme.c:

---
static int x;

void trace_me(void)
{
        x++;
}
---

Normal compiling of:

gcc -c traceme.c

produces:

00000000 <trace_me>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   a1 00 00 00 00          mov    0x0,%eax
                        4: R_386_32     .bss
   8:   83 c0 01                add    $0x1,%eax
   b:   a3 00 00 00 00          mov    %eax,0x0
                        c: R_386_32     .bss
  10:   5d                      pop    %ebp
  11:   c3                      ret    
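
(The listings in this mail are objdump -dr style output on the unlinked
object file; the indented R_386_* lines are relocation entries for
addresses that are not resolved yet.)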


With

gcc -c -pg traceme.c

00000000 <trace_me>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   e8 fc ff ff ff          call   4 <trace_me+0x4>
                        4: R_386_PC32   mcount
   8:   a1 00 00 00 00          mov    0x0,%eax
                        9: R_386_32     .bss
   d:   83 c0 01                add    $0x1,%eax
  10:   a3 00 00 00 00          mov    %eax,0x0
                        11: R_386_32    .bss
  15:   5d                      pop    %ebp
  16:   c3                      ret    


The only difference between the two is the added "call mcount":
a single 5-byte op that dynamic ftrace can patch at runtime, as
sketched below.
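
To make "patch at runtime" concrete, here is a minimal user-space
sketch of the idea, not the kernel's actual code (which also has to
make the text writable and keep other CPUs from executing the site
mid-patch): the 5 bytes of the call are overwritten with a 5-byte NOP
to disable tracing, and rewritten as a call to enable it.

---
#include <string.h>

/* One of the 5-byte NOPs x86 offers; the kernel picks an
 * encoding suited to the CPU. */
static const unsigned char nop5[5] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/* ip points at the 0xe8 opcode of the "call mcount" site. */
void make_nop(unsigned char *ip)
{
	memcpy(ip, nop5, 5);		/* tracing off */
}

void make_call(unsigned char *ip, void *target)
{
	/* The rel32 of a call is relative to the next instruction
	 * (32-bit little-endian assumed, as in the listings). */
	int rel = (unsigned char *)target - (ip + 5);

	ip[0] = 0xe8;			/* call rel32 */
	memcpy(ip + 1, &rel, 4);	/* tracing on */
}
---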

But now let's look at:

gcc -c -finstrument-functions traceme.c

00000000 <trace_me>:
   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   83 ec 18                sub    $0x18,%esp
   6:   8b 45 04                mov    0x4(%ebp),%eax
   9:   89 44 24 04             mov    %eax,0x4(%esp)
   d:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
                        10: R_386_32    trace_me
  14:   e8 fc ff ff ff          call   15 <trace_me+0x15>
                        15: R_386_PC32  __cyg_profile_func_enter
  19:   a1 00 00 00 00          mov    0x0,%eax
                        1a: R_386_32    .bss
  1e:   83 c0 01                add    $0x1,%eax
  21:   a3 00 00 00 00          mov    %eax,0x0
                        22: R_386_32    .bss
  26:   8b 45 04                mov    0x4(%ebp),%eax
  29:   89 44 24 04             mov    %eax,0x4(%esp)
  2d:   c7 04 24 00 00 00 00    movl   $0x0,(%esp)
                        30: R_386_32    trace_me
  34:   e8 fc ff ff ff          call   35 <trace_me+0x35>
                        35: R_386_PC32  __cyg_profile_func_exit
  39:   c9                      leave  
  3a:   c3                      ret    

Here we see that

	mov	0x4(%ebp),%eax
	mov	%eax,0x4(%esp)
	movl	$trace_me,(%esp)
	call	__cyg_profile_func_enter

is added at the beginning and

	mov	0x4(%ebp),%eax
	mov	%eax,0x4(%esp)
	movl	$trace_me,(%esp)
	call	__cyg_profile_func_exit

is added at the end.
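
For reference, gcc documents the prototypes of these hooks, so a
minimal user-space implementation that just logs what the compiler
passes in looks like this (link it with the instrumented object; the
attribute keeps the hooks themselves from being instrumented and
recursing):

---
#include <stdio.h>

void __attribute__((no_instrument_function))
__cyg_profile_func_enter(void *this_fn, void *call_site)
{
	printf("enter %p (called from %p)\n", this_fn, call_site);
}

void __attribute__((no_instrument_function))
__cyg_profile_func_exit(void *this_fn, void *call_site)
{
	printf("exit  %p (back to %p)\n", this_fn, call_site);
}
---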

This is not 5 extra bytes but 19 extra bytes per hook: counting both
hooks plus the bigger stack frame, 41 extra bytes at every function.
Also note that this adds the calls to inline functions as well. We
could easily stop that by adding "notrace" to the inline define
(which I've done), as in the sketch below.
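
For the curious, the "notrace" annotation is essentially gcc's
no_instrument_function attribute, which suppresses the hooks per
function:

---
#define notrace __attribute__((no_instrument_function))

static int x;

/* Gets the enter/exit hooks under -finstrument-functions. */
void trace_me(void)
{
	x++;
}

/* Gets no hooks, even under -finstrument-functions. */
notrace void dont_trace_me(void)
{
	x--;
}
---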

This would make the patching a bit more difficult (though not
impossible), and it would bloat the image quite a bit.

I've thought about adding an option to enable this instead of -pg,
which would be doable, but I wanted to let the current code settle
before doing so.

-- Steve
