linux-kernel - Re: [RFC][PATCH] Make ftrace able to trace function return

Date:	Thu, 30 Oct 2008 19:20:56 +0100
From:	Ingo Molnar <mingo@...e.hu>
To:	Frederic Weisbecker <fweisbec@...il.com>
Cc:	Steven Rostedt <rostedt@...dmis.org>, linux-kernel@...r.kernel.org
Subject: Re: [RFC][PATCH] Make ftrace able to trace function return


* Frederic Weisbecker <fweisbec@...il.com> wrote:

> Hi.
> 
> First of all, I want to say this patch is absolutely not ready for 
> inclusion. Some parts are really ugly and the thing is only 
> partially functioning.
> 
> It's just an idea or a kind of proof of concept. I just wanted to 
> make ftrace able to measure the time of execution of a function. For 
> that I had to hook both the function call and its return.
> 
> By using mcount, we can hook the function on entry and we can 
> override its return address. So we can capture the time at those two 
> points. The problem comes when a function runs concurrently through 
> preemption or SMP. We can measure the return time, but how can we be 
> sure which call-time capture it belongs to, since the call time 
> could have been captured multiple times? And for the same reason, 
> how can we be sure of the return address?
> 
> So the idea is to allocate a general set of slots in which we can 
> save our original return address and the call time. After that we 
> change the return address of the hooked function to jump to a 
> trampoline which will push the offset we need to retrieve the slot 
> in the set for this function call. Then the trampoline will call a 
> return handler that will trace the return time and send all of this 
> information to a specific tracer. And then the return handler will 
> return to the original return address.
> 
> To determine quickly which slot is free, I use a bitmap of 32 bits. 
> Perhaps it is a bad assumption but I could enlarge it and there is 
> an overrun counter. This is the only point which needs to be 
> protected against concurrent access.
> 
> I made a tracer for this but the problem is that the capture by 
> ftrace hangs the system when several slots can be used. When I 
> dedicate only one free slot, anywhere in the set, there is no 
> problem but I miss a lot of calls. So by default in this patch, 
> there is only one slot dedicated in the bitmap.
>
> Don't hesitate to comment on this patch, rough as it is...
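
The slot scheme described in the quoted text (a 32-bit bitmap of free
slots plus an overrun counter, with the bitmap as the only shared state
that needs protection) could look roughly like the userspace sketch
below. This is not the code from the patch: the names `ret_slot`,
`slot_alloc`, `slot_release` and `slot_overruns` are hypothetical, C11
atomics stand in for whatever kernel primitives the patch uses, and
`__builtin_ctz` is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_SLOTS 32  /* one bit per slot in a 32-bit bitmap */

/* Hypothetical per-call record: what must survive until the return. */
struct ret_slot {
    uintptr_t ret_addr;   /* original return address */
    uint64_t  call_time;  /* timestamp taken at function entry */
};

static struct ret_slot slots[NR_SLOTS];
static _Atomic uint32_t slot_bitmap;  /* bit set = slot in use */
static atomic_ulong slot_overrun;     /* calls dropped: no free slot */

/* Claim a free slot; returns its index, or -1 when all are busy. */
int slot_alloc(uintptr_t ret_addr, uint64_t call_time)
{
    uint32_t old = atomic_load(&slot_bitmap);

    for (;;) {
        if (old == UINT32_MAX) {
            atomic_fetch_add(&slot_overrun, 1);
            return -1;
        }
        int idx = __builtin_ctz(~old);  /* lowest clear bit */
        uint32_t want = old | (UINT32_C(1) << idx);

        /* On failure, 'old' is reloaded with the current bitmap. */
        if (atomic_compare_exchange_weak(&slot_bitmap, &old, want)) {
            slots[idx].ret_addr = ret_addr;
            slots[idx].call_time = call_time;
            return idx;
        }
    }
}

/* Read back a slot's contents and mark it free again. */
void slot_release(int idx, uintptr_t *ret_addr, uint64_t *call_time)
{
    assert(idx >= 0 && idx < NR_SLOTS);
    *ret_addr = slots[idx].ret_addr;
    *call_time = slots[idx].call_time;
    atomic_fetch_and(&slot_bitmap, ~(UINT32_C(1) << idx));
}

unsigned long slot_overruns(void)
{
    return atomic_load(&slot_overrun);
}
```

In this sketch the index returned by `slot_alloc` plays the role of the
offset that the trampoline pushes, and the return handler would pass it
to `slot_release` to recover the original return address and call time.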

hm, are you aware of the -finstrument-functions feature of GCC?

that feature generates such entry points at build time:

                   void __cyg_profile_func_enter (void *this_fn,
                                                  void *call_site);
                   void __cyg_profile_func_exit  (void *this_fn,
                                                  void *call_site);

this might be faster/cleaner than using a trampoline approach IMO.

OTOH, entry+exit profiling has about double the cost of just entry 
profiling - so maybe there should be some runtime flexibility there. 
Plus the same recordmcount trick should be used to patch up these 
entry points to NOP by default.
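
For illustration only, here is a minimal userspace sketch of what
handlers for these two entry points could look like when measuring
execution time. It is not kernel code: the per-thread timestamp stack,
`MAX_DEPTH`, and the accessors `trace_depth`, `trace_exits` and
`trace_total_ns` are assumptions made for the example. The handlers are
marked `no_instrument_function` so that they are not instrumented
themselves.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NO_INSTR  __attribute__((no_instrument_function))
#define MAX_DEPTH 128  /* hypothetical limit on tracked nesting */

static_assert(MAX_DEPTH > 0, "need at least one timestamp slot");

/* Per-thread state, so concurrent threads do not mix timestamps. */
static _Thread_local uint64_t entry_ns[MAX_DEPTH];
static _Thread_local int depth;
static _Thread_local unsigned long nr_exits;
static _Thread_local uint64_t total_ns;

NO_INSTR static uint64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Called by instrumented code on entry to every function. */
NO_INSTR void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    if (depth < MAX_DEPTH)
        entry_ns[depth] = now_ns();
    depth++;
}

/* Called by instrumented code just before every function returns. */
NO_INSTR void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    if (depth <= 0)
        return;  /* unbalanced exit: nothing to pair it with */
    depth--;
    if (depth < MAX_DEPTH) {
        /* Includes time spent in nested calls. */
        total_ns += now_ns() - entry_ns[depth];
        nr_exits++;
    }
}

NO_INSTR int trace_depth(void) { return depth; }
NO_INSTR unsigned long trace_exits(void) { return nr_exits; }
NO_INSTR uint64_t trace_total_ns(void) { return total_ns; }
```

Building the code to be traced with `gcc -finstrument-functions` makes
the compiler emit calls to both handlers around every function body, so
entry and exit are paired by nesting depth and no return-address
rewriting is needed.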

	Ingo