Message-ID: <20090417030530.GB26612@Krystal>
Date:	Thu, 16 Apr 2009 23:05:30 -0400
From:	Mathieu Desnoyers <compudj@...stal.dyndns.org>
To:	Jeremy Fitzhardinge <jeremy@...p.org>
Cc:	Steven Rostedt <rostedt@...dmis.org>, linux-kernel@...r.kernel.org,
	Ingo Molnar <mingo@...e.hu>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Theodore Tso <tytso@....edu>,
	Arjan van de Ven <arjan@...radead.org>,
	Christoph Hellwig <hch@....de>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	Zhaolei <zhaolei@...fujitsu.com>, Li Zefan <lizf@...fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Masami Hiramatsu <mhiramat@...hat.com>,
	"Frank Ch. Eigler" <fche@...stic.org>,
	Tom Zanussi <tzanussi@...il.com>,
	Jiaying Zhang <jiayingz@...gle.com>,
	Michael Rubin <mrubin@...gle.com>,
	Martin Bligh <mbligh@...gle.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Neil Horman <nhorman@...driver.com>,
	Eduard - Gabriel Munteanu <eduard.munteanu@...ux360.ro>,
	Pekka Enberg <penberg@...helsinki.fi>
Subject: [PATCH] tracepoints : let subsystem nop-out the tracepoints at
	build time

* Jeremy Fitzhardinge (jeremy@...p.org) wrote:
> Mathieu Desnoyers wrote:
>> "all this code" is actually :
>>
>>                rcu_read_lock_sched_notrace();                          \
>>                 it_func = rcu_dereference((tp)->funcs);                 \
>>                 if (it_func) {                                          \
>>                         do {                                            \
>>                                 ((void(*)(proto))(*it_func))(args);     \
>>                         } while (*(++it_func));                         \
>>                 }                                                       \
>>                 rcu_read_unlock_sched_notrace();                        \
>>
>> Which does nothing more than disabling preemption and a for loop to
>> call all the tracepoint handlers. I don't see the big win in laying out
>> the stack to call this code out-of-line; we would just remove the
>> preempt disable and the loop, which are minimal compared to most
>> call stacks.
>>   
>
> Well, look at it from my perspective:  Ingo has been repeatedly beating  
> me up for the overhead pvops adds to a native kernel, where it really is  
> just a (direct) function call.  I want to instrument each pvop site with  
> a tracepoint so I can actually work out which calls are being called how  
> frequently to look for new optimisation opportunities.
>
> I would guess the tracepoint code sequence is going to increase the  
> impact of each pvop call site by a fair bit, and that's not counting the  
> effects the extra register pressure will have.  That's a pile of code to  
> add.
>
> And frankly, that's fine by me, because I would expect this degree of  
> introspection to have some performance hit.  But it does make the need  
> for per-subsystem tracing Kconfig entries fairly important, because I  
> don't think this would be acceptable to ship in a non-debug-everything  
> kernel build, even though other tracepoints might be.
>

Agreed. Tracepoints may perturb the code surrounding each pvop call site
in much the same way as the pvops themselves do. Therefore, it makes
sense to have a Kconfig option to enable the pvops tracepoints.

In terms of tracepoints (with the DECLARE_TRACE/DEFINE_TRACE semantics),
we could have something like this in include/trace/pvops.h:

#include <linux/tracepoint.h>

#ifdef CONFIG_PVOPS_TRACEPOINTS

#define DECLARE_PVOPS_TRACE			DECLARE_TRACE
#define DEFINE_PVOPS_TRACE			DEFINE_TRACE
#define EXPORT_PVOPS_TRACEPOINT_SYMBOL_GPL	EXPORT_TRACEPOINT_SYMBOL_GPL
#define EXPORT_PVOPS_TRACEPOINT_SYMBOL		EXPORT_TRACEPOINT_SYMBOL

#else /* !CONFIG_PVOPS_TRACEPOINTS */

#define DECLARE_PVOPS_TRACE			DECLARE_TRACE_NOP
#define DEFINE_PVOPS_TRACE			DEFINE_TRACE_NOP
#define EXPORT_PVOPS_TRACEPOINT_SYMBOL_GPL	EXPORT_TRACEPOINT_SYMBOL_GPL_NOP
#define EXPORT_PVOPS_TRACEPOINT_SYMBOL		EXPORT_TRACEPOINT_SYMBOL_NOP

#endif /* CONFIG_PVOPS_TRACEPOINTS */

And then do the declarations/definitions using the new
DECLARE_PVOPS_TRACE / DEFINE_PVOPS_TRACE macros.

For that you'll need the patch I am attaching below. I'll let Steven
figure out how to tweak TRACE_EVENT() to support this new tracepoint
feature.
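
To illustrate the aliasing idea outside the kernel, here is a minimal
userspace sketch. It is not the real tracepoint.h: the "live"
DECLARE_TRACE below is a simplified stand-in that supports a single
probe, the tracepoint name pvop_call and the probe count_probe are
hypothetical, and TP_PROTO/TP_ARGS use standard __VA_ARGS__ rather than
the GNU "args..." form. Built without -DCONFIG_PVOPS_TRACEPOINTS, the
call site compiles to an empty inline function and registration returns
-ENOSYS.

```c
#include <assert.h>
#include <errno.h>

#define TP_PROTO(...)	__VA_ARGS__
#define TP_ARGS(...)	__VA_ARGS__

/* Simplified stand-in for the live macro: one probe, no RCU, no locking. */
#define DECLARE_TRACE(name, proto, args)				\
	static void (*name##_probe)(proto);				\
	static inline void trace_##name(proto)				\
	{								\
		if (name##_probe)					\
			name##_probe(args);				\
	}								\
	static inline int register_trace_##name(void (*probe)(proto))	\
	{								\
		name##_probe = probe;					\
		return 0;						\
	}

/* No-op variant, modelled on DECLARE_TRACE_NOP from the patch. */
#define DECLARE_TRACE_NOP(name, proto, args)				\
	static inline void trace_##name(proto)				\
	{ }								\
	static inline int register_trace_##name(void (*probe)(proto))	\
	{								\
		return -ENOSYS;						\
	}

/* Per-subsystem switch: an object-like alias picks the variant. */
#ifdef CONFIG_PVOPS_TRACEPOINTS
#define DECLARE_PVOPS_TRACE	DECLARE_TRACE
#else
#define DECLARE_PVOPS_TRACE	DECLARE_TRACE_NOP
#endif

/* Hypothetical tracepoint name, for illustration only. */
DECLARE_PVOPS_TRACE(pvop_call, TP_PROTO(int op), TP_ARGS(op))

static int seen;

static void count_probe(int op)
{
	seen += op;
}

static void demo(void)
{
	int ret = register_trace_pvop_call(count_probe);

	trace_pvop_call(3);
#ifdef CONFIG_PVOPS_TRACEPOINTS
	assert(ret == 0 && seen == 3);		/* probe ran */
#else
	assert(ret == -ENOSYS && seen == 0);	/* call site compiled out */
#endif
}
```

Calling demo() from a test program exercises whichever variant the
build selected; the call sites themselves stay identical in both
configurations, which is the point of the aliasing.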

>> So basically, tracepoints are already just doing a function call, with a
>> few more bytes for preempt disable and multiple handler support.
>>
>> About the compiler deciding to put the unlikely branch out-of-line, I've
>> never seen any function calls generated just for the sake of saving
>> those few bytes, that would be crazy on the part of the compiler.
>> However, it can (and should) freely put the stack setup in the coldest
>> cache-lines possible, which are reachable by a near jump.
>>   
>
> No, it wouldn't generate a call.  But if it's going to put the code out  
> of line into cold cache-lines, then it may as well generate a call.
>

Jumping out-of-line was somewhat faster than calling a function, if I
recall my performance tests correctly. But those were done long ago.

And note that whenever the tracer becomes active, the out-of-line code
of busy tracepoints becomes cache-hot, so there is no longer a cache
line fetch to perform. At that point the stack setup and other overhead
of a function call/return, compared to two jumps, becomes very
measurable.
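
For reference, the two call-site shapes being compared can be sketched
in userspace roughly as follows. This is only an illustration under
stated assumptions: struct tp, call_probes, trace_jump, trace_call and
count_probe are made-up names, the RCU and preempt-disable parts are
left out, and __builtin_expect plus the noinline/cold attributes are
GCC/Clang extensions. Whether the compiler really emits a near jump for
the first variant depends on its code layout decisions.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*probe_fn)(int);

struct tp {
	int state;		/* non-zero while a tracer is attached */
	probe_fn *funcs;	/* NULL-terminated handler array, or NULL */
};

static int hits;

static void count_probe(int v)
{
	hits += v;
}

/* Same loop shape as the quoted macro, minus the RCU protection. */
static inline void call_probes(struct tp *tp, int arg)
{
	probe_fn *it = tp->funcs;

	if (it) {
		do {
			(*it)(arg);
		} while (*(++it));
	}
}

/*
 * Variant A: the handler loop stays in the caller. The compiler is free
 * to move the unlikely block away from the hot path and reach it with a
 * jump.
 */
static inline void trace_jump(struct tp *tp, int arg)
{
	if (__builtin_expect(!!tp->state, 0))
		call_probes(tp, arg);
}

/* Variant B: the handler loop lives in a separate function that is called. */
static __attribute__((noinline, cold)) void trace_slow(struct tp *tp, int arg)
{
	call_probes(tp, arg);
}

static inline void trace_call(struct tp *tp, int arg)
{
	if (__builtin_expect(!!tp->state, 0))
		trace_slow(tp, arg);
}

static void demo(void)
{
	probe_fn probes[] = { count_probe, NULL };
	struct tp t = { 0, probes };

	trace_jump(&t, 5);	/* tracing disabled: no handler runs */
	assert(hits == 0);

	t.state = 1;
	trace_jump(&t, 5);	/* handler runs via the in-caller path */
	trace_call(&t, 2);	/* handler runs via the out-of-line call */
	assert(hits == 7);
}
```

Both variants behave identically; the difference is only in the
generated code, so any comparison has to be made by inspecting the
disassembly or benchmarking, not from this sketch.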

> Anyway, the important point from my perspective is that tracepoint.h  
> have no #include dependencies beyond linux/types.h (compiler.h, etc).
>

Is preempt.h a problem?

Here is the patch.

Mathieu


tracepoints : let subsystem nop-out the tracepoints at build time

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
CC: Jeremy Fitzhardinge <jeremy@...p.org>
CC: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
CC: Steven Rostedt <rostedt@...dmis.org>
CC: Ingo Molnar <mingo@...e.hu>
CC: Andrew Morton <akpm@...ux-foundation.org>
CC: Christoph Hellwig <hch@....de>
---
 include/linux/tracepoint.h |   38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

Index: linux.trees.git/include/linux/tracepoint.h
===================================================================
--- linux.trees.git.orig/include/linux/tracepoint.h	2009-04-16 22:40:26.000000000 -0400
+++ linux.trees.git/include/linux/tracepoint.h	2009-04-16 22:40:33.000000000 -0400
@@ -37,6 +37,24 @@ struct tracepoint {
 #define TP_PROTO(args...)	args
 #define TP_ARGS(args...)		args
 
+#define DECLARE_TRACE_NOP(name, proto, args)				\
+	static inline void _do_trace_##name(struct tracepoint *tp, proto) \
+	{ }								\
+	static inline void trace_##name(proto)				\
+	{ }								\
+	static inline int register_trace_##name(void (*probe)(proto))	\
+	{								\
+		return -ENOSYS;						\
+	}								\
+	static inline int unregister_trace_##name(void (*probe)(proto))	\
+	{								\
+		return -ENOSYS;						\
+	}
+
+#define DEFINE_TRACE_NOP(name)
+#define EXPORT_TRACEPOINT_SYMBOL_GPL_NOP(name)
+#define EXPORT_TRACEPOINT_SYMBOL_NOP(name)
+
 #ifdef CONFIG_TRACEPOINTS
 
 /*
@@ -95,23 +113,11 @@ extern void tracepoint_update_probe_rang
 	struct tracepoint *end);
 
 #else /* !CONFIG_TRACEPOINTS */
-#define DECLARE_TRACE(name, proto, args)				\
-	static inline void _do_trace_##name(struct tracepoint *tp, proto) \
-	{ }								\
-	static inline void trace_##name(proto)				\
-	{ }								\
-	static inline int register_trace_##name(void (*probe)(proto))	\
-	{								\
-		return -ENOSYS;						\
-	}								\
-	static inline int unregister_trace_##name(void (*probe)(proto))	\
-	{								\
-		return -ENOSYS;						\
-	}
 
-#define DEFINE_TRACE(name)
-#define EXPORT_TRACEPOINT_SYMBOL_GPL(name)
-#define EXPORT_TRACEPOINT_SYMBOL(name)
+#define DECLARE_TRACE			DECLARE_TRACE_NOP
+#define DEFINE_TRACE			DEFINE_TRACE_NOP
+#define EXPORT_TRACEPOINT_SYMBOL_GPL	EXPORT_TRACEPOINT_SYMBOL_GPL_NOP
+#define EXPORT_TRACEPOINT_SYMBOL	EXPORT_TRACEPOINT_SYMBOL_NOP
 
 static inline void tracepoint_update_probe_range(struct tracepoint *begin,
 	struct tracepoint *end)


-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68