Message-Id: <efbb68cdc13de6930113f4fa6d70e2fc103b3c32.1271427118.git.jbaron@redhat.com>
Date: Fri, 16 Apr 2010 11:25:02 -0400
From: Jason Baron <jbaron@...hat.com>
To: linux-kernel@...r.kernel.org
Cc: mingo@...e.hu, mathieu.desnoyers@...ymtl.ca, hpa@...or.com,
tglx@...utronix.de, rostedt@...dmis.org, andi@...stfloor.org,
roland@...hat.com, rth@...hat.com, mhiramat@...hat.com,
fweisbec@...il.com, avi@...hat.com, davem@...emloft.net,
vgoyal@...hat.com
Subject: [PATCH 11/11] jump label: add docs
Add jump label docs as: Documentation/jump-label.txt
Signed-off-by: Jason Baron <jbaron@...hat.com>
---
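Note (not part of the patch): the generic fallback macro and the usage
pattern shown in section 2 of the document below can be tried in user space.
This is a minimal sketch under stated assumptions, not kernel code:
'unlikely()' is defined locally via the GCC/Clang builtin, printk() is
replaced by printf(), and 'demo_site()' and the 'jump_enabled' flag are
hypothetical names used only for illustration. The macro is also wrapped in
do/while (0) here, which the fallback in the document does not do.

```c
#include <stdio.h>

/* Local stand-in for the kernel's unlikely(); GCC/Clang builtin. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Generic fallback: 'tag' is unused, 'cond' is tested at every call. */
#define JUMP_LABEL(tag, label, cond)		\
	do {					\
		if (unlikely(cond))		\
			goto label;		\
	} while (0)

/* Hypothetical enable flag; the fallback has to load and test it. */
static int jump_enabled;

/* Returns 1 if the tracing path ran, 0 if it was skipped. */
static int demo_site(int file)
{
	JUMP_LABEL(trace_name, trace_label, jump_enabled);
	printf("not doing tracing\n");
	return 0;
trace_label:
	printf("doing tracing: %d\n", file);
	return 1;
}
```

Setting 'jump_enabled' to a non-zero value before calling demo_site() takes
the tracing path; with arch support the load and test are replaced by a
patched no-op or jump.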
Documentation/jump-label.txt | 140 ++++++++++++++++++++++++++++++++++++++++++
1 files changed, 140 insertions(+), 0 deletions(-)
create mode 100644 Documentation/jump-label.txt
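Note (not part of the patch): the jump table format from section 2 of the
document below ([instruction address] [jump target] [tracepoint name]) and
the enable/disable patching can be modelled in user space. The following is
a rough sketch, not the kernel implementation: 'struct model_jump_entry',
'model_transform()', 'model_set_jump_label()' and the 'text[]' buffer are
hypothetical names, addresses are modelled as offsets into 'text[]', and a
little-endian host is assumed. It uses the 5-byte nopl quoted in section 3
for the disabled state and a 5-byte 'jmp rel32' (opcode 0xe9) for the
enabled state.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SITE_SIZE 5	/* both nopl and jmp rel32 are 5 bytes on x86 */

/* Hypothetical stand-in for one jump table row. */
struct model_jump_entry {
	uint32_t code;		/* offset of the patch site in text[] */
	uint32_t target;	/* offset of the label to jump to */
	const char *name;	/* tracepoint name */
};

static uint8_t text[64];	/* pretend kernel text */

static const uint8_t nop5[SITE_SIZE] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/* Rewrite one site: a jmp when enabling, a nopl when disabling. */
static void model_transform(const struct model_jump_entry *e, int enable)
{
	uint8_t *site = &text[e->code];
	int32_t rel;

	if (!enable) {
		memcpy(site, nop5, SITE_SIZE);
		return;
	}
	/* rel32 is relative to the end of the 5-byte jmp instruction */
	rel = (int32_t)e->target - (int32_t)(e->code + SITE_SIZE);
	site[0] = 0xe9;
	memcpy(&site[1], &rel, sizeof(rel));	/* little-endian assumed */
}

/* Patch every site whose name matches; returns the number patched. */
static int model_set_jump_label(const struct model_jump_entry *tbl,
				size_t n, const char *name, int enable)
{
	int patched = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(tbl[i].name, name) == 0) {
			model_transform(&tbl[i], enable);
			patched++;
		}
	}
	return patched;
}
```

The real code must also make the patching safe against CPUs executing the
site concurrently, which this model ignores entirely.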
diff --git a/Documentation/jump-label.txt b/Documentation/jump-label.txt
new file mode 100644
index 0000000..4614821
--- /dev/null
+++ b/Documentation/jump-label.txt
@@ -0,0 +1,140 @@
+ Jump Label
+ ----------
+
+By: Jason Baron <jbaron@...hat.com>
+
+
+1) Motivation
+
+
+Currently, tracepoints are implemented using a conditional. The conditional
+check reads a global variable for each tracepoint. Although the overhead of
+this check is small, it increases under memory pressure. As we increase the
+number of tracepoints in the kernel, this may become more of an issue. In
+addition, tracepoints are often dormant (disabled), and provide no
+direct kernel functionality. Thus, it is highly desirable to reduce their
+impact as much as possible. Although tracepoints are the original motivation
+for this work, other kernel code paths should be able to make use of the jump
+label optimization.
+
+
+2) Jump label description/usage
+
+
+gcc (v4.5) adds a new 'asm goto' statement that allows branching to a label.
+http://gcc.gnu.org/ml/gcc-patches/2009-07/msg01556.html
+
+Thus, this patch set introduces an architecture specific 'JUMP_LABEL()' macro as
+follows (x86):
+
+# define JUMP_LABEL_INITIAL_NOP ".byte 0xe9 \n\t .long 0\n\t"
+
+# define JUMP_LABEL(tag, label, cond)				\
+	do {							\
+		extern const char __jlstrtab_##tag[];		\
+		asm goto("1:"					\
+			JUMP_LABEL_INITIAL_NOP			\
+			".pushsection __jump_table, \"a\" \n\t"	\
+			_ASM_PTR "1b, %l[" #label "], %c0 \n\t"	\
+			".popsection \n\t"			\
+			: : "i" (__jlstrtab_##tag) : : label);	\
+	} while (0)
+
+
+For architectures that have not yet introduced jump label support, it is simply:
+
+#define JUMP_LABEL(tag, label, cond)	\
+	if (unlikely(cond))		\
+		goto label;
+
+which then can be used as:
+
+	....
+	JUMP_LABEL(trace_name, trace_label, jump_enabled);
+	printk("not doing tracing\n");
+	return;
+trace_label:
+	printk("doing tracing: %d\n", file);
+	....
+
+Thus, when tracing is disabled, we simply have a no-op followed by a jump around
+the dormant (disabled) tracing code. The 'JUMP_LABEL()' macro produces a
+'__jump_table' section whose entries have the following format:
+
+[instruction address] [jump target] [tracepoint name]
+
+Thus, to enable a tracepoint, we simply patch the no-op at 'instruction address'
+with a jump to the 'jump target'.
+
+The call to enable a jump label is: enable_jump_label(trace_name); to disable:
+disable_jump_label(trace_name);
+
+
+3) Jump label analysis (x86)
+
+
+I measured the cost by placing 'get_cycles()' calls around the tracepoint
+call sites. For an Intel Core 2 Quad cpu (in cycles, averages):
+
+            idle    after tbench run
+            ----    ----------------
+old code      32                  88
+new code       2                   4
+
+
+The performance improvement can be reproduced reliably on both Intel and AMD
+hardware.
+
+In terms of code analysis, the current code for the disabled case is a 'cmpl'
+followed by a 'je' around the tracepoint code:
+
+cmpl - 83 3d 0e 77 87 00 00 - 7 bytes
+je - 74 3e - 2 bytes
+
+total of 9 instruction bytes.
+
+The new code is a 'nopl' followed by a 'jmp'. Thus:
+
+nopl - 0f 1f 44 00 00 - 5 bytes
+jmp - eb 3e - 2 bytes
+
+total of 7 instruction bytes.
+
+So, the new code also takes 2 fewer bytes of instruction cache per tracepoint.
+
+
+4) Architecture interface
+
+
+There are a few functions and macros which arches must implement in order to
+take advantage of this optimization. As previously mentioned, if there is no
+architecture support we simply fall back to a traditional load, test, and
+jump sequence.
+
+* add "HAVE_ARCH_JUMP_LABEL" to arch/<arch>/Kconfig to indicate support
+
+* #define JUMP_LABEL_NOP_SIZE, arch/x86/include/asm/jump_label.h
+
+* #define "JUMP_LABEL(tag, label, cond)", arch/x86/include/asm/jump_label.h
+
+* add: void arch_jump_label_transform(struct jump_entry *entry, enum jump_label_type type)
+ and
+ const u8 *arch_get_jump_label_nop(void)
+
+ see: arch/x86/kernel/jump_label.c
+
+* finally add a definition for "struct jump_entry". This must be done in a
+  separate .h file, because the modpost.c code uses this definition to sort
+  the jump label table in vmlinux, so that it does not have to be sorted at
+  runtime. see: arch/x86/include/asm/jump_entry.h
+
+
+5) Acknowledgments
+
+
+Thanks to Roland McGrath and Richard Henderson for coming up with the initial
+'asm goto' and jump label design.
+
+Thanks to Mathieu Desnoyers for calling attention to this issue, outlining the
+requirements of a solution, and implementing a solution in the form of the
+"Immediate Values" work.
--
1.7.0.1