lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 23 Oct 2008 19:23:14 -0400 (EDT)
From:	Steven Rostedt <rostedt@...dmis.org>
To:	LKML <linux-kernel@...r.kernel.org>
cc:	Ingo Molnar <mingo@...e.hu>, Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Randy Dunlap <randy.dunlap@...cle.com>
Subject: [PATCH] ftrace: update txt document


A lot of changes have gone into ftrace. This patch updates
the ftrace.txt document.

Signed-off-by: Steven Rostedt <srostedt@...hat.com>
---
 Documentation/ftrace.txt |  145 +++++++++++++++++++----------------------------
 1 file changed, 59 insertions(+), 86 deletions(-)

Index: linux-tip.git/Documentation/ftrace.txt
===================================================================
--- linux-tip.git.orig/Documentation/ftrace.txt	2008-08-14 13:54:58.000000000 -0400
+++ linux-tip.git/Documentation/ftrace.txt	2008-10-23 19:20:30.000000000 -0400
@@ -94,23 +94,19 @@ of ftrace. Here is a list of some of the
 		only be recorded if the latency is greater than
 		the value in this file. (in microseconds)
 
-  trace_entries : This sets or displays the number of trace
-		entries each CPU buffer can hold. The tracer buffers
-		are the same size for each CPU. The displayed number
-		is the size of the CPU buffer and not total size. The
+  trace_entries : This sets or displays the number of bytes each CPU
+		buffer can hold. The tracer buffers are the same size
+		for each CPU. The displayed number is the size of the
+		 CPU buffer and not total size of all buffers. The
 		trace buffers are allocated in pages (blocks of memory
 		that the kernel uses for allocation, usually 4 KB in size).
-		Since each entry is smaller than a page, if the last
-		allocated page has room for more entries than were
-		requested, the rest of the page is used to allocate
-		entries.
+		If the last page allocated has room for more bytes
+		than requested, the rest of the page is used to allocate.
+		(Note, the size may not be a multiple of the page size due
+		to buffer managment overhead).
 
 		This can only be updated when the current_tracer
-		is set to "none".
-
-		NOTE: It is planned on changing the allocated buffers
-		      from being the number of possible CPUS to
-		      the number of online CPUS.
+		is set to "nop".
 
   tracing_cpumask : This is a mask that lets the user only trace
 		on specified CPUS. The format is a hex string
@@ -130,13 +126,10 @@ of ftrace. Here is a list of some of the
 		be traced. If a function exists in both set_ftrace_filter
 		and set_ftrace_notrace,	the function will _not_ be traced.
 
-  available_filter_functions : When a function is encountered the first
-		time by the dynamic tracer, it is recorded and
-		later the call is converted into a nop. This file
-		lists the functions that have been recorded
-		by the dynamic tracer and these functions can
-		be used to set the ftrace filter by the above
-		"set_ftrace_filter" file. (See the section "dynamic ftrace"
+  available_filter_functions : This lists the functions that ftrace
+		has processed and can trace. These are the function
+		names that you can pass to "set_ftrace_filter" or
+		"set_ftrace_notrace". (See the section "dynamic ftrace"
 		below for more details).
 
 
@@ -145,7 +138,7 @@ The Tracers
 
 Here is the list of current tracers that may be configured.
 
-  ftrace - function tracer that uses mcount to trace all functions.
+  function - function tracer that uses mcount to trace all functions.
 
   sched_switch - traces the context switches between tasks.
 
@@ -166,8 +159,8 @@ Here is the list of current tracers that
 		the highest priority task to get scheduled after
 		it has been woken up.
 
-  none - This is not a tracer. To remove all tracers from tracing
-		simply echo "none" into current_tracer.
+  nop - This is not a tracer. To remove all tracers from tracing
+		simply echo "nop" into current_tracer.
 
 
 Examples of using the tracer
@@ -182,7 +175,7 @@ Output format:
 Here is an example of the output format of the file "trace"
 
                              --------
-# tracer: ftrace
+# tracer: function
 #
 #           TASK-PID   CPU#    TIMESTAMP  FUNCTION
 #              | |      |          |         |
@@ -192,7 +185,7 @@ Here is an example of the output format 
                              --------
 
 A header is printed with the tracer name that is represented by the trace.
-In this case the tracer is "ftrace". Then a header showing the format. Task
+In this case the tracer is "function". Then a header showing the format. Task
 name "bash", the task PID "4251", the CPU that it was running on
 "01", the timestamp in <secs>.<usecs> format, the function name that was
 traced "path_put" and the parent function that called this function
@@ -1000,22 +993,20 @@ is the stack for the hard interrupt. Thi
 has been set. We do not see the 'N' until we switch back to the task's
 assigned stack.
 
-ftrace
-------
+function
+--------
 
-ftrace is not only the name of the tracing infrastructure, but it
-is also a name of one of the tracers. The tracer is the function
-tracer. Enabling the function tracer can be done from the
-debug file system. Make sure the ftrace_enabled is set otherwise
-this tracer is a nop.
+This tracer is the function tracer. Enabling the function tracer
+can be done from the debug file system. Make sure the ftrace_enabled is
+set otherwise this tracer is a nop.
 
  # sysctl kernel.ftrace_enabled=1
- # echo ftrace > /debug/tracing/current_tracer
+ # echo function > /debug/tracing/current_tracer
  # echo 1 > /debug/tracing/tracing_enabled
  # usleep 1
  # echo 0 > /debug/tracing/tracing_enabled
  # cat /debug/tracing/trace
-# tracer: ftrace
+# tracer: function
 #
 #           TASK-PID   CPU#    TIMESTAMP  FUNCTION
 #              | |      |          |         |
@@ -1037,10 +1028,10 @@ this tracer is a nop.
 [...]
 
 
-Note: ftrace uses ring buffers to store the above entries. The newest data
-may overwrite the oldest data. Sometimes using echo to stop the trace
-is not sufficient because the tracing could have overwritten the data
-that you wanted to record. For this reason, it is sometimes better to
+Note: function tracer uses ring buffers to store the above entries.
+The newest data may overwrite the oldest data. Sometimes using echo to
+stop the trace is not sufficient because the tracing could have overwritten
+the data that you wanted to record. For this reason, it is sometimes better to
 disable tracing directly from a program. This allows you to stop the
 tracing at the point that you hit the part that you are interested in.
 To disable the tracing directly from a C program, something like following
@@ -1074,18 +1065,30 @@ every kernel function, produced by the -
 of pointing to a simple return. (Enabling FTRACE will include the
 -pg switch in the compiling of the kernel.)
 
-When dynamic ftrace is initialized, it calls kstop_machine to make
-the machine act like a uniprocessor so that it can freely modify code
-without worrying about other processors executing that same code.  At
-initialization, the mcount calls are changed to call a "record_ip"
-function.  After this, the first time a kernel function is called,
-it has the calling address saved in a hash table.
-
-Later on the ftraced kernel thread is awoken and will again call
-kstop_machine if new functions have been recorded. The ftraced thread
-will change all calls to mcount to "nop".  Just calling mcount
-and having mcount return has shown a 10% overhead. By converting
-it to a nop, there is no measurable overhead to the system.
+At compile time every C file object is run through the
+recordmcount.pl script (located in the scripts directory). This
+script will process the C object using objdump to find all the
+locations in the .text section that call mcount. (Note, only
+the .text section is processed, since processing other sections
+like .init.text may cause races due to those sections being freed).
+
+A new section called "__mcount_loc" is created that holds references
+to all the mcount call sites in the .text section. This section is
+compiled back into the original object. The final linker will add
+all these references into a single table.
+
+On boot up, before SMP is initialized, the dynamic ftrace code
+scans this table and updates all the locations into nops. It also
+records the locations, which are added to the available_filter_functions
+list.  Modules are processed as they are loaded and before they are
+executed.  They also remove their functions from the list when they
+are removed.
+
+When tracing is enabled, kstop_machine is called to prevent races
+with the CPUS executing code being modified (which can cause the
+CPU to do undesireable things), and the nops are patched back
+to calls. But this time, they do not call mcount (which is just
+a function stub). They now call into the ftrace infrastructure.
 
 One special side-effect to the recording of the functions being
 traced is that we can now selectively choose which functions we
@@ -1248,36 +1251,6 @@ Produces:
 
 We can see that there's no more lock or preempt tracing.
 
-ftraced
--------
-
-As mentioned above, when dynamic ftrace is configured in, a kernel
-thread wakes up once a second and checks to see if there are mcount
-calls that need to be converted into nops. If there are not any, then
-it simply goes back to sleep. But if there are some, it will call
-kstop_machine to convert the calls to nops.
-
-There may be a case in which you do not want this added latency.
-Perhaps you are doing some audio recording and this activity might
-cause skips in the playback. There is an interface to disable
-and enable the "ftraced" kernel thread.
-
- # echo 0 > /debug/tracing/ftraced_enabled
-
-This will disable the calling of kstop_machine to update the
-mcount calls to nops. Remember that there is a large overhead
-to calling mcount. Without this kernel thread, that overhead will
-exist.
-
-If there are recorded calls to mcount, any write to the ftraced_enabled
-file will cause the kstop_machine to run. This means that a
-user can manually perform the updates when they want to by simply
-echoing a '0' into the ftraced_enabled file.
-
-The updates are also done at the beginning of enabling a tracer
-that uses ftrace function recording.
-
-
 trace_pipe
 ----------
 
@@ -1286,14 +1259,14 @@ on the tracing is different. Every read 
 This means that subsequent reads will be different. The trace
 is live.
 
- # echo ftrace > /debug/tracing/current_tracer
+ # echo function > /debug/tracing/current_tracer
  # cat /debug/tracing/trace_pipe > /tmp/trace.out &
 [1] 4153
  # echo 1 > /debug/tracing/tracing_enabled
  # usleep 1
  # echo 0 > /debug/tracing/tracing_enabled
  # cat /debug/tracing/trace
-# tracer: ftrace
+# tracer: function
 #
 #           TASK-PID   CPU#    TIMESTAMP  FUNCTION
 #              | |      |          |         |
@@ -1314,7 +1287,7 @@ is live.
 
 Note, reading the trace_pipe file will block until more input is added.
 By changing the tracer, trace_pipe will issue an EOF. We needed
-to set the ftrace tracer _before_ cating the trace_pipe file.
+to set the function tracer _before_ cating the trace_pipe file.
 
 
 trace entries
@@ -1331,10 +1304,10 @@ number of entries.
 65620
 
 Note, to modify this, you must have tracing completely disabled. To do that,
-echo "none" into the current_tracer. If the current_tracer is not set
-to "none", an EINVAL error will be returned.
+echo "nop" into the current_tracer. If the current_tracer is not set
+to "nop", an EINVAL error will be returned.
 
- # echo none > /debug/tracing/current_tracer
+ # echo nop > /debug/tracing/current_tracer
  # echo 100000 > /debug/tracing/trace_entries
  # cat /debug/tracing/trace_entries
 100045

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ