linux-kernel - Re: [PATCH 1/2] ftrace: record command lines at more appropriate moment

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4C4F9C3A.6020406@cn.fujitsu.com>
Date:	Wed, 28 Jul 2010 10:55:54 +0800
From:	Li Zefan <lizf@...fujitsu.com>
To:	Ian Munsie <imunsie@....ibm.com>
CC:	linux-kernel@...r.kernel.org, Steven Rostedt <rostedt@...dmis.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Ingo Molnar <mingo@...hat.com>,
	Masami Hiramatsu <mhiramat@...hat.com>,
	Jiri Olsa <jolsa@...hat.com>, Tejun Heo <tj@...nel.org>
Subject: Re: [PATCH 1/2] ftrace: record command lines at more appropriate
 moment

Ian Munsie wrote:
> From: Ian Munsie <imunsie@....ibm.com>
> 
> Previously, when tracing was activated through debugfs, regardless of
> which tracing plugin (if any) were activated, the probe_sched_switch and
> probe_sched_wakeup probes from the sched_switch plugin would be
> activated. This appears to have been a hack to use them to record the
> command lines of active processes as they were scheduled.
> 
> That approach would suffer if many processes were being scheduled that
> were not generating events as they would consume entries in the
> saved_cmdlines buffer that could otherwise have been used by other
> processes that were actually generating events.
> 
> It also had the problem that events could be mis-attributed - in the
> common situation of a process forking then execing a new process, the
> change of the process command would not be noticed for some time after
> the exec until the process was next scheduled.
> 
> If the trace was read after the fact this would generally go unnoticed
> because at some point the process would be scheduled and the entry in
> the saved_cmdlines buffer would be updated so that the new command would
> be reported when the trace was eventually read. However, if the events
> were being read live (e.g. through trace_pipe), the events just after
> the exec and before the process was next scheduled would show the
> incorrect command (though the PID would be correct).
> 
> This patch removes the sched_switch hack altogether and instead records
> the commands at a more appropriate moment - at the same time the PID of
> the process is recorded (i.e. when an entry on the ring buffer is
> reserved). This means that the recorded command line is much more likely
> to be correct when the trace is read, either live or after the fact, so
> long as the command line still resides in the saved_cmdlines buffer.
> 
> It is still not guaranteed to be correct in all situations. For instance
> if the trace is read after the fact rather than live (consider events
> generated by a process before an exec - in the below example they would
> be attributed to sleep rather than stealpid since the entry in
> saved_cmdlines would have changed before the event was read), but this
> is no different to the current situation and the alternative would be to
> store the command line with each and every event.
> 
...
> 
> Signed-off-by: Ian Munsie <imunsie@....ibm.com>
> ---
>  kernel/trace/trace.c                 |    3 +--
>  kernel/trace/trace_events.c          |   11 -----------
>  kernel/trace/trace_functions.c       |    2 --
>  kernel/trace/trace_functions_graph.c |    2 --
>  kernel/trace/trace_sched_switch.c    |   10 ----------
>  5 files changed, 1 insertions(+), 27 deletions(-)
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 4b1122d..f8458c3 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -1023,8 +1023,6 @@ void tracing_stop(void)
>  	spin_unlock_irqrestore(&tracing_start_lock, flags);
>  }
>  
> -void trace_stop_cmdline_recording(void);
> -
>  static void trace_save_cmdline(struct task_struct *tsk)
>  {
>  	unsigned pid, idx;
> @@ -1112,6 +1110,7 @@ tracing_generic_entry_update(struct trace_entry *entry, unsigned long flags,
>  {
>  	struct task_struct *tsk = current;
>  
> +	tracing_record_cmdline(tsk);

Now this function is called everytime a tracepoint is triggered, so
did you run some benchmarks to see if the performance is improved
or even worse?

Another problem in this patch is, tracing_generic_entry_update() is also
called by perf, but cmdline recoding is not needed in perf.

>  	entry->preempt_count		= pc & 0xff;
>  	entry->pid			= (tsk) ? tsk->pid : 0;
>  	entry->lock_depth		= (tsk) ? tsk->lock_depth : 0;
...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/