linux-kernel - Re: [PATCHv2] tracing - avoid soft lockup in trace

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20110321121008.GA1979@jolsa.brq.redhat.com>
Date:	Mon, 21 Mar 2011 13:10:08 +0100
From:	Jiri Olsa <jolsa@...hat.com>
To:	Steven Rostedt <rostedt@...dmis.org>
Cc:	Oleg Nesterov <oleg@...hat.com>, fweisbec@...il.com,
	mingo@...hat.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCHv2] tracing - avoid soft lockup in trace_pipe

On Mon, Mar 14, 2011 at 07:06:21PM +0100, Jiri Olsa wrote:
> On Mon, Mar 14, 2011 at 11:20:57AM -0400, Steven Rostedt wrote:
> > On Mon, 2011-03-14 at 16:07 +0100, Jiri Olsa wrote:
> > > So we get back the the while loop with no data added and with
> > > TRACE_TYPE_HANDLED return code, which will keep us in the loop.
> > > 
> > > Not sure this is bug or feature :)
> > > 
> > > 
> > Most bugs are features ;)
> > 
> > honestly, I'd love to rip out all the "bin", "hex" etc formats as they
> > are pretty much deprecated.
> > 
> > But as you pointed out that this can keep us in the inner loop, we need
> > to fix that. I prefer to fix it within the nop print function. Perhaps
> > we need to fill in something. "type: %d", iter->ent->type ?
> 
> sounds ok :) attaching v2 patch
hi, any feedback?

thanks,
jirka

> 
> thanks,
> jirka
> 
> 
> ---
> running following commands:
> 
>   # enable the binary option
>   echo 1 > ./options/bin
>   # disable context info option
>   echo 0 > ./options/context-info 
>   # tracing only events
>   echo 1 > ./events/enable
>   cat trace_pipe
> 
> is causing lockup (in NON preemptive kernels) inside
> tracing_read_pipe function.
> 
> The reason are:
>  - bin/hex/raw output functions for events are set to
>    trace_nop_print function, which prints nothing and
>    returns TRACE_TYPE_HANDLED value
>  - LOST EVENT trace do not handle trace_seq overflow
> 
> These reasons force the while loop in tracing_read_pipe
> function never to break.
> 
> The attached patch fixies handling of lost event trace, and
> changes trace_nop_print to print minimal info, which is needed
> for the correct tracing_read_pipe processing.
> 
> v2 changes:
>  - omit the cond_resched changes by trace_nop_print changes
>  - WARN changed to WARN_ONCE and added info to be able
>    to find out the culprit
> 
> wbr,
> jirka
> 
> 
> Signed-off-by: Jiri Olsa <jolsa@...hat.com>
> ---
>  kernel/trace/trace.c        |   15 ++++++++++++---
>  kernel/trace/trace_output.c |    3 +++
>  2 files changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 9541c27..5af42f4 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -2013,9 +2013,10 @@ enum print_line_t print_trace_line(struct trace_iterator *iter)
>  {
>  	enum print_line_t ret;
>  
> -	if (iter->lost_events)
> -		trace_seq_printf(&iter->seq, "CPU:%d [LOST %lu EVENTS]\n",
> -				 iter->cpu, iter->lost_events);
> +	if (iter->lost_events &&
> +	    !trace_seq_printf(&iter->seq, "CPU:%d [LOST %lu EVENTS]\n",
> +				 iter->cpu, iter->lost_events))
> +		return TRACE_TYPE_PARTIAL_LINE;
>  
>  	if (iter->trace && iter->trace->print_line) {
>  		ret = iter->trace->print_line(iter);
> @@ -3229,6 +3230,14 @@ waitagain:
>  
>  		if (iter->seq.len >= cnt)
>  			break;
> +
> +		/*
> +		 * Setting the full flag means we reached the trace_seq buffer
> +		 * size and we should leave by partial output condition above.
> +		 * One of the trace_seq_* functions is not used properly.
> +		 */
> +		WARN_ONCE(iter->seq.full, "full flag set for trace type %d",
> +			  iter->ent->type);
>  	}
>  	trace_access_unlock(iter->cpu_file);
>  	trace_event_read_unlock();
> diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
> index 456be90..cf535cc 100644
> --- a/kernel/trace/trace_output.c
> +++ b/kernel/trace/trace_output.c
> @@ -830,6 +830,9 @@ EXPORT_SYMBOL_GPL(unregister_ftrace_event);
>  enum print_line_t trace_nop_print(struct trace_iterator *iter, int flags,
>  				  struct trace_event *event)
>  {
> +	if (!trace_seq_printf(&iter->seq, "type: %d\n", iter->ent->type))
> +		return TRACE_TYPE_PARTIAL_LINE;
> +
>  	return TRACE_TYPE_HANDLED;
>  }
>  
> -- 
> 1.7.1
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/