Message-ID: <1533285974.2179.6.camel@linutronix.de>
Date: Fri, 03 Aug 2018 10:46:14 +0200
From: Erica Bugden <erica.bugden@...utronix.de>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: linux-kernel@...r.kernel.org, peterz@...radead.org,
tglx@...utronix.de, anna-maria@...utronix.de, bigeasy@...utronix.de
Subject: Re: [PATCH] ftrace: Add missing check for existing hwlat thread
On Wed, 2018-08-01 at 15:40 -0400, Steven Rostedt wrote:
> On Wed, 1 Aug 2018 12:45:54 +0200
> Erica Bugden <erica.bugden@...utronix.de> wrote:
>
> > The hwlat tracer uses a kernel thread to measure latencies. The function
> > that creates this kernel thread, start_kthread(), can be called when the
> > tracer is initialized and when the tracer is explicitly enabled.
> > start_kthread() does not check if there is an existing hwlat kernel
> > thread and will create a new one each time it is called.
> >
> > This causes the reference to the previous thread to be lost. Without the
> > thread reference, the old kernel thread becomes unstoppable and
> > continues to use CPU time even after the hwlat tracer has been disabled.
> > This problem can be observed when a system is booted with tracing
> > enabled and the hwlat tracer is configured like this:
> >
> > echo hwlat > current_tracer; echo 1 > tracing_on
> >
> > Add the missing check for an existing kernel thread in start_kthread()
> > to prevent this problem. This function and the rest of the hwlat kernel
> > thread setup and teardown are already serialized because they are called
> > through the tracer core code with trace_type_lock held.
> >
> > Signed-off-by: Erica Bugden <erica.bugden@...utronix.de>
> > ---
> > kernel/trace/trace_hwlat.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/kernel/trace/trace_hwlat.c b/kernel/trace/trace_hwlat.c
> > index d7c8e4e..2d9d36d 100644
> > --- a/kernel/trace/trace_hwlat.c
> > +++ b/kernel/trace/trace_hwlat.c
> > @@ -354,6 +354,9 @@ static int start_kthread(struct trace_array *tr)
> >  	struct task_struct *kthread;
> >  	int next_cpu;
> > 
> > +	if (hwlat_kthread)
> > +		return 0;
> > +
>
> This looks like it is treating the symptom and not the disease.
>
> >  	/* Just pick the first CPU on first iteration */
> >  	current_mask = &save_cpumask;
> >  	get_online_cpus();
>
> Can you try this patch?
I tested the patch below and it also fixes the problem.
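
As far as I can tell, the second thread comes from rb_simple_write()
calling tr->current_trace->start(tr) (i.e. hwlat_tracer_start() ->
start_kthread()) even though the init path already started a worker
when hwlat was selected with tracing enabled, so skipping the redundant
->start() call also avoids the duplicate. A rough way to see the stale
worker without either patch (assuming the worker thread is still named
"hwlatd"):

  cd /sys/kernel/debug/tracing
  echo 1 > tracing_on          # tracing already on, as when booting with it enabled
  echo hwlat > current_tracer  # init path starts hwlat worker #1
  echo 1 > tracing_on          # ->start() runs again: worker #2, reference to #1 lost
  echo 0 > tracing_on          # only worker #2 is stopped
  echo nop > current_tracer
  ps -eo pid,comm | grep hwlatd   # worker #1 is still there, using CPU time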
>
> -- Steve
>
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 823687997b01..15862044db05 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -7628,7 +7628,9 @@ rb_simple_write(struct file *filp, const char __user *ubuf,
>
> if (buffer) {
> mutex_lock(&trace_types_lock);
> - if (val) {
> + if (!!val == tracer_tracing_is_on(tr)) {
> + val = 0; /* do nothing */
> + } else if (val) {
> tracer_tracing_on(tr);
> if (tr->current_trace->start)
> tr->current_trace->start(tr);