lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1265188475-23509-11-git-send-regression-fweisbec@gmail.com>
Date:	Wed,  3 Feb 2010 10:14:34 +0100
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Paul Mackerras <paulus@...ba.org>,
	Hitoshi Mitake <mitake@....info.waseda.ac.jp>,
	Li Zefan <lizf@...fujitsu.com>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	Masami Hiramatsu <mhiramat@...hat.com>,
	Jens Axboe <jens.axboe@...cle.com>
Subject: [PATCH 10/11] tracing/perf: Fix lock events recursions in the fast path

There are rcu locked read side areas in the path where we submit
a trace events. And these rcu_read_(un)lock() trigger lock events,
which create recursive events.

One pair in do_perf_sw_event:

__lock_acquire
      |
      |--96.11%-- lock_acquire
      |          |
      |          |--27.21%-- do_perf_sw_event
      |          |          perf_tp_event
      |          |          |
      |          |          |--49.62%-- ftrace_profile_lock_release
      |          |          |          lock_release
      |          |          |          |
      |          |          |          |--33.85%-- _raw_spin_unlock

Another pair in perf_output_begin/end:

__lock_acquire
      |--23.40%-- perf_output_begin
      |          |          __perf_event_overflow
      |          |          perf_swevent_overflow
      |          |          perf_swevent_add
      |          |          perf_swevent_ctx_event
      |          |          do_perf_sw_event
      |          |          perf_tp_event
      |          |          |
      |          |          |--55.37%-- ftrace_profile_lock_acquire
      |          |          |          lock_acquire
      |          |          |          |
      |          |          |          |--37.31%-- _raw_spin_lock

The problem is not that much the trace recursion itself, as we have a
recursion protection already (though it's always wasteful to recurse).
But the trace events are outside the lockdep recursion protection, then
each lockdep event triggers a lock trace, which will trigger two
other lockdep events. Here the recursive lock trace event won't
be taken because of the trace recursion, so the recursion stops there
but lockdep will still analyse these new events:

To sum up, for each lockdep events we have:

	lock_*()
	     |
             trace lock_acquire
                  |
                  ----- rcu_read_lock()
                  |          |
                  |          lock_acquire()
                  |          |
                  |          trace_lock_acquire() (stopped)
                  |          |
		  |          lockdep analyze
                  |
                  ----- rcu_read_unlock()
                             |
                             lock_release
                             |
                             trace_lock_release() (stopped)
                             |
                             lockdep analyze

And you can repeat the above two times as we have two rcu read side
sections when we submit an event.

This is fixed in this pacth by using the non-lockdep versions of
rcu_read_(un)lock.

Signed-off-by: Frederic Weisbecker <fweisbec@...il.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Arnaldo Carvalho de Melo <acme@...hat.com>
Cc: Steven Rostedt <rostedt@...dmis.org>
Cc: Paul Mackerras <paulus@...ba.org>
Cc: Hitoshi Mitake <mitake@....info.waseda.ac.jp>
Cc: Li Zefan <lizf@...fujitsu.com>
Cc: Lai Jiangshan <laijs@...fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@...hat.com>
Cc: Jens Axboe <jens.axboe@...cle.com>
---
 kernel/perf_event.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 280ae44..98fd360 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -2986,7 +2986,7 @@ int perf_output_begin(struct perf_output_handle *handle,
 		u64			 lost;
 	} lost_event;
 
-	rcu_read_lock();
+	__rcu_read_lock();
 	/*
 	 * For inherited events we send all the output towards the parent.
 	 */
@@ -3051,7 +3051,7 @@ fail:
 	atomic_inc(&data->lost);
 	perf_output_unlock(handle);
 out:
-	rcu_read_unlock();
+	__rcu_read_unlock();
 
 	return -ENOSPC;
 }
@@ -3072,7 +3072,7 @@ void perf_output_end(struct perf_output_handle *handle)
 	}
 
 	perf_output_unlock(handle);
-	rcu_read_unlock();
+	__rcu_read_unlock();
 }
 
 static u32 perf_event_pid(struct perf_event *event, struct task_struct *p)
@@ -4116,7 +4116,7 @@ static void do_perf_sw_event(enum perf_type_id type, u32 event_id,
 	struct perf_event_context *ctx;
 
 	cpuctx = &__get_cpu_var(perf_cpu_context);
-	rcu_read_lock();
+	__rcu_read_lock();
 	perf_swevent_ctx_event(&cpuctx->ctx, type, event_id,
 				 nr, nmi, data, regs);
 	/*
@@ -4126,7 +4126,7 @@ static void do_perf_sw_event(enum perf_type_id type, u32 event_id,
 	ctx = rcu_dereference(current->perf_event_ctxp);
 	if (ctx)
 		perf_swevent_ctx_event(ctx, type, event_id, nr, nmi, data, regs);
-	rcu_read_unlock();
+	__rcu_read_unlock();
 }
 
 void __perf_sw_event(u32 event_id, u64 nr, int nmi,
-- 
1.6.2.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ