linux-kernel - Re: [RFC PATCH] perf_counter: dynamically allocate tasks' perf_counter

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1242831788.32543.1706.camel@laptop>
Date:	Wed, 20 May 2009 17:03:08 +0200
From:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
To:	Paul Mackerras <paulus@...ba.org>
Cc:	Ingo Molnar <mingo@...e.hu>, linux-kernel@...r.kernel.org,
	Corey Ashford <cjashfor@...ux.vnet.ibm.com>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [RFC PATCH] perf_counter: dynamically allocate tasks'
 perf_counter_context struct

On Wed, 2009-05-20 at 22:28 +1000, Paul Mackerras wrote:
> This replaces the struct perf_counter_context in the task_struct with
> a pointer to a dynamically allocated perf_counter_context struct.  The
> main reason for doing is this is to allow us to transfer a
> perf_counter_context from one task to another when we do lazy PMU
> switching in a later patch.

Looks good!, few comments below.

> +	ctx = task->perf_counter_ctx;
> +	if (!ctx) {
> +		ctx = kmalloc(sizeof(struct perf_counter_context), GFP_KERNEL);
> +		if (!ctx) {
> +			put_task_struct(task);
> +			return ERR_PTR(-ENOMEM);
> +		}
> +		__perf_counter_init_context(ctx, task);
> +		mutex_lock(&task->perf_counter_mutex);
> +		if (!task->perf_counter_ctx) {
> +			task->perf_counter_ctx = ctx;
> +		} else {
> +			kfree(ctx);
> +			ctx = task->perf_counter_ctx;
> +		}
> +		mutex_unlock(&task->perf_counter_mutex);
> +	}

This seems to be the only site where we use ->perf_counter_mutex,
couldn't we simply write this as:

 if (cmpxchg(&task->perf_counter_ctx, NULL, ctx) != NULL)
   kfree(ctx);

and get rid of the mutex?


> +/*
> + * Remove a counter from the lists for its context.
> + * Must be called with counter->mutex and ctx->mutex held.
> + */
>  static void
>  list_del_counter(struct perf_counter *counter, struct perf_counter_context *ctx)


> @@ -3295,6 +3354,8 @@ __perf_counter_exit_task(struct task_struct *child,
>   *
>   * Note: we may be running in child context, but the PID is not hashed
>   * anymore so new counters will not be added.
> + * (XXX not sure that is true when we get called from flush_old_exec.
> + *  -- paulus)
>   */
>  void perf_counter_exit_task(struct task_struct *child)
>  {

Yes, we have a little hole here, but I think it should fixable.

we have:

 do_exit()
 {
   exit_signals(); /* sets PF_EXITING */

   ... (1)

   perf_counter_exit_task();

   ... (2)

   <unhash somewhere>

vs


 find_get_context()


If, in find_get_context() we exclude PF_EXITING tasks under ctx->mutex,
and ensure we take ctx->mutex in perf_counter_exit_task() we should be
good.

Tasks could race and still attach in (1), but that would be ok, as
they'd get cleared out in perf_counter_exit_task(), but not after that
in (2) as they'd be sure to observe PF_EXITING.

Now it appears perf_counter_exit_task() is a little light on locking, so
the below patch adds both ctx->mutex and counter->mutex, as you say it
should when invoking list_del_counter().

Secondly it adds the PF_EXITING test in find_get_context().

(utterly untested and such..)

Almost-Signed-off-by: Peter Zijlstra <a.p.zijlstra@...llo.nl>
---
Index: linux-2.6/kernel/perf_counter.c
===================================================================
--- linux-2.6.orig/kernel/perf_counter.c
+++ linux-2.6/kernel/perf_counter.c
@@ -1228,6 +1228,13 @@ static struct perf_counter_context *find
 	ctx = &task->perf_counter_ctx;
 	ctx->task = task;
 
+	mutex_lock(&ctx->mutex);
+	if (task->state & PF_EXITING) {
+		mutex_unlock(&ctx->mutex);
+		return ERR_PTR(-ESRCH);
+	}
+	mutex_unlock(&ctx->mutex);
+
 	/* Reuse ptrace permission checks for now. */
 	if (!ptrace_may_access(task, PTRACE_MODE_READ)) {
 		put_context(ctx);
@@ -3292,42 +3299,29 @@ __perf_counter_exit_task(struct task_str
 			 struct perf_counter_context *child_ctx)
 {
 	struct perf_counter *parent_counter;
+	struct perf_cpu_context *cpuctx;
+	unsigned long flags;
 
 	/*
-	 * If we do not self-reap then we have to wait for the
-	 * child task to unschedule (it will happen for sure),
-	 * so that its counter is at its final count. (This
-	 * condition triggers rarely - child tasks usually get
-	 * off their CPU before the parent has a chance to
-	 * get this far into the reaping action)
-	 */
-	if (child != current) {
-		wait_task_inactive(child, 0);
-		update_counter_times(child_counter);
-		list_del_counter(child_counter, child_ctx);
-	} else {
-		struct perf_cpu_context *cpuctx;
-		unsigned long flags;
-
-		/*
-		 * Disable and unlink this counter.
-		 *
-		 * Be careful about zapping the list - IRQ/NMI context
-		 * could still be processing it:
-		 */
-		local_irq_save(flags);
-		perf_disable();
+	 * Disable and unlink this counter.
+	 *
+	 * Be careful about zapping the list - IRQ/NMI context
+	 * could still be processing it:
+	 */
+	local_irq_save(flags);
+	perf_disable();
 
-		cpuctx = &__get_cpu_var(perf_cpu_context);
+	cpuctx = &__get_cpu_var(perf_cpu_context);
 
-		group_sched_out(child_counter, cpuctx, child_ctx);
-		update_counter_times(child_counter);
+	group_sched_out(child_counter, cpuctx, child_ctx);
+	update_counter_times(child_counter);
 
-		list_del_counter(child_counter, child_ctx);
+	perf_enable();
+	local_irq_restore(flags);
 
-		perf_enable();
-		local_irq_restore(flags);
-	}
+	mutex_lock(&child_counter->mutex);
+	list_del_counter(child_counter, child_ctx);
+	mutex_unlock(&child_counter->mutex);
 
 	parent_counter = child_counter->parent;
 	/*
@@ -3359,6 +3353,8 @@ void perf_counter_exit_task(struct task_
 	if (likely(!child_ctx->nr_counters))
 		return;
 
+	mutex_lock(&child_ctx->mutex);
+
 again:
 	list_for_each_entry_safe(child_counter, tmp, &child_ctx->counter_list,
 				 list_entry)
@@ -3371,6 +3367,8 @@ again:
 	 */
 	if (!list_empty(&child_ctx->counter_list))
 		goto again;
+
+	mutex_unlock(&child_ctx->mutex);
 }
 
 /*


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/