Message-ID: <1296134077.15234.161.camel@laptop>
Date: Thu, 27 Jan 2011 14:14:37 +0100
From: Peter Zijlstra <a.p.zijlstra@...llo.nl>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Frederic Weisbecker <fweisbec@...il.com>,
Ingo Molnar <mingo@...e.hu>,
Alan Stern <stern@...land.harvard.edu>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
Paul Mackerras <paulus@...ba.org>,
Prasad <prasad@...ux.vnet.ibm.com>,
Roland McGrath <roland@...hat.com>,
linux-kernel@...r.kernel.org
Subject: Re: Q: perf_install_in_context/perf_event_enable are racy?
On Wed, 2011-01-26 at 18:53 +0100, Oleg Nesterov wrote:
> void task_force_schedule(struct task_struct *p)
> {
> 	struct rq *rq;
>
> 	preempt_disable();
> 	rq = task_rq(p);
> 	if (rq->curr == p)
> 		wake_up_process(rq->stop);
> 	preempt_enable();
> }
>
> static void
> perf_install_in_context(struct perf_event_context *ctx,
> 			struct perf_event *event,
> 			int cpu)
> {
> 	struct task_struct *task = ctx->task;
>
> 	event->ctx = ctx;
>
> 	if (!task) {
> 		/*
> 		 * Per cpu events are installed via an smp call and
> 		 * the install is always successful.
> 		 */
> 		smp_call_function_single(cpu, __perf_install_in_context,
> 					 event, 1);
> 		return;
> 	}
>
> 	for (;;) {
> 		raw_spin_lock_irq(&ctx->lock);
> 		/*
> 		 * The lock prevents that this context is
> 		 * scheduled in, we can add the event safely.
> 		 */
> 		if (!ctx->is_active)
> 			add_event_to_ctx(event, ctx);
> 		raw_spin_unlock_irq(&ctx->lock);
>
> 		if (event->attach_state & PERF_ATTACH_CONTEXT) {
> 			task_force_schedule(task);
> 			break;
> 		}
>
> 		task_oncpu_function_call(task, __perf_install_in_context,
> 					 event);
> 		if (event->attach_state & PERF_ATTACH_CONTEXT)
> 			break;
> 	}
> }
Right, so the fact that this introduces extra scheduling makes me
uncomfortable... the whole purpose is to observe without perturbing (as
much as possible).
So the whole crux of the matter is adding a ctx to a running process. If
the ctx exists, ->is_active will be tracked properly and much of the
problem goes away.
	rcu_assign_pointer(task->perf_event_ctx[n], new_ctx);
	task_oncpu_function_call(task, __perf_install_in_context, event);

That should, I think, suffice to get the ctx in sync with the task
state; we've got the following cases:
1) task is in the middle of scheduling in
2) task is in the middle of scheduling out
3) task is running
Without __ARCH_WANT_INTERRUPT_ON_CTXSW everything is boring and works:
in case 1 the IPI will be delayed until 3; in case 2 the IPI finds
another task and the next schedule-in will sort things.
With it, however, things are more interesting. Case 2 seems to be
adequately covered by the patch I just sent: the IPI will bail and the
next sched-in of the relevant task will pick matters up. Case 1, OTOH,
doesn't seem covered: the IPI will bail, leaving us stranded.
To fix this it seems we need to make task_oncpu_function_call() wait
until the context switch is done, something like:

	while (cpu_rq(cpu)->in_ctxsw)
		cpu_relax();

before sending the IPI. However, that would require adding a few memory
barriers, I think...
/me goes search for implied barriers around there.