[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20110127221856.GA10539@redhat.com>
Date:	Thu, 27 Jan 2011 23:18:56 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc:	Frederic Weisbecker <fweisbec@...il.com>,
	Ingo Molnar <mingo@...e.hu>,
	Alan Stern <stern@...land.harvard.edu>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	Paul Mackerras <paulus@...ba.org>,
	Prasad <prasad@...ux.vnet.ibm.com>,
	Roland McGrath <roland@...hat.com>,
	linux-kernel@...r.kernel.org
Subject: Re: Q: perf_install_in_context/perf_event_enable are racy?
On 01/27, Peter Zijlstra wrote:
>
> On Thu, 2011-01-27 at 17:57 +0100, Oleg Nesterov wrote:
> >
> > > With, however, things are more interesting. 2 seems to be adequately
> > > covered by the patch I just send, the IPI will bail and the next
> > > sched-in of the relevant task will pick matters up. 1 otoh doesn't seem
> > > covered, the IPI will bail, leaving us stranded.
> >
> > Hmm, yes... Strangely, I missed that when I was thinking about in_ctxsw.
> >
> > Perhaps, we can change task_oncpu_function_call() so that it returns
> > -EAGAIN in case it hits in_ctxsw != 0? If the caller sees -EAGAIN, it
> > should always retry even if !ctx->is_active.
>
> That would be very easy to do, we can pass a return value through the
> task_function_call structure.
Yes.
Perhaps task_oncpu_function_call() should retry itself to simplify the
callers. I wonder if we should also retry if rq->curr != p...
Oh. You know, I am starting to think I will never understand this.
Forget about the problems with __ARCH_WANT_INTERRUPT_ON_CTXSW.
perf_install_in_context() does task_oncpu_function_call() and then
	// ctx->is_active == 0
	/*
	 * The lock prevents that this context is scheduled in so we
	 * can add the event safely, if it the call above did not
	 * succeed.
	 */
	if (list_empty(&event->group_entry))
		add_event_to_ctx(event, ctx);
This assumes that the task is not running.
Why? Because (I guess) we assume that either task_oncpu_function_call()
should see task_curr() == T, or if the task becomes running after that
it should see the new ->perf_event_ctxp[ctxn] != NULL. And I do not see
how we can prove this.
If find_get_context() sets the new context, the only guarantee we have
is: perf_event_exit_task() can't miss this context. The task, however,
can be scheduled in and miss the new value in perf_event_ctxp[].
And, task_oncpu_function_call() can equally miss rq->curr == task.
But. I think this all falls into the absolutely theoretical category,
and in the worst case nothing really bad can happen, just event_sched_in()
will be delayed until this task reshedules.
So, I think your patch should fix all problems with schedule. Just it
needs the couple of changes we already discussed:
	- finish_task_switch() should clear rq->in_ctxsw before
	  local_irq_enable()
	- task_oncpu_function_call() (or its callers) should always
	  retry the "if (task_curr(p))" code if ->in_ctxsw is true.
If you think we have other problems here please don't tell me,
I already got lost ;)
Oleg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Powered by blists - more mailing lists
 
