Message-ID: <20101110154430.GA1454@redhat.com>
Date: Wed, 10 Nov 2010 16:44:30 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: Frederic Weisbecker <fweisbec@...il.com>,
Alan Stern <stern@...land.harvard.edu>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
Ingo Molnar <mingo@...e.hu>, Paul Mackerras <paulus@...ba.org>,
Prasad <prasad@...ux.vnet.ibm.com>,
Roland McGrath <roland@...hat.com>,
linux-kernel@...r.kernel.org
Subject: Re: Q: perf_event && event->owner
On 11/10, Peter Zijlstra wrote:
>
> On Tue, 2010-11-09 at 19:57 +0100, Oleg Nesterov wrote:
> > Either sys_perf_open() should do get_task_struct() like we currently
> > do, or perf_event_exit_task() should clear event->owner and then
> > perf_release() should do something like
> >
> > 	rcu_read_lock();
> > 	owner = event->owner;
> > 	if (owner)
> > 		get_task_struct(owner);
> > 	rcu_read_unlock();
> >
> > 	if (owner) {
> > 		mutex_lock(&event->owner->perf_event_mutex);
> > 		list_del_init(&event->owner_entry);
> > 		mutex_unlock(&event->owner->perf_event_mutex);
> > 		put_task_struct(owner);
> > 	}
> >
> > Probably this can be simplified...
>
> I think that's still racy, suppose we do:
>
> void perf_event_exit_task(struct task_struct *child)
> {
> 	struct perf_event *event, *tmp;
> 	int ctxn;
>
> 	mutex_lock(&child->perf_event_mutex);
> 	list_for_each_entry_safe(event, tmp, &child->perf_event_list,
> 				 owner_entry) {
> 		event->owner = NULL;
> 		list_del_init(&event->owner_entry);
> 	}
> 	mutex_unlock(&child->perf_event_mutex);
>
> 	for_each_task_context_nr(ctxn)
> 		perf_event_exit_task_context(child, ctxn);
> }
>
>
> and the close() races with an exit, then couldn't we observe
> event->owner after the last put_task_struct()?

I think not. Note that we do not just free task_struct via an RCU
callback. Instead, delayed_put_task_struct() drops the (possibly) last
reference.

But the code is racy, yes. The owner != NULL case is fine. But
perf_release() can see event->owner == NULL before list_del() has
completed. perf_event_exit_task() needs a wmb() in between, I think.
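
To make the ordering concern concrete, here is a minimal userspace
sketch in C11 atomics, not kernel code. It assumes the intent is that
the unlink must be visible before owner == NULL is, so it does the
unlink first and then clears the owner with release ordering (roughly
list_del_init(); wmb(); owner = NULL), paired with an acquire load on
the release side. struct task, struct event, the "linked" flag and both
function names are hypothetical stand-ins, and the mutex-protected
owner != NULL path is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for task_struct and perf_event. */
struct task {
	int id;
};

struct event {
	_Atomic(struct task *) owner;	/* cleared by the exit path */
	bool linked;			/* stands in for owner_entry being on the list */
};

/*
 * Exit path: unlink first, then publish owner == NULL with release
 * ordering, so the unlink cannot become visible after the NULL store.
 */
static void exit_task_detach(struct event *ev)
{
	ev->linked = false;
	atomic_store_explicit(&ev->owner, (struct task *)NULL,
			      memory_order_release);
}

/*
 * Release path: returns true if the owner was already cleared, in
 * which case the acquire load guarantees the unlink is visible too.
 * Returns false if an owner is still set; a real caller would then
 * take the owner's mutex and unlink the entry itself.
 */
static bool release_sees_detached(struct event *ev)
{
	struct task *owner = atomic_load_explicit(&ev->owner,
						  memory_order_acquire);

	if (owner)
		return false;

	assert(!ev->linked);	/* owner == NULL implies already unlinked */
	return true;
}
```

With the two stores in the opposite order, the assert in
release_sees_detached() is exactly what could fail when close() races
with exit.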
Oleg.