Message-ID: <m1mxuo2i8c.fsf@fess.ebiederm.org>
Date: Mon, 21 Jun 2010 14:22:59 -0700
From: ebiederm@...ssion.com (Eric W. Biederman)
To: paulmck@...ux.vnet.ibm.com
Cc: Oleg Nesterov <oleg@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Don Zickus <dzickus@...hat.com>,
Frederic Weisbecker <fweisbec@...il.com>,
Ingo Molnar <mingo@...e.hu>,
Jerome Marchand <jmarchan@...hat.com>,
Mandeep Singh Baines <msb@...gle.com>,
Roland McGrath <roland@...hat.com>,
linux-kernel@...r.kernel.org, stable@...nel.org
Subject: Re: while_each_thread() under rcu_read_lock() is broken?
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> writes:
> On Mon, Jun 21, 2010 at 07:09:19PM +0200, Oleg Nesterov wrote:
>> On 06/18, Paul E. McKenney wrote:
>> >
>> > On Fri, Jun 18, 2010 at 09:34:03PM +0200, Oleg Nesterov wrote:
>> > >
>> > > #define XXX(t) ({					\
>> > > 	struct task_struct *__prev = t;			\
>> > > 	t = next_thread(t);				\
>> > > 	t != g && t != __prev;				\
>> > > })
>> > >
>> > > #define while_each_thread(g, t) \
>> > > 	while (XXX(t))
>> >
>> > Isn't the above vulnerable to a pthread_create() immediately following
>> > the offending exec()? Especially if the task doing the traversal is
>> > preempted?
>>
>> Yes, thanks!
>>
>> > Here are some techniques that might (or might not) help:
>>
>> To simplify, let's consider the concrete example,
>
> Sounds very good!
>
>> rcu_read_lock();
>>
>> g = t = returns_the_rcu_safe_task_struct_ptr();
>
> This returns a pointer to the task struct of the current thread?
> Or might this return a pointer to some other thread's task struct?
>
>> do {
>> printk("%d\n", t->pid);
>> } while_each_thread(g, t);
>>
>> rcu_read_unlock();
>>
>> Whatever we do, without tasklist/siglock this can obviously race
>> with fork/exit/exec. It is OK to miss a thread, or print the pid
>> of the already exited/released task.
>>
>> But it should not loop forever (the subject), and it should not
>> print the same pid twice (ignoring pid reuse, of course).
>>
>> And, afaics, there are no problems with rcu magic per se, next_thread()
>> always returns the task_struct we can safely dereference. The only
>> problem is that while_each_thread() assumes that sooner or later
>> next_thread() must reach the starting point, g.
>>
>> (zap_threads() is different, it must not miss a thread with ->mm
>> we are going to dump, but it holds mmap_sem).
>
> Indeed, the tough part is figuring out when you are done given that things
> can come and go at will. Some additional tricks, in no particular order:
>
> 1. Always start at the group leader. Of course, the group leader
> is probably permitted to leave any time it wants to, so this
> is not sufficient in and of itself.

No. The group leader must exist as long as the group exists,
modulo de_thread() weirdness. The group_leader can be a zombie, but
it cannot go away completely.

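For illustration, a minimal userspace sketch (not kernel code) of why
the starting point matters. It assumes a simplified model: a singly
linked ring in which removal leaves the removed node's next pointer
intact, as list_del_rcu() does for the forward pointer. The names
struct node, make_ring(), unlink_keep_next() and walk_from() are
hypothetical, and neither memory ordering nor grace periods are modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct: one node on a circular ring. */
struct node {
	int pid;
	struct node *next;
};

/* Link a[0] -> a[1] -> ... -> a[n-1] -> a[0]; a[0] plays the group leader. */
static void make_ring(struct node *a, size_t n)
{
	assert(n > 0);
	for (size_t i = 0; i < n; i++) {
		a[i].pid = (int)(100 + i);
		a[i].next = &a[(i + 1) % n];
	}
}

/*
 * Unlink victim so that its predecessor skips it, but leave
 * victim->next untouched. A reader standing on victim can still
 * step forward into the ring, yet nothing points back at victim.
 */
static void unlink_keep_next(struct node *victim)
{
	struct node *p = victim->next;

	while (p->next != victim)
		p = p->next;
	p->next = victim->next;
}

/*
 * Model of do { ... } while_each_thread(g, t) with a step budget.
 * Returns the number of nodes visited, or -1 if the walk did not
 * come back to g within max_steps (it would otherwise spin forever).
 */
static int walk_from(struct node *g, int max_steps)
{
	struct node *t = g;
	int visited = 0;

	do {
		visited++;
		if (visited > max_steps)
			return -1;
	} while ((t = t->next) != g);

	return visited;
}

/* Start the walk on a node that was unlinked underneath the reader. */
int walk_from_unlinked_start(void)
{
	struct node a[3];

	make_ring(a, 3);
	unlink_keep_next(&a[1]);
	return walk_from(&a[1], 1000);
}

/* Start the walk on the leader, which stays on the ring. */
int walk_from_leader(void)
{
	struct node a[3];

	make_ring(a, 3);
	unlink_keep_next(&a[1]);
	return walk_from(&a[0], 1000);
}
```

In this model the walk that starts on the unlinked node exhausts its
step budget, while the walk that starts on the still-linked leader
visits the two remaining nodes and stops.
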
> 2. Maintain a separate task structure that flags the head of the
> list. This separate structure is freed one RCU grace period
> following the disappearance of the current group leader. This
> should be quite robust, but "holy overhead, Batman!!!" (Apologies
> for the American pop culture reference, but nothing else seemed
> appropriate.)
That is roughly what we have in the group leader right now.
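
A rough userspace sketch of the "flagged head" idea, under the
assumption that the head is never unlinked while the group exists.
Termination is then decided by reaching the head rather than by
returning to the starting node. The is_head field and the helpers
make_ring_with_head(), unlink_keep_next() and walk_to_head() are
hypothetical names for this model only, not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ring node; is_head marks the one node that never leaves. */
struct node {
	int pid;
	bool is_head;
	struct node *next;
};

/* Build a ring in which a[0] is the flagged head and a[1..n-1] are tasks. */
static void make_ring_with_head(struct node *a, size_t n)
{
	assert(n > 0);
	for (size_t i = 0; i < n; i++) {
		a[i].pid = (int)(100 + i);
		a[i].is_head = (i == 0);
		a[i].next = &a[(i + 1) % n];
	}
}

/* Unlink a task node but keep its next pointer, as in an RCU-style removal. */
static void unlink_keep_next(struct node *victim)
{
	struct node *p = victim->next;

	assert(!victim->is_head);	/* model assumption: head stays put */
	while (p->next != victim)
		p = p->next;
	p->next = victim->next;
}

/*
 * Walk forward from t until the flagged head is reached.
 * Returns the number of task nodes seen, or -1 if max_steps ran out.
 */
static int walk_to_head(struct node *t, int max_steps)
{
	int seen = 0;

	while (!t->is_head) {
		seen++;
		if (seen > max_steps)
			return -1;
		t = t->next;
	}
	return seen;
}

/* Full traversal: start just after the head, stop when back at it. */
int walk_all_from_head(void)
{
	struct node a[4];

	make_ring_with_head(a, 4);
	return walk_to_head(a[0].next, 1000);
}

/* The reader stands on a[1] while it is unlinked; the walk still ends. */
int walk_from_unlinked_task(void)
{
	struct node a[4];

	make_ring_with_head(a, 4);
	unlink_keep_next(&a[1]);
	return walk_to_head(&a[1], 1000);
}
```

Here a reader that was standing on a removed node still terminates,
because its stale next pointer leads back into the ring and the head
is always reachable from there.
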
Eric