Message-ID: <20100621213820.GJ2354@linux.vnet.ibm.com>
Date: Mon, 21 Jun 2010 14:38:20 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Oleg Nesterov <oleg@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Don Zickus <dzickus@...hat.com>,
Frederic Weisbecker <fweisbec@...il.com>,
Ingo Molnar <mingo@...e.hu>,
Jerome Marchand <jmarchan@...hat.com>,
Mandeep Singh Baines <msb@...gle.com>,
Roland McGrath <roland@...hat.com>,
linux-kernel@...r.kernel.org, stable@...nel.org
Subject: Re: while_each_thread() under rcu_read_lock() is broken?
On Mon, Jun 21, 2010 at 02:22:59PM -0700, Eric W. Biederman wrote:
> "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> writes:
>
> > On Mon, Jun 21, 2010 at 07:09:19PM +0200, Oleg Nesterov wrote:
> >> On 06/18, Paul E. McKenney wrote:
> >> >
> >> > On Fri, Jun 18, 2010 at 09:34:03PM +0200, Oleg Nesterov wrote:
> >> > >
> >> > > #define XXX(t) ({					\
> >> > > 	struct task_struct *__prev = t;		\
> >> > > 	t = next_thread(t);			\
> >> > > 	t != g && t != __prev;			\
> >> > > })
> >> > >
> >> > > #define while_each_thread(g, t) \
> >> > > 	while (XXX(t))
> >> >
> >> > Isn't the above vulnerable to a pthread_create() immediately following
> >> > the offending exec()? Especially if the task doing the traversal is
> >> > preempted?
> >>
> >> Yes, thanks!
> >>
> >> > here are some techniques that might (or might not) help:
> >>
> >> To simplify, let's consider the concrete example,
> >
> > Sounds very good!
> >
> >> rcu_read_lock();
> >>
> >> g = t = returns_the_rcu_safe_task_struct_ptr();
> >
> > This returns a pointer to the task struct of the current thread?
> > Or might it return a pointer to some other thread's task struct?
> >
> >> do {
> >> printk("%d\n", t->pid);
> >> } while_each_thread(g, t);
> >>
> >> rcu_read_unlock();
> >>
> >> Whatever we do, without tasklist/siglock this can obviously race
> >> with fork/exit/exec. It is OK to miss a thread, or to print the pid
> >> of an already exited/released task.
> >>
> >> But it should not loop forever (the subject), and it should not
> >> print the same pid twice (ignoring pid reuse, of course).
> >>
> >> And, afaics, there are no problems with the rcu magic per se: next_thread()
> >> always returns a task_struct we can safely dereference. The only
> >> problem is that while_each_thread() assumes that sooner or later
> >> next_thread() must reach the starting point, g.
> >>
> >> (zap_threads() is different, it must not miss a thread with ->mm
> >> we are going to dump, but it holds mmap_sem).
> >
> > Indeed, the tough part is figuring out when you are done given that things
> > can come and go at will. Some additional tricks, in no particular order:
> >
> > 1. Always start at the group leader. Of course, the group leader
> > is probably permitted to leave any time it wants to, so this
> > is not sufficient in and of itself.
>
> No. The group leader must exist as long as the group exists.
> Modulo de_thread weirdness. The group_leader can be a zombie but
> it can not go away completely.
Ah, OK -- now that you mention it, all the things I can think of
that remove a thread from a group have the side effect of destroying
the old group (exec() and exit()).  Other things that create a new
thread group leave the old thread group intact.

Or am I forgetting some odd operation?
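
If so, then perhaps the traversal's termination point should live in
something with group lifetime rather than in any one task.  To make
that concrete, a rough sketch -- not actual kernel code, and the names
(struct group_list, ->thread_node, for_each_thread_sketch) are made up
for illustration.  It assumes tasks are linked in under the group's
lock and unlinked with list_del_rcu(), so an RCU reader always finds
the list head:

/*
 * Sketch only: anchor the thread list in an object that lives as
 * long as the group itself, so the loop terminates at the list head
 * no matter which individual task execs or exits.
 */
struct group_list {
	struct list_head thread_head;	/* group-lifetime anchor */
};

/*
 * Hypothetical traversal; assumes task_struct gains a ->thread_node
 * list_head, linked under the group's lock, removed via list_del_rcu().
 */
#define for_each_thread_sketch(grp, t) \
	list_for_each_entry_rcu(t, &(grp)->thread_head, thread_node)

With something like that, the "have we wrapped back around to g?"
question disappears entirely.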
> > 2. Maintain a separate task structure that flags the head of the
> > list. This separate structure is freed one RCU grace period
> > following the disappearance of the current group leader. This
> > should be quite robust, but "holy overhead, Batman!!!" (Apologies
> > for the American pop culture reference, but nothing else seemed
> > appropriate.)
>
> That is roughly what we have in the group leader right now.
But can't the group leader do an exec(), becoming the leader of a new
thread group without waiting for a grace period?  Or is this possibility
already covered?
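
In case it helps, here is a very rough sketch of what I had in mind
for #2 above -- again, not actual kernel code, and the names
(struct thread_group_anchor, tg_anchor_release) are invented for
illustration.  The only point is that call_rcu() defers the free
until one grace period after the leader's departure, so a traverser
inside rcu_read_lock() cannot see its termination point vanish:

struct thread_group_anchor {
	struct list_head threads;	/* all tasks in the group */
	struct rcu_head rcu;
};

static void tg_anchor_free(struct rcu_head *head)
{
	kfree(container_of(head, struct thread_group_anchor, rcu));
}

/* Hypothetically called wherever the old leader is detached. */
static void tg_anchor_release(struct thread_group_anchor *anchor)
{
	call_rcu(&anchor->rcu, tg_anchor_free);
}

The overhead is one extra allocation per thread group, which is where
my "holy overhead, Batman!!!" came from.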
Thanx, Paul