[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20100322102247.GA8363@redhat.com>
Date: Mon, 22 Mar 2010 11:22:47 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: Ben Blum <bblum@...rew.cmu.edu>
Cc: Paul Menage <menage@...gle.com>, akpm@...ux-foundation.org,
linux-kernel@...r.kernel.org, ebiederm@...ssion.com,
lizf@...fujitsu.com, matthltc@...ibm.com,
containers@...ts.linux-foundation.org
Subject: Re: [RFC] [PATCH 1/2] cgroups: read-write lock CLONE_THREAD
forking per threadgroup
On 01/17, Ben Blum wrote:
>
> On Tue, Jan 05, 2010 at 07:53:30PM +0100, Oleg Nesterov wrote:
> >
> > I don't understand how this can close the race with de_thread().
> > ...
>
> the race with the sighand is handled in the next patch, in attach_proc,
> not in this function.
OK. I didn't verify this, the patches don't apply to 2.6.32-rc, but this
doesn't matter. Please see below.
> > > + /* now try to find a sighand */
> > > + if (likely(tsk->sighand)) {
> > > + sighand = tsk->sighand;
> > > + } else {
> > > + sighand = ERR_PTR(-ESRCH);
> > > + /*
> > > + * tsk is exiting; try to find another thread in the group
> > > + * whose sighand pointer is still alive.
> > > + */
> > > + list_for_each_entry_rcu(p, &tsk->thread_group, thread_group) {
> > > + if (p->sighand) {
> > > + sighand = tsk->sighand;
> >
> > can't understand this "else {}" code... We hold tasklist, if the group
> > leader is dead (->sighand == NULL), then the whole thread group is
> > dead.
> >
> > Even if we had another thread with ->sighand != NULL, what is the point
> > of "if (unlikely(!thread_group_leader(tsk)))" check then?
>
> doesn't the group leader stay on the threadgroup list even when it dies?
> sighand can be null if the group leader has exited, but other threads
> are still running.
No, leader->sighand != NULL until all threads have exited.
Ben, I'd suggest you to redo these patches even if they are correct.
->sighand is not the right place for the mutex/locking
- it is per CLONE_SIGHAND, not per process
- we have to avoid the nasty and hard-to-test races with exec
- we have to play with sighand->count and I really dislike this.
this ->count is not just a reference counter, look at
unshare_sighand(). Yes, this is fake, but still.
Please use ->signal instead. By the lucky coincidence the lifetime rules
for (greatly misnamed) signal_struct were changed recently in -mm.
With the recent changes, it is always safe to use task->signal. It can't
be changed, can't go away, no need to bump the counter, no races, etc.
What do you think?
Oleg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists