linux-kernel - Re: [RFC] process wide itimer cruft

Message-Id: <1233683499.10184.45.camel@laptop>
Date:	Tue, 03 Feb 2009 18:51:39 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	"Zhang, Yanmin" <yanmin_zhang@...ux.intel.com>,
	Lin Ming <ming.m.lin@...el.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...e.hu>
Subject: Re: [RFC] process wide itimer cruft

On Tue, 2009-02-03 at 18:23 +0100, Oleg Nesterov wrote:
> On 02/03, Peter Zijlstra wrote:
> >
> > On Mon, 2009-02-02 at 09:53 +0100, Peter Zijlstra wrote:
> >
> > I'm punting the sum-all-threads work off to a workqueue,
> 
> I don't really understand how this works, but I didn't try to read
> this part carefully. For example, when we call thread_group_cputime()
> we don't really get the "group" statistics immediately? But this looks
> very interesting anyway.

Because a thread group can be extremely large, and summing over it can
take longer than a jiffy -- this is the situation that started all this
itimer tinkering.
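
To make the cost concrete, here is a minimal userspace sketch of a
sum-all-threads walk. It is not the kernel's thread_group_cputime(); the
struct and function names (thread_model, cputime_model,
group_cputime_sum) are made up for illustration, and a plain singly
linked list stands in for the real thread list.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread accounting record (not a kernel structure). */
struct thread_model {
	uint64_t utime;			/* user time consumed by this thread */
	uint64_t stime;			/* system time consumed by this thread */
	struct thread_model *next;	/* next thread in the group, or NULL */
};

/* Hypothetical group-wide total. */
struct cputime_model {
	uint64_t utime;
	uint64_t stime;
};

/*
 * Walk every thread in the group and add up its times.  The work is
 * linear in the number of threads, so the time spent here grows with
 * the size of the group.
 */
static struct cputime_model group_cputime_sum(const struct thread_model *head)
{
	struct cputime_model total = { 0, 0 };
	const struct thread_model *t;

	for (t = head; t != NULL; t = t->next) {
		total.utime += t->utime;
		total.stime += t->stime;
	}
	return total;
}
```

Every caller that wants the group total pays for the full walk, which is
why doing it from a timer path hurts once the group gets big.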

However, Ingo spoke to me on IRC and suggested another approach, which
I'm currently working on -- hopefully done tomorrow.

> > The remaining option is to make signal struct itself rcu freed, but
> > before I do that, I thought I'd run this code by some folks.
> 
> I think we should follow the Ingo's suggestion: we should make ->signal
> refcountable, we should never clear task->signal, it should be freed
> by __put_task_struct()'s path.

Right, that'd make a lot of sense.

> In fact I was going to make this patches the previous week, will try
> to do this week. But we need another counter for that, we can't use
> signal->count.

I'm not sure I understand all that code yet, although I've been staring
at it for the past day or so.

->live  -- the number of associated tasks,
->count -- not quite a refcount?

I can see that adding a third counter for reference counting could solve
things, but can we start by clarifying the exact semantics of these two?
If only for future readers.
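
For the sake of discussion, here is a rough userspace sketch of how a
separate lifetime counter could sit next to a "live threads" counter. It
is only a model under my own assumptions: sig_model, its refs field and
the helper names are hypothetical, ->live is assumed to mean "threads
that have not exited yet", and ->count is left out because its exact
meaning is the open question.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model, not the kernel's signal_struct. */
struct sig_model {
	atomic_int live;	/* assumed: threads that have not exited yet */
	atomic_int refs;	/* hypothetical third counter: lifetime refs */
	bool freed;		/* set where a real version would free the object */
};

/* One initial thread, which holds one reference. */
static void sig_init(struct sig_model *s)
{
	atomic_init(&s->live, 1);
	atomic_init(&s->refs, 1);
	s->freed = false;
}

/* A new thread joins the group: it is live and holds a reference. */
static void sig_add_thread(struct sig_model *s)
{
	atomic_fetch_add(&s->live, 1);
	atomic_fetch_add(&s->refs, 1);
}

/*
 * A thread exits.  Returns true if it was the last live thread.
 * The thread's reference is not dropped here, so the object stays
 * valid for anyone still looking at it.
 */
static bool sig_thread_exit(struct sig_model *s)
{
	return atomic_fetch_sub(&s->live, 1) == 1;
}

/* Drop one reference.  Returns true if this was the last one. */
static bool sig_put(struct sig_model *s)
{
	if (atomic_fetch_sub(&s->refs, 1) == 1) {
		s->freed = true;
		return true;
	}
	return false;
}
```

The point of the split is that "no live threads left" and "nobody can
still reach this object" become two separate events, so group-exit work
can key off the first while freeing keys off the second.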

> This blows signal_struct a bit, but otoh with this change we can
> move some fields (for example, ->group_leader) to signal_struct.
> And we can do many simplifications. Just for example, __sched_setscheduler()
> takes ->siglock just to read signal->rlim[].

Could you shed a bit of light on the distinction between sighand and
signal?

> > @@ -96,14 +105,16 @@ static void __exit_signal(struct task_struct *tsk)
> >  	spin_lock(&sighand->siglock);
> >
> >  	posix_cpu_timers_exit(tsk);
> > -	if (atomic_dec_and_test(&sig->count))
> > +	if (!atomic_read(&sig->live)) {
> >  		posix_cpu_timers_exit_group(tsk);
> 
> This doesn't look exactly right, but I don't see the "real" problems
> with this change.
> 
> We can have a lot of threads which didn't even pass exit_notify(),
> another process can attach the cpu timer to us once we drop the
> locks. OK, no real problems afaics, because each sub-thread will
> in turn do posix_cpu_timers_exit_group() later.

Yeah, you can get multiple invocations of
posix_cpu_timers_exit_group(), and less summing of dead tasks; the
latter might be an issue.

> But this looks a bit too early. It is better to continue to account
> these threads, they can consume a lot of cpu. Anyway, this very
> minor issue.

Agreed.

> > -	else {
> > +		sig->curr_target = NULL;
> 
> complete_signal() can crash if it hits ->curr_target = NULL, and
> we are still "visible" to signals even if sig->live == 0.

Ooh, missed that. Good catch indeed.

> > +	} else {
> >  		/*
> >  		 * If there is any task waiting for the group exit
> >  		 * then notify it:
> >  		 */
> > -		if (sig->group_exit_task && atomic_read(&sig->count) == sig->notify_count)
> > +		if (sig->group_exit_task &&
> > +				atomic_read(&sig->live) == sig->notify_count)
> 
> This looks wrong. de_thread() can hang forever, put_signal() doesn't
> wake up ->group_exit_task.
> 
> I think we really need another counter, at least for now.

Don't rush on my account, Ingo's proposed solution doesn't need this. 
