linux-kernel - Re: regression introduced by - timers: fix itimer/many thread hang

Message-ID: <20081114164155.GA7738@redhat.com>
Date:	Fri, 14 Nov 2008 17:41:55 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	Roland McGrath <roland@...hat.com>
Cc:	Frank Mayhar <fmayhar@...gle.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Christoph Lameter <cl@...ux-foundation.org>,
	Doug Chapman <doug.chapman@...com>, mingo@...e.hu,
	adobriyan@...il.com, akpm@...ux-foundation.org,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: regression introduced by - timers: fix itimer/many thread hang

On 11/13, Roland McGrath wrote:
>
> An idea like taking siglock in account_group_*() should be a non-starter.

Yes, sure. The patch was buggy anyway, but even _if_ it was correct it was
only a temporary hack for 2.6.28.

> A third variety of possible fix that we haven't explored much is to delay
> parts of the teardown to __put_task_struct or to finish_task_switch's
> TASK_DEAD case.  That is, make simpler code on the tick path remain safe
> until it's no longer possible to have a tick (because it's after the final
> deschedule).

This was already discussed a bit,
	http://marc.info/?l=linux-kernel&m=122640473714466

and perhaps this makes sense. With this change we can simplify other code.

> If I'm understanding it correctly, Oleg's task_rq_unlock_wait change makes
> sure that if any task_rq_lock is in effect when clearing ->signal, it's
> effectively serialized either to:
> 	CPU1(tsk)				CPU2(parent)
> 	task_rq_lock(tsk)...task_rq_unlock(tsk)
> 						tsk->signal = NULL;
> 						__cleanup_signal(sig);
> or to:
> 	CPU1(tsk)				CPU2(parent)
> 						tsk->signal = NULL;
> 	task_rq_lock(tsk)...task_rq_unlock(tsk)
> 						__cleanup_signal(sig);
> so that the locked "..." code either sees NULL or sees a signal_struct
> that cannot be passed to __cleanup_signal until after task_rq_unlock.
> Is that right?
>
> Doesn't the same bug exist for account_group_user_time and
> account_group_system_time?  Those aren't called with task_rq_lock(current)
> held, I don't think.  So Oleg's change doesn't address the whole problem,
> unless I'm missing something (which is always likely).

You are right. (please see below).
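
A minimal user-space sketch of the two orderings quoted above, under loud
assumptions: a pthread mutex stands in for task_rq_lock(), and the names
task_model, signal_model, rq_unlock_wait() and release_signal() are
hypothetical, not the kernel's. It only models the idea that the pointer is
cleared first, then any lock holder is waited out, and only then is the
structure freed.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; these are not the kernel's types. */
struct signal_model {
	unsigned long long sum_exec_runtime;
};

struct task_model {
	pthread_mutex_t rq_lock;		/* plays the role of task_rq_lock() */
	_Atomic(struct signal_model *) signal;
};

static void task_model_init(struct task_model *tsk, struct signal_model *sig)
{
	pthread_mutex_init(&tsk->rq_lock, NULL);
	atomic_init(&tsk->signal, sig);
}

/* Tick-side path: only dereferences ->signal while holding rq_lock. */
static int account_exec_runtime(struct task_model *tsk, unsigned long long ns)
{
	int accounted = 0;

	pthread_mutex_lock(&tsk->rq_lock);
	struct signal_model *sig = atomic_load(&tsk->signal);
	if (sig) {
		sig->sum_exec_runtime += ns;
		accounted = 1;
	}
	pthread_mutex_unlock(&tsk->rq_lock);
	return accounted;
}

/* Wait until whoever currently holds rq_lock has dropped it. */
static void rq_unlock_wait(struct task_model *tsk)
{
	pthread_mutex_lock(&tsk->rq_lock);
	pthread_mutex_unlock(&tsk->rq_lock);
}

/* Parent-side teardown: clear the pointer, wait out holders, then free. */
static unsigned long long release_signal(struct task_model *tsk)
{
	struct signal_model *sig = atomic_exchange(&tsk->signal, NULL);
	unsigned long long total = 0;

	rq_unlock_wait(tsk);
	if (sig) {
		total = sig->sum_exec_runtime;
		free(sig);
	}
	return total;
}
```

A locked section that started before the exchange finishes before free();
one that starts afterwards sees NULL. As noted in the quoted text, this does
nothing for paths that touch ->signal without taking the lock.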

Even run_posix_cpu_timers() becomes unsafe. And I must admit, I had read
this part of the patch carefully before, and I didn't notice the problem.
I'll try to finally read the whole patch carefully on Sunday, but I don't
trust myself ;)

> The first thing that pops to my mind is to protect the freeing of
> signal_struct and thread_group_cputime_free (i.e. some or most of the
> __cleanup_signal work) with RCU.  Then use rcu_read_lock() around accesses
> to current->signal in places that can run after exit_notify,

Yes, this was my initial intent, but it needs more changes. (Actually,
I personally like the idea of freeing ->signal from __put_task_struct()
more, but I have no good arguments.)
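
A rough user-space model of the "free ->signal from __put_task_struct()"
idea, only to show the lifetime rule it implies. Everything here is an
assumption for illustration: task_model, signal_model, get_task_model() and
put_task_model() are hypothetical names, and the model gives each task its
own signal structure, whereas the real one is shared by the thread group.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's structures. */
struct signal_model {
	unsigned long utime;
};

struct task_model {
	atomic_int usage;		/* reference count on the task */
	struct signal_model *signal;	/* never cleared; freed with the task */
};

static int signals_freed;		/* instrumentation for the sketch only */

static struct task_model *task_model_alloc(void)
{
	struct task_model *tsk = calloc(1, sizeof(*tsk));

	if (!tsk)
		return NULL;
	tsk->signal = calloc(1, sizeof(*tsk->signal));
	if (!tsk->signal) {
		free(tsk);
		return NULL;
	}
	atomic_init(&tsk->usage, 1);
	return tsk;
}

static void get_task_model(struct task_model *tsk)
{
	atomic_fetch_add(&tsk->usage, 1);
}

/* The signal structure goes away only together with the last reference. */
static void put_task_model(struct task_model *tsk)
{
	if (atomic_fetch_sub(&tsk->usage, 1) == 1) {
		free(tsk->signal);
		signals_freed++;
		free(tsk);
	}
}

/* Tick path: no NULL check needed while the caller holds a reference. */
static void account_user_time(struct task_model *tsk, unsigned long ticks)
{
	tsk->signal->utime += ticks;
}
```

In this model the tick path stays simple because a task that can still take
a tick still pins its own task structure, and therefore its signal structure.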

Currently I am trying to find ugly but simple fixes for 2.6.28.

account_group_user_time(), run_posix_cpu_timers() are simpler to
fix. Again, I need to actually read the code, but afaics we can
rely on the fact that the task is current, so we can change the
code

	-	if (!->signal)
	+	if (->exit_state)
			return;

But of course, I do agree, we need a more clever fix for the long
term, even if the change above can really help.
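
A small sketch of why the ->exit_state test above can be safer than the
->signal test when the task is current. It assumes, as I read the reasoning
above, that only the exiting task itself sets its own exit_state and does so
before anyone else may clear or free ->signal. The types, the
EXIT_ZOMBIE_MODEL constant and both function names are hypothetical.

```c
#include <stddef.h>

#define EXIT_ZOMBIE_MODEL 1		/* hypothetical non-zero exit state */

/* Hypothetical stand-ins; not the kernel's structures. */
struct signal_model {
	unsigned long utime;
};

struct task_model {
	int exit_state;			/* written only by the task itself */
	struct signal_model *signal;	/* may be cleared by another CPU */
};

/*
 * Old-style check: tests the pointer another CPU may clear and free.
 * In real concurrent code the window between the test and the
 * dereference is the problem; this single-threaded model cannot show it.
 */
static int account_user_time_old(struct task_model *tsk, unsigned long ticks)
{
	if (!tsk->signal)
		return 0;
	tsk->signal->utime += ticks;
	return 1;
}

/*
 * New-style check: tests state that only the current task changes, so the
 * answer cannot flip underneath the caller between test and dereference.
 */
static int account_user_time_new(struct task_model *tsk, unsigned long ticks)
{
	if (tsk->exit_state)
		return 0;
	tsk->signal->utime += ticks;
	return 1;
}
```

Once exit_state is set, the new check stops touching ->signal even though
the pointer may still look valid, which is the point of the change.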

Oleg.
