linux-kernel - Re: [PATCH 1/8] job control: Don't set group_stop exit

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20110321132024.GA18777@redhat.com>
Date:	Mon, 21 Mar 2011 14:20:24 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	Tejun Heo <tj@...nel.org>
Cc:	roland@...hat.com, jan.kratochvil@...hat.com,
	vda.linux@...glemail.com, linux-kernel@...r.kernel.org,
	torvalds@...ux-foundation.org, akpm@...ux-foundation.org,
	indan@....nu
Subject: Re: [PATCH 1/8] job control: Don't set group_stop exit_code if
	re-entering job control stop

On 03/08, Tejun Heo wrote:
>

Hi Tejun,

I hope you still remember you sent these patches, perhaps you can
even recall what they should do ;)

> @@ -1827,10 +1827,27 @@ static int do_signal_stop(int signr)
>  		    unlikely(signal_group_exit(sig)))
>  			return 0;
>  		/*
> -		 * There is no group stop already in progress.
> -		 * We must initiate one now.
> +		 * There is no group stop already in progress.  We must
> +		 * initiate one now.
> +		 *
> +		 * While ptraced, a task may be resumed while group stop is
> +		 * still in effect and then receive a stop signal and
> +		 * initiate another group stop.  This deviates from the
> +		 * usual behavior as two consecutive stop signals can't
> +		 * cause two group stops when !ptraced.
> +		 *
> +		 * The condition can be distinguished by testing whether
> +		 * SIGNAL_STOP_STOPPED is already set.  Don't generate
> +		 * group_exit_code in such case.
> +		 *
> +		 * This is not necessary for SIGNAL_STOP_CONTINUED because
> +		 * an intervening stop signal is required to cause two
> +		 * continued events regardless of ptrace.
>  		 */
> -		sig->group_exit_code = signr;
> +		if (!(sig->flags & SIGNAL_STOP_STOPPED))
> +			sig->group_exit_code = signr;
> +		else
> +			WARN_ON_ONCE(!task_ptrace(current));

Yes. But WARN_ON_ONCE() is wrong. Suppose that the tracee was stopped,
then PTRACE_CONT'ed, then it gets another SIGSTOP and reports it. Now
suppose that debugger does PTRACE_CONT(SIGSTOP) and exits before the
tracee processes this signal.

OTOH, this WARN_ON_ONCE() makes sense, but we should fix __ptrace_unlink().
This path should take siglock and check SIGNAL_STOP_STOPPED unconditionally.
This should also fix other problems with detach && SIGNAL_STOP_STOPPED.

Also. We should take ->group_stop_count != 0 into account, we should not
set (change) ->group_exit_code in this case too. This is is only "real"
problem in this patch I can see. Other comments are mostly the random
thoughts.

But lets look at the code below,

		for (t = next_thread(current); t != current;
		     t = next_thread(t)) {
			t->group_stop &= ~GROUP_STOP_SIGMASK;
			/*
			 * Setting state to TASK_STOPPED for a group
			 * stop is always done with the siglock held,
			 * so this check has no races.
			 */
			if (!(t->flags & PF_EXITING) && !task_is_stopped(t)) {
				t->group_stop |= signr | gstop;
				sig->group_stop_count++;
				signal_wake_up(t, 0);
			} else {
				task_clear_group_stop_pending(t);
			}
		}

Somehow I no longer understand "else task_clear_group_stop_pending()".
I mean, is it really needed?

If task_is_stopped() == T or it is PF_EXITING, this task has already
done task_participate_group_stop(), no?

Also. I do not think it is correct to change the "signr" part of
->group_stop (unless it is zero) when ->group_stop_count != 0
for other threads. This is minor, but still doesn't look exactly
correct. Probably we can ignore this.

Hmm. it turns out "group_stop & GROUP_STOP_SIGMASK" is only needed
to handle this special case: if debugger PTRACE_CONT's or more
stopped tracees and then som thread initiates the stop again, other
threads need to know that "signr". Otherwise this part of ->group_stop
is only valid "inside" the retry loop in do_signal_stop(), it can
be a local variable. I wonder if we can simply report SIGSTOP in
this case and kill the GROUP_STOP_SIGMASK logic. Nevermind.

And. I think this code does something we do not really want. Why do
we _always_ ask the task_is_traced() threads to participate?

2 threads T1 and T2, both stopped. they are TASK_TRACED, I mean
SIGNAL_STOP_STOPPED is stopped and both have already participated.

Debuggere PTRACE_CONTs T1, then it calls do_signal_stop() again
and sets T2->group_stop = GROUP_STOP_PENDING | GROUP_STOP_CONSUME.
This T2->group_stop doesn't look right, we can report the wrong
extra CLD_STOPPED after detach while ->group_exit_code can be 0.
I think that !task_ptrace(current) case in do_signal_stop() should
take SIGNAL_STOP_STOPPED into account, but perhaps we need another
GROUP_STOP_REPORTED bit, I am not sure.

Or, if debugger PTRACE_CONT's T2, it will report another
ptrace_stop(CLD_STOPPED) immediately, this differs from the current
behaviour although probably we do not care.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/