Message-ID: <44178008-720e-0858-c9e8-d23d2087dcc7@suse.com>
Date: Thu, 22 Sep 2016 18:20:49 +1000
From: Aleksa Sarai <asarai@...e.com>
To: Michal Hocko <mhocko@...nel.org>,
Mike Galbraith <umgwanakikbuti@...il.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
strace-devel@...ts.sourceforge.net, Oleg Nesterov <oleg@...hat.com>
Subject: Re: strace lockup when tracing exec in go
> So I've stared into do_notify_parent some more and the following was
> just very confusing
>
> 	if (!tsk->ptrace && sig == SIGCHLD &&
> 	    (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN ||
> 	     (psig->action[SIGCHLD-1].sa.sa_flags & SA_NOCLDWAIT))) {
> 		/*
> 		 * We are exiting and our parent doesn't care. POSIX.1
> 		 * defines special semantics for setting SIGCHLD to SIG_IGN
> 		 * or setting the SA_NOCLDWAIT flag: we should be reaped
> 		 * automatically and not left for our parent's wait4 call.
> 		 * Rather than having the parent do it as a magic kind of
> 		 * signal handler, we just set this to tell do_exit that we
> 		 * can be cleaned up without becoming a zombie. Note that
> 		 * we still call __wake_up_parent in this case, because a
> 		 * blocked sys_wait4 might now return -ECHILD.
> 		 *
> 		 * Whether we send SIGCHLD or not for SA_NOCLDWAIT
> 		 * is implementation-defined: we do (if you don't want
> 		 * it, just use SIG_IGN instead).
> 		 */
> 		autoreap = true;
> 		if (psig->action[SIGCHLD-1].sa.sa_handler == SIG_IGN)
> 			sig = 0;
> 	}
>
> it tries to prevent from what I am seeing in a way. If the SIGCHLD is
> ignored then it just does autoreap and everything is fine. But this
> doesn't seem to be the case here. In fact we are not sending the signal
> because sig_task_ignored is true resp. sig_handler_ignored which can
> fail even for handler == SIG_DFL && sig_kernel_ignore() and SIGCHLD
> seems to be in SIG_KERNEL_IGNORE_MASK. So I've tried

I was looking at the same code this morning. I thought maybe we should
drop the !tsk->ptrace condition (or make the condition still succeed
when the tracer also happens to be tsk->real_parent), since this only
happens when the process is being traced. I tried this and the issue
still persists, but I didn't apply your other proposed change to this
conditional.

Or am I misunderstanding what tsk->ptrace refers to?
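
To make the two variants concrete, here is a rough userspace model of the
check. It is only a sketch and not a patch: struct task_model and both
helper functions are made up for illustration, the "sig == SIGCHLD" part
is left out, and the SIG_IGN/SA_NOCLDWAIT test is collapsed into a single
boolean. It assumes tsk->ptrace is nonzero exactly while a tracer is
attached and that tsk->parent points at the tracer in that case, which is
the reading I am asking about.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few task_struct fields discussed here. */
struct task_model {
	unsigned int ptrace;            /* assumed: nonzero while traced */
	struct task_model *parent;      /* assumed: the tracer while traced */
	struct task_model *real_parent; /* the process that created the task */
};

/* Condition as quoted above: a traced task never takes the autoreap path. */
static bool autoreap_current(const struct task_model *tsk,
			     bool parent_ignores_sigchld)
{
	return !tsk->ptrace && parent_ignores_sigchld;
}

/*
 * Relaxed variant: still allow autoreap when the tracer also happens
 * to be the real parent; only a foreign tracer blocks it.
 */
static bool autoreap_relaxed(const struct task_model *tsk,
			     bool parent_ignores_sigchld)
{
	bool traced_by_other = tsk->ptrace && tsk->parent != tsk->real_parent;

	return !traced_by_other && parent_ignores_sigchld;
}
```

The two helpers only differ for a traced task whose tracer is its real
parent; an untraced task, or one traced by some other process, gets the
same answer from both.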
--
Aleksa Sarai
Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/