linux-kernel - Re: + exitc-call-proc_exit_connector-after-exit

Message-ID: <20140227144826.GA13313@bender.morinfr.org>
Date:	Thu, 27 Feb 2014 15:48:27 +0100
From:	Guillaume Morin <guillaume@...infr.org>
To:	Oleg Nesterov <oleg@...hat.com>
Cc:	akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
	matt.helsley@...il.com, davem@...emloft.net, guillaume@...infr.org
Subject: Re: + exitc-call-proc_exit_connector-after-exit_state-is-set.patch
 added to -mm tree

On 25 Feb 16:10, Oleg Nesterov wrote:
> > pid_t pid = fork();
> > if (pid > 0) {
> > 	register_interest_for_pid(pid);
> > 	if (waitpid(pid, NULL, WNOHANG) > 0)
> > 	{
> > 	  /* We might have raced with exit() */
> > 	}
> 
> Just in case... Even with this patch the code above is still "racy" if the
> child is multi-threaded. Plus it should obviously filter-out subthreads.
> And afaics there is no way to make it reliable, even if you change the
> code above so that waitpid() is called only after the last thread exits
> WNOHANG still can fail.
> Not that I am not arguing with this change. Although I hope that someone
> can confirm that netlink_broadcast() is safe even if release_task(current)
> was already called, so that the caller has no pids, sighand, is not visible
> via /proc/, etc.

I was too succinct, I think.  What I am trying to do is to close a race
where a short-lived *process* dies before register_interest_for_pid()
has completed, so that the connector message is not interpreted correctly
(i.e. not recognized as an exit message for a pid that the parent created).

For example, let's say that the parent has an independent thread that
just reads from the netlink socket or uses a BPF filter to see only the
events it cares about.  In that case, it's possible that the exit
connector message will be discarded (either by a reader thread or the
BPF filter) before the parent realizes it should care about messages
about a new pid (the child pid).
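
To make the lost-event window concrete, here is a minimal userspace
simulation. It does not talk to the connector at all: struct interest_set,
interest_add() and reader_handle_exit() are hypothetical names standing in
for whatever the reader thread or BPF filter uses to decide which pids
matter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/types.h>

#define MAX_WATCHED 16

/* Hypothetical interest set consulted by the reader thread or filter. */
struct interest_set {
    pid_t pids[MAX_WATCHED];
    size_t count;
};

/* Stand-in for register_interest_for_pid(): remember a pid we care about. */
bool interest_add(struct interest_set *set, pid_t pid)
{
    assert(set != NULL);
    if (set->count == MAX_WATCHED)
        return false;
    set->pids[set->count++] = pid;
    return true;
}

bool interest_contains(const struct interest_set *set, pid_t pid)
{
    for (size_t i = 0; i < set->count; i++)
        if (set->pids[i] == pid)
            return true;
    return false;
}

/*
 * What the reader effectively does with an exit event: deliver it only if
 * the pid is already registered, otherwise discard it.
 * Returns true when the event is delivered.
 */
bool reader_handle_exit(const struct interest_set *set, pid_t exited)
{
    return interest_contains(set, exited);
}
```

If reader_handle_exit() runs for the child's pid before interest_add() has
been called for it, the event is discarded and never seen again. That is
the window described above.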

You clarified for me that a ptraced process is a case where this race
could still happen.  That's a good point.  Fortunately, in the case of a
short-lived process, this is not a common scenario.

If we ignore the ptrace() case, I am not sure I see the problem with
multithreaded processes.  Even if the main thread exits right away, what is
important is that:
- *either* the exit connector message of the last thread that dies is
  seen after register_interest_for_pid() completes
- *or* that waitpid(WNOHANG) succeeds right after
  register_interest_for_pid()
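
The either/or above can be sketched in userspace as follows. This is only
an illustration for a single-threaded child with no ptrace involved:
register_interest_for_pid() is a stub here, and spawn_short_lived_child()
and watch_child() are hypothetical helper names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stub: a real version would update the set of pids that the
 * connector reader (or BPF filter) accepts. */
pid_t last_registered_pid = -1;

void register_interest_for_pid(pid_t pid)
{
    last_registered_pid = pid;
}

/* Fork a child that exits immediately with the given code. */
pid_t spawn_short_lived_child(int code)
{
    pid_t pid = fork();
    if (pid == 0)
        _exit(code);
    return pid; /* child's pid in the parent, -1 if fork() failed */
}

/*
 * Register first, then poll once with WNOHANG.
 * Returns 1 if the child had already exited and has now been reaped,
 * 0 if it is still running (its exit event should arrive later),
 * -1 if waitpid() failed.
 */
int watch_child(pid_t pid, int *status)
{
    assert(pid > 0);
    register_interest_for_pid(pid);
    pid_t r = waitpid(pid, status, WNOHANG);
    if (r == pid)
        return 1;
    return r == 0 ? 0 : -1;
}
```

A return value of 1 corresponds to the "or" branch (the exit message may
have been missed, but waitpid() caught the exit). A return value of 0
corresponds to the "either" branch, where the exit message is expected
after registration.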

You seem to say it's possible for all threads to have completed
exit_notify() and sent their exit message to the connector before
register_interest_for_pid() does its job, and still have waitpid(WNOHANG)
fail.  Is that correct?  If so, could you give a bit more detail on how
this could happen?

My understanding is that if all threads exited before waitpid() is
called, exit_state will be set to EXIT_ZOMBIE for the pid and that
delay_group_leader() will be false (because all sub-threads have
exited), so that waitpid(WNOHANG) will successfully reap the process.
What am I missing?
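
For reference, the condition I have in mind can be written as a small
userspace model. This is not kernel code: struct task_model and the
model_* functions are simplified, hypothetical stand-ins for exit_state,
thread_group_leader() and delay_group_leader(), and the model says nothing
about when the kernel actually reaches each state relative to the
connector message, which is the open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for kernel state; not the real task_struct. */
enum exit_state_model { STATE_RUNNING, STATE_ZOMBIE };

struct task_model {
    enum exit_state_model exit_state; /* models task->exit_state */
    bool is_group_leader;             /* models thread_group_leader() */
    int live_subthreads;              /* sub-threads still in the group */
};

/* Models delay_group_leader(): a leader whose group is not yet empty. */
bool model_delay_group_leader(const struct task_model *t)
{
    assert(t != NULL);
    return t->is_group_leader && t->live_subthreads > 0;
}

/* Would waitpid(WNOHANG) reap this task under the reasoning above? */
bool model_wnohang_reaps(const struct task_model *t)
{
    return t->exit_state == STATE_ZOMBIE && !model_delay_group_leader(t);
}
```

In this model, a zombie leader with one remaining sub-thread is not
reapable, and it becomes reapable once live_subthreads drops to 0.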

Guillaume.

-- 
Guillaume Morin <guillaume@...infr.org>