Message-ID: <20150314192940.GD22130@thin>
Date: Sat, 14 Mar 2015 12:29:40 -0700
From: Josh Triplett <josh@...htriplett.org>
To: Thiago Macieira <thiago.macieira@...el.com>
Cc: Andy Lutomirski <luto@...capital.net>,
David Drysdale <drysdale@...gle.com>,
Al Viro <viro@...iv.linux.org.uk>,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...hat.com>,
Kees Cook <keescook@...omium.org>,
Oleg Nesterov <oleg@...hat.com>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
"H. Peter Anvin" <hpa@...or.com>, Rik van Riel <riel@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Michael Kerrisk <mtk.manpages@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Linux API <linux-api@...r.kernel.org>,
Linux FS Devel <linux-fsdevel@...r.kernel.org>,
X86 ML <x86@...nel.org>
Subject: Re: [PATCH 0/6] CLONE_FD: Task exit notification via file descriptor

On Sat, Mar 14, 2015 at 12:03:12PM -0700, Thiago Macieira wrote:
> On Friday 13 March 2015 18:11:32 Thiago Macieira wrote:
> > On Friday 13 March 2015 14:51:47 Andy Lutomirski wrote:
> > > In any event, we should find out what FreeBSD does in response to
> > > read(2) on the fd.
> >
> > I've just successfully installed FreeBSD and compiled qtbase (main package
> > of Qt 5) on it.
> >
> > I'll test pdfork during the weekend and report its behaviour.
>
> Here are my findings about pdfork.
>
> Source: http://fxr.watson.org/fxr/source/kern/sys_procdesc.c?v=FREEBSD10
> Qt adaptations: https://codereview.qt-project.org/108561
>
> Processes created with pdfork() are normal processes that still send SIGCHLD
> to their parents. The only difference is that you get the extra file descriptor
> that can be passed to the pdgetpid() system call and works on select()/poll().
> Trying to read from that file descriptor will result in EOPNOTSUPP.

OK, since read() doesn't work on a pdfork() file descriptor, we don't
have to worry about compatibility with pdfork()'s read result.

However, if the expectation is that pdfork()ed child processes still
send SIGCHLD, then I don't see how we can be compatible there, nor do I
think we want to; as you mention below, that breaks the ability to
encapsulate management of the created process entirely within a library.
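To illustrate the encapsulation problem, here is a minimal sketch in
plain POSIX C. It assumes a process-wide SIGCHLD handler that reaps
every child, which is the pattern described for other libraries later
in this thread. The names reap_all() and exit_status_stolen() are
hypothetical and exist only for this example. The code that actually
started the child ends up with ECHILD instead of an exit status.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t reaped;

/* Stand-in for a SIGCHLD handler installed by some other library:
 * it reaps every exited child, no matter who started it. */
static void reap_all(int signo)
{
    int saved_errno = errno;

    (void)signo;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped = 1;
    errno = saved_errno;
}

/* Returns the errno seen by the child's real owner when it tries to
 * collect the exit status, 0 if waitpid() unexpectedly succeeded, or
 * -1 if the setup itself failed. */
static int exit_status_stolen(void)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, wait_mask;
    pid_t pid;
    int err = 0;

    sa.sa_handler = reap_all;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, &old_sa) < 0)
        return -1;

    /* Hold SIGCHLD back until we are ready to wait for it. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old_mask);

    reaped = 0;
    pid = fork();
    if (pid < 0) {
        sigprocmask(SIG_SETMASK, &old_mask, NULL);
        sigaction(SIGCHLD, &old_sa, NULL);
        return -1;
    }
    if (pid == 0)
        _exit(7);

    /* Let the "foreign" handler run and reap the child first. */
    wait_mask = old_mask;
    sigdelset(&wait_mask, SIGCHLD);
    while (!reaped)
        sigsuspend(&wait_mask);

    /* The owner now asks for the exit status, but it is already gone. */
    if (waitpid(pid, NULL, 0) < 0)
        err = errno;

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGCHLD, &old_sa, NULL);
    return err;
}
```

In a process with no other children, exit_status_stolen() should return
ECHILD: the exit code 7 was consumed by the handler and the owner never
sees it.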
> Since they've never implemented pdwait4() (it's not even declared in the
> headers), the only way to reap a child if you only have the file descriptor is
> to first pdgetpid() and then call wait4() or wait6().

Which suggests that we shouldn't try to implement pdwait4() in glibc
until FreeBSD implements it in their kernel, since we won't know the
exact semantics they expect.

> If you don't pass PD_DAEMON, the child process gets killed with SIGKILL when
> the file closes.

OK, that makes sense. We could certainly implement a
CLONE_FD_KILL_ON_CLOSE flag with those semantics, if we want one in the
future.

> Conclusion:
> Pros: this is the bare minimum that we'd need to disentangle the SIGCHLD mess.
> As long as all child process activations use this feature, the problem is
> solved.
>
> Cons: it requires cooperation from all child starters. If some other library
> or the application installs a global SIGCHLD handler that waits on all child
> processes, like libvlc used to do and Glib and Ecore still do, you won't be
> able to get the child exit status.
>
> I have not tested what happens if you try to pass the file descriptor to other
> processes (can you even do that on FreeBSD?). But even if you could and got
> notifications, you couldn't wait on the child to get its exit status -- unless
> they implement pdwait4.

Even if they do implement pdwait4, they might not bypass the "must be
the parent process" restriction. Let's wait to see what semantics they
go with.

> - pdfork: can be emulated with clone4 + CLONE_FD (+ CLONEFD_KILL_ON_CLOSE)
> - pdwait4: can be emulated with read()
> - pdgetpid: needs an ioctl
> - pdkill: needs an ioctl [or just write()]

I think that should be a dedicated syscall, not an ioctl.

It's unfortunate that rt_sigqueueinfo doesn't take a flags argument.
However, I just realized that it takes a 32-bit "int" for the signal
number, yet signal numbers fit in 8 bits. So we could just add flags in
the high 24 bits of that argument, and in particular add a flag
indicating that the first argument is a file descriptor rather than a
PID.
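
A rough sketch of that packing, as userspace C. This is not an existing
kernel ABI: the SIGQ_* names and the choice of bit 8 for the "target is
a file descriptor" flag are assumptions made up for illustration. The
real argument is a signed int, so the sketch uses uint32_t to sidestep
sign issues with the top bit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: low 8 bits hold the signal number,
 * high 24 bits hold flags. */
#define SIGQ_SIGNAL_MASK 0x000000ffu
#define SIGQ_FLAG_FD     0x00000100u /* hypothetical: first arg is an fd */

/* Pack a signal number and flags into one 32-bit argument. */
static inline uint32_t sigq_encode(unsigned int sig, uint32_t flags)
{
    assert(sig <= SIGQ_SIGNAL_MASK);         /* signal must fit in 8 bits */
    assert((flags & SIGQ_SIGNAL_MASK) == 0); /* flags stay out of the low byte */
    return flags | sig;
}

/* Extract the signal number from the low 8 bits. */
static inline unsigned int sigq_signal(uint32_t arg)
{
    return arg & SIGQ_SIGNAL_MASK;
}

/* Extract the flag bits from the high 24 bits. */
static inline uint32_t sigq_flags(uint32_t arg)
{
    return arg & ~SIGQ_SIGNAL_MASK;
}

/* Nonzero if the caller asked for the target to be read as an fd. */
static inline int sigq_targets_fd(uint32_t arg)
{
    return (arg & SIGQ_FLAG_FD) != 0;
}
```

Existing callers pass plain signal numbers with the high bits clear, so
under this layout they would decode to "no flags, target is a PID".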
- Josh Triplett