[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20090110222434.GA24414@redhat.com>
Date: Sat, 10 Jan 2009 23:24:34 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: Scott James Remnant <scott@...onical.com>
Cc: Roland McGrath <roland@...hat.com>, Ingo Molnar <mingo@...e.hu>,
Casey Dahlin <cdahlin@...hat.com>,
Linux Kernel <linux-kernel@...r.kernel.org>,
Randy Dunlap <randy.dunlap@...cle.com>,
Davide Libenzi <davidel@...ilserver.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [RESEND][RFC PATCH v2] waitfd
On 01/10, Scott James Remnant wrote:
>
> On Sat, 2009-01-10 at 19:13 +0100, Oleg Nesterov wrote:
>
> > I never argued with this. And, let me repeat. I am not arguing against
> > waitfd! Actually, I always try to avoid the "do we need this feature"
> > discussions.
> >
> Unless I'm misinterpreting you, you're saying that you don't understand
> why we should change any current behaviour? My post is attempting to
> illustrate why we should.
Scott. How many times should I repeat: I am _not_ arguing against
waitfd.
But to clarify, neither I vote for it. I don't really care. Except
I do care about the code if it will be merged, that is why I entered
this thread.
> > What I disagree with is that waitfd adds the functionality which does
> > not exists currently.
> >
> I'm not saying that it doesn't at all; in fact I gave an example of how
> you implement the exact same functionality today.
This means I was confused. Because I thought you point is we can't poll
for childs without signalfd. And all I asked was: why do you think so.
I do understand that waitfd can be handy.
> In fact, because main loops use select()/poll(), for the SIGCHLD case
> you'd never use signalfd() at all!
>
> Unless I'm missing something, the following two examples are identical
> in behaviour:
>
> using signalfd:
> ...
> using pselect:
Yes, and that is why I mentioned that ppoll() alone is enough.
> But the pselect() version is neater. Which is why I started the
> previous reply off with "why have signalfd() at all?"
Unlike waitfd, there are things which we just can not do without signalfd,
even if we have ppol/pselect. For example: wait for the signal, but not
dequeue it.
> One of them was attempting to explain what you don't understand here,
> I'll try and be more verbose...
> ...
> ~~Calling waitpid() does not clear the pending signal.~~
>
> This is the important bit.
>
> If a further process dies while we're inside the waitpid() loop, we will
> most likely reap that straight away. But this does not clear the
> pending signal. The main loop will be woken up again, even though it
> does not need to be.
>
> Thus:
>
> - child process #1 dies
> - main loop woken up by SIGCHLD
> - pending status of signal cleared
> - enter wait loop
> - child process #2 dies
> - SIGCHLD pending again
> - waitpid() called first time, child process #1 reaped
> - waitpid() called second time, child process #2 reaped
> (SIGCHLD still pending)
> - waitpid() called third time, no child processes remain
> - exit wait loop
> - back to top of main loop, immediately woken up by pending SIGCHLD
> - pending status of signal cleared
> - enter wait loop
> - waitpid() called first time, but no child processes remain
> (we reaped it last time round)
> - exit wait loop
> - back to top of main loop, sleep
Scott, I don't really understand why are you trying to explain this
all to me. I do understand this. At least I hope ;)
Yes this is possible, and I see no problems here.
> - SIGCHLD not pending, but waitpid() will not block
>
> This is true in all example usage; after you've called the read() on
> the signalfd - or the pselect() has woken, SIGCHLD is probably no
> longer pending but waitpid() will not block
>
> Compare with select() behaviour; if you fail to read() from the fd,
> select() wakes up yet again
>
> - SIGCHLD pending, but waitpid() will block
>
> This is true if you exhaust the wait queue in a loop,
... and this too.
> All SIGCHLD is useful for is to get your main loop out of
> select()/poll(); you must always exhaust the wait queue every time you
> have woken up.
Yes, and yes, and yes. Scott, I am sorry, I failed to read to the end
so perhaps I missed something ;)
> --- kernel/signal.c~ 2009-01-10 20:04:50.000000000 +0000
> +++ kernel/signal.c 2009-01-10 20:05:24.000000000 +0000
> @@ -816,8 +816,10 @@
> * exactly one non-rt signal, so that we can get more
> * detailed information about the cause of the signal.
> */
> - if (legacy_queue(pending, sig))
> + if (legacy_queue(pending, sig)) {
> + signalfd_notify(t, sig);
> return 0;
> + }
I'd prefer to not discuss this here, but I am not sure I understand.
There should not be no threads which need the wakeup from here, and
I can't see how this change can help.
> A more orthogonal example would be pselect(). That implemented, in the
> kernel, a syscall that it actually wasn't possible to implement in
> userspace
Yes, exactly,
> The argument for waitfd() or similar in the kernel is because there are
> races in userspace that we can't solve.
And now I don't understand you again. Please show me which races we
_can not_ solve in userspace without waitfd?
Yes we can race with the exiting childs while doing waitpid() in a loop,
so we can make the unnecessary syscall. But please do not tell me _this_
is the race we can't solve. This is _harmless_. Unlike the problems
with the poor user-space implementations of pselect/ppol.
Oleg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists