Message-ID: <20090110222434.GA24414@redhat.com>
Date:	Sat, 10 Jan 2009 23:24:34 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	Scott James Remnant <scott@...onical.com>
Cc:	Roland McGrath <roland@...hat.com>, Ingo Molnar <mingo@...e.hu>,
	Casey Dahlin <cdahlin@...hat.com>,
	Linux Kernel <linux-kernel@...r.kernel.org>,
	Randy Dunlap <randy.dunlap@...cle.com>,
	Davide Libenzi <davidel@...ilserver.org>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [RESEND][RFC PATCH v2] waitfd

On 01/10, Scott James Remnant wrote:
>
> On Sat, 2009-01-10 at 19:13 +0100, Oleg Nesterov wrote:
>
> > I never argued with this. And, let me repeat. I am not arguing against
> > waitfd! Actually, I always try to avoid the "do we need this feature"
> > discussions.
> >
> Unless I'm misinterpreting you, you're saying that you don't understand
> why we should change any current behaviour?  My post is attempting to
> illustrate why we should.

Scott. How many times should I repeat: I am _not_ arguing against
waitfd.

But to clarify, I am not voting for it either. I don't really care,
except that I do care about the code if it gets merged; that is why I
entered this thread.

> > What I disagree with is that waitfd adds the functionality which does
> > not exist currently.
> >
> I'm not saying that it doesn't at all; in fact I gave an example of how
> you implement the exact same functionality today.

This means I was confused, because I thought your point was that we can't
poll for children without signalfd. And all I asked was: why do you think so?
I do understand that waitfd can be handy.

> In fact, because main loops use select()/poll(), for the SIGCHLD case
> you'd never use signalfd() at all!
>
> Unless I'm missing something, the following two examples are identical
> in behaviour:
>
> using signalfd:
> ...
> using pselect:

Yes, and that is why I mentioned that ppoll() alone is enough.
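
To make the ppoll() point concrete, here is a minimal sketch, not taken
from any patch in this thread. The helper name wait_child_with_ppoll()
and the handler are made up for illustration, and a real main loop would
pass its own pollfd array instead of NULL. SIGCHLD stays blocked except
while the task sleeps in ppoll(), so the wakeup can not be lost:

```c
#define _GNU_SOURCE		/* for ppoll() */
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigchld;

static void on_sigchld(int sig)
{
	(void)sig;
	got_sigchld = 1;
}

/*
 * Hypothetical helper: fork one child that exits at once, sleep in
 * ppoll() until SIGCHLD arrives, then reap. Returns the number of
 * children reaped, or -1 on error.
 */
int wait_child_with_ppoll(void)
{
	struct sigaction sa;
	sigset_t block, unblocked;
	int reaped = 0;
	pid_t pid;

	got_sigchld = 0;
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigchld;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGCHLD, &sa, NULL) < 0)
		return -1;

	/* Block SIGCHLD: the handler can now run only inside ppoll(). */
	sigemptyset(&block);
	sigaddset(&block, SIGCHLD);
	if (sigprocmask(SIG_BLOCK, &block, &unblocked) < 0)
		return -1;
	sigdelset(&unblocked, SIGCHLD);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(0);

	/* ppoll() swaps in the mask and sleeps atomically. */
	while (!got_sigchld) {
		if (ppoll(NULL, 0, NULL, &unblocked) < 0 && errno != EINTR)
			return -1;
	}

	/* SIGCHLD only says "look now": drain everything that exited. */
	while (waitpid(-1, NULL, WNOHANG) > 0)
		reaped++;
	return reaped;
}
```

With a single child, as here, the call is expected to return 1.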

> But the pselect() version is neater.  Which is why I started the
> previous reply off with "why have signalfd() at all?"

Unlike waitfd, there are things which we simply cannot do without signalfd,
even if we have ppoll/pselect. For example: wait for a signal without
dequeueing it.

> One of them was attempting to explain what you don't understand here,
> I'll try and be more verbose...
> ...
> Calling waitpid() does not clear the pending signal.
>
> This is the important bit.
>
> If a further process dies while we're inside the waitpid() loop, we will
> most likely reap that straight away.  But this does not clear the
> pending signal.  The main loop will be woken up again, even though it
> does not need to be.
>
> Thus:
>
>  - child process #1 dies
>  - main loop woken up by SIGCHLD
>  - pending status of signal cleared
>  - enter wait loop
>  - child process #2 dies
>  - SIGCHLD pending again
>  - waitpid() called first time, child process #1 reaped
>  - waitpid() called second time, child process #2 reaped
>    (SIGCHLD still pending)
>  - waitpid() called third time, no child processes remain
>  - exit wait loop
>  - back to top of main loop, immediately woken up by pending SIGCHLD
>  - pending status of signal cleared
>  - enter wait loop
>  - waitpid() called first time, but no child processes remain
>    (we reaped it last time round)
>  - exit wait loop
>  - back to top of main loop, sleep

Scott, I don't really understand why you are trying to explain all
this to me. I do understand this. At least I hope ;)

Yes, this is possible, and I see no problems here.

>  - SIGCHLD not pending, but waitpid() will not block
>
>    This is true in all example usage; after you've called the read() on
>    the signalfd - or the pselect() has woken, SIGCHLD is probably no
>    longer pending but waitpid() will not block
>
>    Compare with select() behaviour; if you fail to read() from the fd,
>    select() wakes up yet again
>
>  - SIGCHLD pending, but waitpid() will block
>
>    This is true if you exhaust the wait queue in a loop,

... and this too.

> All SIGCHLD is useful for is to get your main loop out of
> select()/poll(); you must always exhaust the wait queue every time you
> have woken up.

Yes, and yes, and yes. Scott, I am sorry, I failed to read to the end
so perhaps I missed something ;)

> --- kernel/signal.c~	2009-01-10 20:04:50.000000000 +0000
> +++ kernel/signal.c	2009-01-10 20:05:24.000000000 +0000
> @@ -816,8 +816,10 @@
>  	 * exactly one non-rt signal, so that we can get more
>  	 * detailed information about the cause of the signal.
>  	 */
> -	if (legacy_queue(pending, sig))
> +	if (legacy_queue(pending, sig)) {
> +		signalfd_notify(t, sig);
>  		return 0;
> +	}

I'd prefer not to discuss this here, but I am not sure I understand.
There should be no threads which need the wakeup from here, and
I can't see how this change can help.

> A more orthogonal example would be pselect().  That implemented, in the
> kernel, a syscall that it actually wasn't possible to implement in
> userspace

Yes, exactly.

> The argument for waitfd() or similar in the kernel is because there are
> races in userspace that we can't solve.

And now I don't understand you again. Please show me which races we
_can not_ solve in userspace without waitfd?

Yes, we can race with the exiting children while doing waitpid() in a loop,
so we may make an unnecessary syscall. But please do not tell me _this_
is the race we can't solve. This is _harmless_. Unlike the problems
with the poor user-space implementations of pselect/ppoll.

Oleg.

