Message-Id: <201112281955.55200.vda.linux@googlemail.com>
Date: Wed, 28 Dec 2011 19:55:55 +0100
From: Denys Vlasenko <vda.linux@...glemail.com>
To: Tejun Heo <tj@...nel.org>
Cc: Denys Vlasenko <dvlasenk@...hat.com>,
Oleg Nesterov <oleg@...hat.com>, linux-kernel@...r.kernel.org,
Łukasz Michalik <lmi@....uni.wroc.pl>,
"Dmitry V. Levin" <ldv@...linux.org>
Subject: Possible bug introduced in commit 9b84cca
Hi Tejun, Oleg,
Apologies if the people who originally discovered this bug
have already informed you about it.
It looks like, after commit 9b84cca, waitpid under strace
sometimes returns a bogus ECHILD even though the child exists.
I have not yet confirmed that the bug appeared at exactly
this commit; Łukasz says it did.
I confirmed that the bug exists on kernels 3.1.6 (in Fedora)
and 3.1.0-rc4 (vanilla).
We have a testcase which spawns N threads, each of which
runs an infinite loop: "fork, exit in the child, waitpid
in the parent for the child". When straced, waitpid sometimes
returns ECHILD. In fact, there is no need to run many threads:
I just saw it happen with a single thread on a 4-CPU machine
when I ran "strace -otestcase1.LOG -f ./testcase1 1".
This machine runs 3.1.0-rc4.
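For readers without the attachment, the loop described above can be sketched roughly as follows. This is not the attached testcase1.c: the helper name count_waitpid_failures and the bounded iteration count are assumptions made for illustration, and the sketch runs in a single thread, which the report says is enough to hit the problem. The exit code 42 matches the log excerpt.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not the code from the attached testcase1.c.
 * Repeats "fork, exit in the child, waitpid in the parent" a fixed
 * number of times and returns how many times waitpid did not report
 * the child as exited with status 42. Returns -1 if fork itself fails.
 */
int count_waitpid_failures(int iterations)
{
    int failures = 0;

    assert(iterations >= 0);

    for (int i = 0; i < iterations; i++) {
        pid_t pid = fork();
        if (pid < 0) {
            perror("fork");
            return -1;
        }
        if (pid == 0)
            _exit(42);          /* child: exit immediately */

        int status = 0;
        pid_t ret;
        do {                    /* parent: retry if interrupted by a signal */
            ret = waitpid(pid, &status, 0);
        } while (ret < 0 && errno == EINTR);

        if (ret != pid) {
            /* The reported symptom would show up here as ECHILD. */
            fprintf(stderr, "waitpid(%d) failed: %s\n",
                    (int)pid, strerror(errno));
            failures++;
        } else if (!WIFEXITED(status) || WEXITSTATUS(status) != 42) {
            failures++;
        }
    }
    return failures;
}
```

Calling count_waitpid_failures() from a small main() and running the resulting binary under "strace -f" would be one way to look for the ECHILD result; outside a tracer, on an unaffected kernel, the function is expected to return 0.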
Please find testcase attached.
Also please find testcase1.LOG attached.
The key part is here:
931 clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xf763dbd8) = 1048
1048 exit_group(42) = ?
931 waitpid(1048, <unfinished ...>
1048 +++ exited with 42 +++
931 <... waitpid resumed> 0xf763d3a0, 0) = -1 ECHILD (No child processes)
To complicate matters, this is observed only under the development
version of strace. Old (released) versions of strace do not
let ptraced processes die: they detach from them when
they think they are about to die (such as when they enter _exit()
or receive a "deadly" signal). This is an aesthetically horrible and
logically buggy (racy) hack, so we are removing it from strace.
Łukasz says that old strace versions (the ones which still use the hack)
don't trigger the bug.
For testing, I will send you the strace source tree and a pre-compiled
strace binary in a separate email. Alternatively, pull the latest
strace git and build it with "autoreconf -fvi && ./configure && make".
--
vda
View attachment "testcase1.c" of type "text/x-csrc" (948 bytes)
Download attachment "testcase1.LOG.bz2" of type "application/x-bzip2" (3172 bytes)