Message-ID: <4C2620F4.7030900@colorfullife.com>
Date: Sat, 26 Jun 2010 17:47:00 +0200
From: Manfred Spraul <manfred@...orfullife.com>
To: Luca Tettamanti <kronos.it@...il.com>
CC: Christoph Lameter <cl@...ux-foundation.org>,
linux-kernel@...r.kernel.org, Julia Lawall <julia@...u.dk>,
Andrew Morton <akpm@...ux-foundation.org>,
maciej.rutecki@...il.com
Subject: 2.6.35-rc3: System unresponsive under load
Hi Luca,
On 06/26/2010 02:52 PM, Luca Tettamanti wrote:
> They don't seem really hung as before, I see two different behaviours:
> * Near the end of the run ab is frozen for a few seconds, but in the
> end all requests are processed; however I see a few "length" errors,
> meaning that the received page does not match the expected content
> (I'm testing a static page):
>
>
That's consistent with what I see:
If I run:
#./semtimedop 100 100&
#./semtimedop 100 100&
#./semtimedop 100 100&
#./semtimedop 100 100&
(i.e. four concurrent instances of the attached test app), then the system
sometimes locks up for 10 to 20 seconds:
The keyboard is unresponsive; not even the Num Lock key is processed
(i.e. the LED no longer changes).
After 10 to 20 seconds the keyboard reacts again (both to <enter> and
to Num Lock).
The stock Fedora 13 kernel (2.6.33.5) does not exhibit this behavior.
The load average is 300 or so, which is expected.
I have no idea why this happens or how to debug it.
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
> strace on apache shows:
> [pid 3787] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
> [pid 3789] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
> [pid 3788] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
> [pid 3784] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
> [pid 3783] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
> [pid 3782] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
> [pid 3239] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
> [pid 3233] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
> [pid 3238] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
> [pid 3237] restart_syscall(<... resuming interrupted call ...>
>
That can't be semop:
the sysv ipc sem and msg calls are among the (broken) parts of the kernel
that do not honor SA_RESTART: an interrupted semop() fails with EINTR
instead of being restarted.
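
To illustrate the point, here is a minimal sketch (not the attached test app) that blocks in semop() and interrupts it with SIGALRM while the handler is installed with SA_RESTART. The function name semop_errno_after_signal, the semun_arg union and the 100 ms timer are just choices made for this example, and it assumes Linux, where the caller has to define the semctl() argument union itself. On such a system I would expect it to report EINTR rather than restart the call:

```cpp
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/time.h>
#include <signal.h>
#include <cerrno>
#include <cstring>

// Argument union for semctl(); glibc on Linux leaves it to the caller.
// (Hypothetical name, chosen to avoid clashing with a system "semun".)
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

// Empty handler: its only job is to interrupt the blocking semop().
static void on_alarm(int) {}

// Returns the errno left by a semop() that was interrupted by SIGALRM
// (0 if semop() unexpectedly succeeded), or -1 if the setup failed.
int semop_errno_after_signal()
{
    // Private set with a single semaphore, explicitly set to 0.
    int id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (id < 0)
        return -1;
    union semun_arg arg;
    arg.val = 0;
    if (semctl(id, 0, SETVAL, arg) < 0) {
        semctl(id, 0, IPC_RMID);
        return -1;
    }

    // Install the handler WITH SA_RESTART, remembering the old action.
    struct sigaction sa, old;
    std::memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_alarm;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &old) < 0) {
        semctl(id, 0, IPC_RMID);
        return -1;
    }

    // Fire SIGALRM every 100 ms, so a signal that arrives before
    // semop() has blocked cannot leave us waiting forever.
    struct itimerval tv = {};
    tv.it_value.tv_usec = 100000;
    tv.it_interval.tv_usec = 100000;
    setitimer(ITIMER_REAL, &tv, nullptr);

    // Decrement a semaphore whose value is 0: this blocks.
    struct sembuf op = {};
    op.sem_num = 0;
    op.sem_op = -1;
    op.sem_flg = 0;
    int rc = semop(id, &op, 1);
    int err = (rc < 0) ? errno : 0;

    // Disarm the timer, restore the old handler, remove the set.
    struct itimerval off = {};
    setitimer(ITIMER_REAL, &off, nullptr);
    sigaction(SIGALRM, &old, nullptr);
    semctl(id, 0, IPC_RMID);
    return err;
}
```

Calling semop_errno_after_signal() from a small main() and comparing the result with EINTR is enough to check this on a given kernel. A return of -1 only means that the semaphore or the signal handler could not be set up.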
--
Manfred
View attachment "semtimedop.cpp" of type "text/plain" (3320 bytes)