Message-ID: <4C2620F4.7030900@colorfullife.com>
Date:	Sat, 26 Jun 2010 17:47:00 +0200
From:	Manfred Spraul <manfred@...orfullife.com>
To:	Luca Tettamanti <kronos.it@...il.com>
CC:	Christoph Lameter <cl@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, Julia Lawall <julia@...u.dk>,
	Andrew Morton <akpm@...ux-foundation.org>,
	maciej.rutecki@...il.com
Subject: 2.6.35-rc3: System unresponsive under load

Hi Luca,

On 06/26/2010 02:52 PM, Luca Tettamanti wrote:
> They don't seem really hung as before, I see two different behaviours:
> * Near the end of the run ab is frozen for a few seconds, but in the
> end all requests are processed; however I see a few "length" errors,
> meaning that the received page does not match the expected content
> (I'm testing a static page):
>
>    
That's consistent with what I see.
If I run four instances of the attached test app concurrently:
#./semtimedop 100 100&
#./semtimedop 100 100&
#./semtimedop 100 100&
#./semtimedop 100 100&

then the system sometimes locks up for 10 to 20 seconds:
The keyboard is unresponsive; not even the Num Lock key is processed
(i.e. the LED no longer changes).
After those 10 to 20 seconds, the keyboard reacts again (both to <enter>
and to Num Lock).
The stock Fedora 13 kernel (2.6.33.5) does not exhibit this behavior.
The load average is around 300, which is expected.

I have no idea why this happens, or how to debug it.
The preemption-related config options are:
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set

> strace on apache shows:
> [pid  3787] restart_syscall(<... resuming interrupted call ...>  <unfinished ...>
> [pid  3789] restart_syscall(<... resuming interrupted call ...>  <unfinished ...>
> [pid  3788] restart_syscall(<... resuming interrupted call ...>  <unfinished ...>
> [pid  3784] restart_syscall(<... resuming interrupted call ...>  <unfinished ...>
> [pid  3783] restart_syscall(<... resuming interrupted call ...>  <unfinished ...>
> [pid  3782] restart_syscall(<... resuming interrupted call ...>  <unfinished ...>
> [pid  3239] restart_syscall(<... resuming interrupted call ...>  <unfinished ...>
> [pid  3233] restart_syscall(<... resuming interrupted call ...>  <unfinished ...>
> [pid  3238] restart_syscall(<... resuming interrupted call ...>  <unfinished ...>
> [pid  3237] restart_syscall(<... resuming interrupted call ...>
>    

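That can't be semop:
The sysv sem and msg calls are among the (broken) parts of the kernel
that do not honor SA_RESTART. An interrupted semop returns EINTR to user
space instead of being restarted, so it would not show up as
restart_syscall in strace.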
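
A minimal sketch of the SA_RESTART point above (this is not the attached
test app; the function name, the helper union and the 1 second alarm are
made up for illustration, and Linux behavior is assumed): a handler is
installed with SA_RESTART, then a semop that must block is interrupted
by SIGALRM. The call is expected to fail with EINTR rather than be
restarted.

```cpp
#include <cerrno>
#include <csignal>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper type: on Linux the caller has to define the
// argument union for semctl() itself.
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

// Empty handler: its only job is to interrupt the blocking semop().
static void on_alarm(int) {}

// Returns the errno left by a semop() that was interrupted by SIGALRM,
// 0 if semop() unexpectedly succeeded, or -1 if the setup failed.
int semop_errno_after_signal()
{
    // Private semaphore set with a single semaphore.
    int id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    // Force the value to 0 so that a decrement has to block.
    semun_arg arg;
    arg.val = 0;
    if (semctl(id, 0, SETVAL, arg) == -1) {
        semctl(id, 0, IPC_RMID);
        return -1;
    }

    // Install the handler with SA_RESTART explicitly requested.
    struct sigaction sa = {};
    sa.sa_handler = on_alarm;
    sa.sa_flags = SA_RESTART;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, nullptr) == -1) {
        semctl(id, 0, IPC_RMID);
        return -1;
    }

    alarm(1); // deliver SIGALRM while semop() is sleeping

    struct sembuf op = {};
    op.sem_num = 0;
    op.sem_op = -1; // decrement: blocks because the value is 0
    op.sem_flg = 0;

    int rc = semop(id, &op, 1);
    int err = (rc == -1) ? errno : 0;

    alarm(0);                // cancel the alarm if it has not fired
    semctl(id, 0, IPC_RMID); // remove the semaphore set again
    return err;
}
```

Calling semop_errno_after_signal() from a small main() and comparing the
result with EINTR is enough to check the behavior on a given kernel.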

--
     Manfred



View attachment "semtimedop.cpp" of type "text/plain" (3320 bytes)