Message-ID: <4A4A9DD6.8060800@anarazel.de>
Date:	Wed, 01 Jul 2009 01:20:54 +0200
From:	Andres Freund <andres@...razel.de>
To:	LKML <linux-kernel@...r.kernel.org>, netdev@...r.kernel.org,
	Jarek Poplawski <jarkao2@...il.com>,
	Stephen Hemminger <shemminger@...tta.com>,
	Patrick McHardy <kaber@...sh.net>
Subject: Soft-Lockup/Race in networking in 2.6.31-rc1+195 (possibly caused
  by netem)

Hi,

While playing around with netem (loss bursts based on time rather than
packet count) I experienced soft lockups several times. To rule out my
own modifications as the cause, I recompiled with the original code, and
it still locks up.
I captured several traces via netconsole, which thankfully kept working.
The simplest policy I could reproduce the error with was:
tc qdisc add dev eth0 root handle 1: netem delay 10ms loss 0

I could not reproduce the error without delay - but that may only be a
timing issue, as the host I was mainly transferring data to was on a
local network.
I could not reproduce the issue on lo.

The time to reproduce the error varied from seconds after executing tc 
to several minutes.
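
The reproduction steps above can be sketched as a small shell helper. This is
only an illustration, not something from the original report: the DEV, DELAY
and RUN variables and the two helper function names are assumptions made for
the example. The only command taken from the report is the `tc qdisc add` line.
By default the script just prints the commands; actually applying them needs
root and may lock up an affected kernel.

```shell
#!/bin/sh
# Sketch only: DEV, DELAY, RUN and the helper names are illustrative assumptions.
DEV="${DEV:-eth0}"
DELAY="${DELAY:-10ms}"

# Print the netem command from the report for a given device and delay.
netem_add_cmd() {
    echo "tc qdisc add dev $1 root handle 1: netem delay $2 loss 0"
}

# Print the command that removes the root qdisc again.
netem_del_cmd() {
    echo "tc qdisc del dev $1 root"
}

# Dry run by default; set RUN=1 (as root) to actually apply the qdisc.
if [ "${RUN:-0}" = "1" ]; then
    eval "$(netem_add_cmd "$DEV" "$DELAY")"
    # Show qdisc statistics so the netem instance can be confirmed.
    tc -s qdisc show dev "$DEV"
else
    netem_add_cmd "$DEV" "$DELAY"
    netem_del_cmd "$DEV"
fi
```

After applying the qdisc, generating traffic to a host on the local network
(as in the report) is what triggered the lockup; the removal command restores
the default qdisc afterwards.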

Traces 5+6 are made with vanilla 52989765629e7d182b4f146050ebba0abf2cb0b7

The earlier traces were made with parts of my patches applied and are
only included for completeness. I don't believe my modifications caused
this, but all the traces differ, so they may give some clues.

Lockdep was enabled but did not diagnose anything relevant (one dvb 
warning during bootup).

Any ideas for debugging?


Andres


PS: I could also reproduce the issue without netconsole, but for lack of
a serial console I could not capture a trace.

View attachment "config" of type "text/plain" (70501 bytes)

View attachment "debug_out6.txt" of type "text/plain" (12530 bytes)

View attachment "debug_out5.txt" of type "text/plain" (10478 bytes)

View attachment "debug_out1.txt" of type "text/plain" (3127 bytes)

View attachment "debug_out2.txt" of type "text/plain" (8408 bytes)

View attachment "debug_out3.txt" of type "text/plain" (9665 bytes)

View attachment "debug_out4.txt" of type "text/plain" (9082 bytes)
