Message-ID: <4A4BD384.3090407@anarazel.de>
Date:	Wed, 01 Jul 2009 23:22:12 +0200
From:	Andres Freund <andres@...razel.de>
To:	Jarek Poplawski <jarkao2@...il.com>
CC:	LKML <linux-kernel@...r.kernel.org>, netdev@...r.kernel.org,
	Stephen Hemminger <shemminger@...tta.com>,
	Patrick McHardy <kaber@...sh.net>
Subject: Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 (possibly caused
    by netem)

Hi,

On 07/01/2009 08:39 PM, Jarek Poplawski wrote:
> Andres Freund wrote, On 07/01/2009 01:20 AM:
>> While playing around with netem (time-based rather than packet-count-based
>> loss bursts) I experienced soft lockups several times. To rule out my own
>> modifications as the cause, I recompiled with the original code, and it
>> is still locking up.
>> I captured several of those traces via the thankfully
>> still working netconsole.
>> The simplest policy I could reproduce the error with was:
>> tc qdisc add dev eth0 root handle 1: netem delay 10ms loss 0
>>
>> I could not reproduce the error without delay - but that may only be a
>> timing issue, as the host I was mainly transferring data to was on a
>> local network.
>> I could not reproduce the issue on lo.
>>
>> The time to reproduce the error varied from seconds after executing tc
>> to several minutes.
>>
>> Traces 5+6 are made with vanilla 52989765629e7d182b4f146050ebba0abf2cb0b7
>>
>> The earlier traces are made with parts of my patches applied, and only
>> included for completeness as I don't believe my modifications were
>> causing this and all traces are different, so it may give some clues.
>>
>> Lockdep was enabled but did not diagnose anything relevant (one dvb
>> warning during bootup).
>>
>> Any ideas for debugging?
>
> Maybe these traces will be enough, but a lockdep report could save time.
> If the dvb warning triggers every time, then lockdep probably turns off
> just after it (it works this way, unless something was changed). So,
> could you try to repeat this without dvb? Btw., did you try this on
> some earlier kernel?
Yes. Today I could not reproduce it on 2.6.30, but I could on
current git.

I *think* I could also provoke the same issue on lo, but I am not
completely sure: the host I was redirecting netconsole to was
unfortunately not up, so I could not check whether the trace was similar.
It could also have been triggered by some random traffic on eth0. Hard
to say.

Will try without dvb.

Andres