Message-ID: <Pine.LNX.4.64.0903280031080.5904@wrl-59.cs.helsinki.fi>
Date:	Sat, 28 Mar 2009 01:05:09 +0200 (EET)
From:	"Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
To:	Markus Trippelsdorf <markus@...ppelsdorf.de>
cc:	Netdev <netdev@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: WARNING: at net/ipv4/tcp_input.c:2927 tcp_ack+0xd55/0x1991()

On Fri, 27 Mar 2009, Markus Trippelsdorf wrote:

> I'm running the latest git kernel (2.6.29-03321-gbe0ea69) and I've got
> this warning twice in the last few hours:

What did you run previously?

> Mar 27 21:37:00 [kernel] ------------[ cut here ]------------
> Mar 27 21:37:00 [kernel] WARNING: at net/ipv4/tcp_input.c:2927 tcp_ack+0xd55/0x1991()

This one may or may not be a new one... Ever since the warning was added 
it has been seen, and some of those miscounts got tracked down, but there 
is still something remaining (and that has been the state for a couple of 
versions already). It seems to require some particularly hard-to-reproduce 
network behavior that people usually hit once in a lifetime. However, those 
miscounts alone should not cause crashes; a stalled TCP connection at 
worst, but even that is quite unlikely to happen if fackets_out was not 
counted right.

> The machine hangs afterwards.

Are you sure it is really related to the warning? I find it hard to 
believe...

We even fixed that miscount for you when the warning was printed out (and 
the miscount alone wouldn't be able to cause a crash anyway). Obviously 
something could have gotten broken, but reading through all the post-2.6.29 
TCP material doesn't reveal anything particularly suspicious or even 
tricky... The only thing remotely related to the warning that gets printed 
out is d3d2ae454501a4dec360995649e1b002a2ad90c5, but even that stands on a 
very strong foundation, as it does not have any potential to introduce 
stale references; the rest of its effects would be just a stalled TCP 
connection at worst.
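
To illustrate the "warn, then fix the miscount" behavior described above, 
here is a minimal user-space sketch. It is not the kernel code: struct 
counters and verify_and_fix() are hypothetical names, and the condition 
(fackets_out nonzero while sacked_out is zero) is an assumption about what 
the check tests, based on the fackets_out miscount mentioned earlier.

```c
#include <stdio.h>

/* Hypothetical stand-in for the few TCP counters discussed here. */
struct counters {
	unsigned int packets_out;  /* segments currently in flight */
	unsigned int sacked_out;   /* segments SACKed by the receiver */
	unsigned int fackets_out;  /* segments up to the highest SACKed one */
};

/*
 * Warn about an inconsistent counter and reset it on the spot.
 * Returns the number of counters that had to be repaired.
 */
static int verify_and_fix(struct counters *c)
{
	int fixed = 0;

	/* Nothing in flight, so nothing can be SACKed. */
	if (!c->packets_out && c->sacked_out) {
		fprintf(stderr, "WARNING: sacked_out=%u but packets_out=0\n",
			c->sacked_out);
		c->sacked_out = 0;
		fixed++;
	}

	/* Nothing SACKed, so the forward-ACK count must be zero too. */
	if (!c->sacked_out && c->fackets_out) {
		fprintf(stderr, "WARNING: fackets_out=%u but sacked_out=0\n",
			c->fackets_out);
		c->fackets_out = 0;
		fixed++;
	}

	return fixed;
}
```

Once such a check has fired and reset the counter, a second call on the same 
state finds nothing left to repair, which is why the miscount by itself 
should not carry over into later processing.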

Please enable some debugging options, at least lockdep 
(CONFIG_PROVE_LOCKING) and the soft lockup detector 
(CONFIG_DETECT_SOFTLOCKUP), to find out whether we can get some info about 
where it actually hangs; some other debug options might also end up being 
useful.
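
As a rough sketch, the corresponding .config fragment would look something 
like the following. The CONFIG_DEBUG_KERNEL line is an assumption on my part 
(both options normally sit under "Kernel hacking" and depend on it); the 
exact dependencies can vary by architecture and kernel version.

```
# Assumed prerequisite for the options below
CONFIG_DEBUG_KERNEL=y
# lockdep: prove locking correctness
CONFIG_PROVE_LOCKING=y
# soft lockup detector
CONFIG_DETECT_SOFTLOCKUP=y
```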

Thanks for the report.

-- 
 i.