Date:	Mon, 30 Mar 2009 21:52:55 +0300 (EEST)
From:	"Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
To:	Markus Trippelsdorf <markus@...ppelsdorf.de>
cc:	Netdev <netdev@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>, corbet@....net
Subject: Re: WARNING: at net/ipv4/tcp_input.c:2927 tcp_ack+0xd55/0x1991()

On Mon, 30 Mar 2009, Markus Trippelsdorf wrote:

> On Mon, Mar 30, 2009 at 07:01:22PM +0300, Ilpo Järvinen wrote:
> > On Sat, 28 Mar 2009, Markus Trippelsdorf wrote:
> > > On Sat, Mar 28, 2009 at 10:29:58AM +0200, Ilpo Järvinen wrote:
> > > > On Sat, 28 Mar 2009, Markus Trippelsdorf wrote:
> > > > > On Sat, Mar 28, 2009 at 01:05:09AM +0200, Ilpo Järvinen wrote:
> > > > > > On Fri, 27 Mar 2009, Markus Trippelsdorf wrote:
> > > > > > 
> > > > > > > I'm running the latest git kernel (2.6.29-03321-gbe0ea69) and I've got
> > > > > > > this warning twice in the last few hours:
> > 
> > > > > > > The machine hangs afterwards.
> > > > > > 
> > > > > > Is it really related to the warning for sure? I find it hard to 
> > > > > > believe...
> > > > > 
> > > > > The machine is normally running stable for days. Switching back to 2.6.29
> > > > > solves the problem...
> > > > 
> > > > Sure, but does it hang right after printing that warning or much later on,
> > > > e.g., one minute is already a very long time for the crash to be related 
> > > > to that warning... Even 5 seconds is a long time but I'd immediately say 
> > > > it's not related then :-).
> > > 
> > > I really can't tell you. In both occurrences of the warning the machine
> > > was already unusable when I noticed. I then rebooted and the last entry
> > > in the logs was that warning.
> > 
> > ...And, let me guess, you're in X and therefore unable to catch a final 
> > oops if any would be printed? It would be nice to get around that as well, 
> > either use serial/netconsole or hang in text mode while waiting for the 
> > crash (shouldn't be too hard if you are able to set up the workload first 
> > and then switch away from X and if reproducing takes about an hour)...
> 
> OK, I will try this later.

Let's hope that gives some clue about where it ends up going boom (if it is 
caused by TCP we certainly should see something more sensible on the console 
than just a hang)... I once again read through the TCP commits but just 
cannot find anything that could cause a fackets_out miscount, not to speak 
of crash-prone changes, so we'll just have to wait and see...

> > Arguably the presence of that warning at both times is somewhat alarming.
> 
> Make that four times:
> 
> kernel # for i in log*; do grep "tcp_input" "$i"; done
> Mar 27 19:57:40 [kernel] WARNING: at net/ipv4/tcp_input.c:2927 tcp_ack+0xd55/0x1991()
> Mar 27 21:37:00 [kernel] WARNING: at net/ipv4/tcp_input.c:2927 tcp_ack+0xd55/0x1991()
> Mar 28 18:41:03 [kernel] WARNING: at net/ipv4/tcp_input.c:2927 tcp_ack+0xd55/0x1994()
> Mar 29 11:32:26 [kernel] WARNING: at net/ipv4/tcp_input.c:2927 tcp_ack+0xd5b/0x19a9()
> 
> The symptoms are always the same...

And the first logged entries date to something much earlier?
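
In case it saves some eyeballing, here is a rough sketch of how one could pull both pieces of information out of a log at once: the timestamp of the very first entry, and every tcp_input warning with its offset. It assumes the syslog line format shown in your grep output above; the function name summarize() and the regular expression are made up for illustration, nothing here comes from an existing tool.

```python
import re
import sys

# Assumed line format, taken from the grep output quoted above:
#   Mar 27 19:57:40 [kernel] WARNING: at net/ipv4/tcp_input.c:2927 tcp_ack+0xd55/0x1991()
WARN_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) \[kernel\] WARNING: at "
    r"(?P<file>\S+):(?P<line>\d+) "
    r"(?P<sym>\w+)\+(?P<off>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)\(\)"
)


def summarize(lines, needle="tcp_input"):
    """Return (first_stamp, warnings) for an iterable of log lines.

    first_stamp is the 15-character syslog timestamp of the first
    non-empty line (None if there is none). warnings is a list of
    (timestamp, symbol, offset) tuples for WARNING lines whose source
    file name contains `needle`.
    """
    first_stamp = None
    warnings = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if first_stamp is None:
            # "Mmm dd hh:mm:ss" is always 15 characters in classic syslog.
            first_stamp = line[:15]
        m = WARN_RE.match(line)
        if m and needle in m.group("file"):
            warnings.append((m.group("stamp"), m.group("sym"), m.group("off")))
    return first_stamp, warnings


if __name__ == "__main__":
    # Usage (hypothetical file names): python summarize.py log1 log2 ...
    for path in sys.argv[1:]:
        with open(path, errors="replace") as fh:
            first, warns = summarize(fh)
        print(path, "first entry:", first)
        for stamp, sym, off in warns:
            print("   ", stamp, sym, off)
```

If the first entry in each log is hours or days older than the first warning, that would at least tell us the warning does not fire right after boot.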

> Another observation is that all four WARNINGs involve only one of my
> two network adapters, namely skge (built into my mobo). The other one
> (r8169) never occurs in the call trace.

Nothing interesting seems to have happened to skge in the v2.6.29.. 
timeframe... :-(

> I will also try to reproduce the WARNING with Jon's patch applied.

Thanks.

-- 
 i.
