Date:	Fri, 29 Feb 2008 14:24:59 +0200 (EET)
From:	"Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
To:	Bill Fink <billfink@...dspring.com>,
	Andrew Morton <akpm@...ux-foundation.org>
cc:	Guillaume Chazarain <guichaz@...il.com>,
	Giangiacomo Mariotti <giangiacomo_mariotti@...oo.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Netdev <netdev@...r.kernel.org>
Subject: Re: WARNING: at net/ipv4/tcp_input.c:2054 tcp_mark_head_lost()

On Thu, 28 Feb 2008, Bill Fink wrote:

> On Thu, 28 Feb 2008, Andrew Morton wrote:
> 
> > On Thu, 28 Feb 2008 10:22:27 +0200 (EET) "Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi> wrote:
> > 
> > > [PATCH] TCP debug S+L (for 2.6.25-rcs, incompatible with 2.6.24.y)
> > > 
> > > ---
> > >  include/net/tcp.h     |    9 +++-
> > >  net/ipv4/tcp_input.c  |   18 +++++++-
> > >  net/ipv4/tcp_ipv4.c   |  127 +++++++++++++++++++++++++++++++++++++++++++++++++
> > >  net/ipv4/tcp_output.c |   23 +++++++--
> > 
> > I'll put this in -mm, see if we can flush anything out.

Ok, thanks. Were you aware of the considerable CPU consumption it will 
cause...? I.e., scanning through the write queue in a number of places per 
ACK will certainly show up if somebody tests with netperf or so... ;-)
...Just please make sure it won't leak into mainline (I'm sure you would 
have done that even without this explicit note :-)).

The good thing about that debug patch is that it catches inconsistencies 
immediately when they happen, even if the cheap trap (which is in mainline) 
would never see them because the situation corrects itself due to 
some other event.

> > Please let me know if/when it's obsolete, updated, etc.

Ok. Since many seem to be reporting this now, I suppose the cause should 
be relatively easy to find.

> > What is "S+L"?
> 
> I'll let Ilpo give the definitive answer.  But to test if I'm starting
> to grasp this, I'll give my understanding.  I believe 'S' means that a
> transmitted TCP skb has been acknowledged by a SACK, while 'L' means
> that a transmitted SKB is believed lost.  Since the 'S' state implies
> that the packet has actually been successfully received, it should not be
> possible for it to be considered lost ('L' state).  Thus an "S+L" state
> for a TCP skb is an internally inconsistent state and an indication of
> a TCP bug.
> 
> Anyone feel free to correct me if I'm way off base in my understanding.

Yes, this is exactly what it means. There's a big comment about them in 
net/ipv4/tcp_input.c too. I answered a similar question earlier (but Bill 
has already covered most of it):
  http://marc.info/?l=linux-netdev&m=120099888912383&w=2

We can do only cheap checking for sacked_out+lost_out > packets_out in 
mainline, and if that's true those warnings get printed but they won't 
necessarily tell the location of the bug because there might be 
considerable "latency" before that check triggers. On the other hand, this 
S+L debug patch verifies skb's ->sacked bitmaps against sacked/lost_out 
counters in multiple places per ACK and will catch the inconsistencies 
immediately at the site where they occurred (even if sacked_out + lost_out 
would still be below or equal to packets_out).
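
To make the difference between the two checks concrete, here is a minimal 
userspace sketch. It is not the debug patch and not kernel code: the 
sketch_* names and SKETCH_* flag values are made up for illustration (the 
real flags are the TCPCB_* ones in include/net/tcp.h), the write queue is 
modelled as a plain array, and every segment is assumed to count as exactly 
one packet.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this sketch only. */
#define SKETCH_SACKED 0x01	/* 'S': segment was acknowledged by a SACK */
#define SKETCH_LOST   0x02	/* 'L': segment is believed lost */

/* One entry of a simplified write queue (assumption: one packet each). */
struct sketch_seg {
	unsigned char sacked;	/* bitmap of SKETCH_* flags */
};

/* The per-connection counters the checks compare against. */
struct sketch_sock {
	unsigned int packets_out;
	unsigned int sacked_out;
	unsigned int lost_out;
};

/* Cheap check: looks at the counters only, O(1) per call. */
static bool cheap_check_ok(const struct sketch_sock *tp)
{
	return tp->sacked_out + tp->lost_out <= tp->packets_out;
}

/*
 * Expensive check: walks the whole queue, rejects any segment that is
 * both S and L, and recounts the flags to compare with the counters.
 */
static bool full_check_ok(const struct sketch_sock *tp,
			  const struct sketch_seg *q, size_t n)
{
	unsigned int sacked = 0, lost = 0;

	for (size_t i = 0; i < n; i++) {
		bool s = q[i].sacked & SKETCH_SACKED;
		bool l = q[i].sacked & SKETCH_LOST;

		if (s && l)
			return false;	/* S+L: inconsistent state */
		sacked += s;
		lost += l;
	}
	return sacked == tp->sacked_out && lost == tp->lost_out;
}
```

With four segments outstanding, sacked_out = 2 and lost_out = 1, the cheap 
check still passes (2 + 1 <= 4) even if one segment carries both S and L, 
whereas the queue walk flags it right away. The price is the O(queue 
length) scan on each call, which is the CPU cost mentioned above.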



-- 
 i.
