netdev - Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround

Message-ID: <Pine.LNX.4.64.0808181042290.23854@wrl-59.cs.helsinki.fi>
Date:	Tue, 19 Aug 2008 13:38:35 +0300 (EEST)
From:	"Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
To:	"Dâniel Fraga" <fragabr@...il.com>
cc:	David Miller <davem@...emloft.net>, thomas.jarosch@...ra2net.com,
	billfink@...dspring.com, Netdev <netdev@...r.kernel.org>,
	Patrick Hardy <kaber@...sh.net>, sr@...urenet.de,
	netfilter-devel@...r.kernel.org, kadlec@...ckhole.kfki.hu
Subject: Re: [PATCH] tcp FRTO: in-order-only "TCP proxy" fragility workaround

On Sat, 16 Aug 2008, Dâniel Fraga wrote:

> On Sat, 16 Aug 2008 22:18:50 +0300 (EEST)
> "Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi> wrote:
> 
> > I'll look through the 2.6.24..25 history once I have some time to see
> > if there are some clues about the cause. I'm also having a problem in
> > figuring out why the frto patch you tested would solve this issue
> > (unless there are two issues in the picture).
> 
> 	Ok, surely some patch between .24 and .25 caused this. Or it's
> some bug that only "appeared" in .25 :)
> 
> 	In fact, the frto patch helped, but did not prevent the problem.
> I mean, it seems that with the frto patch, the problem doesn't happen
> frequently. And if I disable frto, the problem doesn't occur either.
> 
> 	But, maybe, we could be talking about another bug, completely
> unrelated to frto... I don't know. I'm just guessing ;). Anyway, we
> are talking about stalled connections ;)
>
> 	What I know is:
> 
> 1) what you wrote is right: 2.6.24 is fine, 2.6.25 and 2.6.26 not
> 
> 2) nmap -sS <server> seems to reset the connection (it's my workaround
> until now ;). Maybe the ping probe helps in some way? I don't know.

Perhaps, though it's not at all clear how it could do that...

> 	I want to help you as much as I can. So, ask anything you need.

I went through the TCP-related and inet_connection_sock-related changes
and could not notice anything obvious in there...

Do you have net namespaces enabled (CONFIG_NET_NS in .config)?

Are there any netfilter (iptables) rules on the server which could keep
those packets from reaching the TCP layer?

The MIB counters might give some clue as to why those segments didn't get
accepted. The most interesting ones are PAWSEstab, TCPAbortOnSyn and
InErrs. One can use /bin/cut to read those from the one-line files if one
wants to (however, I attached a script which transposes them to make them
somewhat human-readable). Having the /proc/net/tcp output from the server
during a stall would also be (have been) useful to reveal state info (but
I should have remembered to ask you to run it on both ends :-)).
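
The transposition of the MIB files can be sketched as follows. This is
only a minimal Python illustration, not the attached readmibs.sh. It
assumes the usual layout of /proc/net/netstat and /proc/net/snmp, where
each group is a pair of lines (a "Prefix: name ..." header followed by a
"Prefix: value ..." line). The function names transpose_mibs and
read_mibs are made up for this example.

```python
def transpose_mibs(text):
    """Turn header/value line pairs into {prefix: {counter: value}}."""
    tables = {}
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # Lines come in pairs: "Prefix: name1 name2 ..." then "Prefix: v1 v2 ..."
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        value_prefix, numbers = values.split(":", 1)
        if prefix != value_prefix:
            raise ValueError("mismatched pair: %r / %r" % (prefix, value_prefix))
        counters = zip(names.split(), (int(n) for n in numbers.split()))
        tables.setdefault(prefix, {}).update(counters)
    return tables


def read_mibs(paths=("/proc/net/netstat", "/proc/net/snmp")):
    """Read and merge the counters from the given files (Linux only)."""
    merged = {}
    for path in paths:
        with open(path) as f:
            for prefix, counters in transpose_mibs(f.read()).items():
                merged.setdefault(prefix, {}).update(counters)
    return merged
```

With this, read_mibs()["TcpExt"]["PAWSEstab"],
read_mibs()["TcpExt"]["TCPAbortOnSyn"] and read_mibs()["Tcp"]["InErrs"]
would give the three counters named above. Taking the readings before and
after a stall and comparing them is more telling than a single snapshot.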

Also, I wonder what that [|tcp] hides, e.g., "<nop,nop,timestamp 
15980976 70381399,nop,nop,[|tcp]>" in tcpdump (and that was for an ACK,
which doesn't make too much sense to me there). The marker appears
because the snaplen given to tcpdump is small enough that the TCP header
is only partially captured.
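
The truncation can be checked with a little arithmetic. The sketch below
is only an illustration under assumptions that are not stated in the
thread: Ethernet framing (14 bytes), an IPv4 header without options
(20 bytes), and a snaplen of 68 bytes, which was the default of older
tcpdump versions. The helper name captured_option_bytes is made up.

```python
ETH_HDR = 14        # assumed: Ethernet link-layer header
IPV4_HDR = 20       # assumed: IPv4 header without options
TCP_BASE_HDR = 20   # fixed part of the TCP header

# Sizes in bytes of the TCP options visible in the quoted tcpdump line.
OPTION_SIZES = {"nop": 1, "timestamp": 10}


def captured_option_bytes(snaplen, link_hdr=ETH_HDR, ip_hdr=IPV4_HDR):
    """Bytes of TCP options that fit inside the captured part of a frame."""
    return max(0, snaplen - link_hdr - ip_hdr - TCP_BASE_HDR)


# Options printed before the [|tcp] marker.
visible = ["nop", "nop", "timestamp", "nop", "nop"]
visible_bytes = sum(OPTION_SIZES[name] for name in visible)

print(captured_option_bytes(68), visible_bytes)
```

Under these assumptions both numbers come out as 14, so the capture ends
exactly after the second pair of nops and whatever option follows them is
cut off. A capture taken with a larger snaplen (tcpdump -s) would show it.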

-- 
 i.
Download attachment "readmibs.sh" of type "APPLICATION/X-SH" (793 bytes)