netdev - Re: [PATCH] tcp: fix tcp_retransmit_skb() to maintain MSS invariant

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <alpine.DEB.2.00.1203051051440.2966@wel-95.cs.helsinki.fi>
Date:	Mon, 5 Mar 2012 11:31:05 +0200 (EET)
From:	"Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
To:	Neal Cardwell <ncardwell@...gle.com>
cc:	Yuchung Cheng <ycheng@...gle.com>,
	David Miller <davem@...emloft.net>,
	Netdev <netdev@...r.kernel.org>,
	Nandita Dukkipati <nanditad@...gle.com>,
	Tom Herbert <therbert@...gle.com>
Subject: Re: [PATCH] tcp: fix tcp_retransmit_skb() to maintain MSS
 invariant

On Sat, 3 Mar 2012, Neal Cardwell wrote:

> On Fri, Mar 2, 2012 at 10:05 PM, Yuchung Cheng <ycheng@...gle.com> wrote:
> > On Fri, Mar 2, 2012 at 12:25 PM, Ilpo Järvinen
> > <ilpo.jarvinen@...sinki.fi> wrote:
> >> Why would we be splitting SACKed segment there?!? ...Ah, it seems that
> >> it indeed is possible now after the non-FACK mode fragment, however, it
> >> is totally useless work and seems to be breaking stuff. I think that the
> >> proper solution is to prevent splitting of already SACKed stuff. ...Nice
> >> find btw. :-)
> >
> > Bingo! I totally agree this is the bug. looks like a one-line patch now.
> 
> Agreed. I sent out a proposal:
>   http://patchwork.ozlabs.org/patch/144408/
> 
> Thanks, Ilpo, for filling in the details.

This unnecessary fragmenting probably also ate up some of the benefits 
from the recombining efforts done by shift/merge.

> The comment in
> tcp_match_skb_to_sack() that says "Round if necessary so that SACKs
> cover only full MSSes and/or the remaining small portion (if
> present)", plus the analogous MSS alignment logic in
> tcp_mss_split_point(), had me convinced that this MSS alignment
> invariant that was needed for tcp_match_skb_to_sack() to work
> correctly was actually intended.

tcp_mss_split_point is there to work around received window non-MSS 
oddities (when the whole window is almost fully utilized), plus some 
logic to enforce nagle marker skb. That stuff always works with unsent 
stuff anyway so it's much simpler to handle than skbs where we've already 
made some real commitments.

tcp_match_skb_to_sack tries to find the original mss boundaries that 
gso/tso generated and align into them if SACK range does contain 
non-aligned octets. Again, this is there to avoid pcount tweaks that would 
be necessary if we split to two skbs that have less than MSS sized tail 
part, and also to prevent receiver driven split to very small skbs 
attacks. ...Hopefully it also serves as an incentive to fix those 
middleboxes/end hosts which are in violation here by not maintaining the 
original packet boundaries :-).

However, now that I look it in this new light also tcp_match_skb_to_sack 
is subject to similar kind of bug, ie., it can fragment already 
shifted/merged skbs (although it's not that likely one would get a 
scenario to actually trigger that, requires partial re-SACK). Fixing this 
will be significantly more complicated though, and seems to require that 
sacktag logic inspects ->sacked much earlier making it again harder to 
follow :-(.

-- 
 i.