netdev - Re: [RFC PATCH net-2.6.25 uncompilable] [TCP]: Avoid breaking GSOed skbs when SACKed one-by-one

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Tue, 11 Dec 2007 04:32:39 -0800 (PST)
From:	David Miller <davem@...emloft.net>
To:	ilpo.jarvinen@...sinki.fi
Cc:	lachlan.andrew@...il.com, netdev@...r.kernel.org,
	quetchen@...tech.edu
Subject: Re: [RFC PATCH net-2.6.25 uncompilable] [TCP]: Avoid breaking
 GSOed skbs when SACKed one-by-one

From: "Ilpo_Järvinen" <ilpo.jarvinen@...sinki.fi>
Date: Tue, 11 Dec 2007 13:59:16 +0200 (EET)

> How about this...
> 
> ...I've left couple of FIXMEs there still, should be quite simple & 
> straightforward to handle them if this seems viable solution at all.
> 
> Beware, this doesn't even compile yet because not all parameters are 
> transferred currently (first_sack_index was killed earlier, I need to 
> summon it back for this). Also, I'd need to do the dirty work of kill 
> recv_sack_cache first to make this to not produce, well, interesting 
> effects due to missed SACK blocks... :-)

Interesting approach, but I think there is limited value to this
(arguably) complex form.

The core issue is that the data and the SACK state are maintained in
the same datastructure.  The complexity in all the state management
and fixups in your patch is purely because of this.

If we maintain SACK scoreboard information seperately, outside of
the SKB, then there are only two changes to make:

1) Every access to TCP_SKB_CB() SACK scoreboard is adjusted to
   new data structure.

2) Retransmit is adjusted so that it can retransmit an SKB
   constructed as a portion of an existing SKB.  Since TSO
   implies SG, this can be handled with simple offset and
   length arguments and suitable creation of a clone referencing
   the pages in the SG vector that contain the desired data.

I would envision this SACK state thing to reference into the
retransmit queue SKB's somehow.  Each SACK block could perhaps
look something like:

	struct sack_ref {
		struct sk_buff *start_skb;
		struct sk_buff *end_skb;
		unsigned int start_off;
		unsigned int len;
	};

Traditionally we've prioritized the design of the SKB and other
infrastructure to suit TCP optimally and I still think we should
operate that way.

Therefore, long term, it is time to make a formal data blob to assist
with all of the repetitive work we do in cases like this and horribly
inefficient places like clean_rtx_queue().

So I'm basically advocating a two-pronged approach to this, the
seperate SACK scoreboard datastructure and the data blob.  I
think we can work on the former right now, and take our time with
the data blob because it requires lots of thinking and we should
get it right as it might have network driver interface implications.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html