lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20140507134620.GF4686@cpaasch-mac>
Date:	Wed, 7 May 2014 15:46:20 +0200
From:	Christoph Paasch <christoph.paasch@...ouvain.be>
To:	Octavian Purdila <octavian.purdila@...el.com>
Cc:	David Miller <davem@...emloft.net>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [RFC] tcp: add support for scheduling TCP options on TCP sockets

Hello Octavian,

On 07/05/14 - 07:30:23, Octavian Purdila wrote:
> On Wed, May 7, 2014 at 8:38 AM, David Miller <davem@...emloft.net> wrote:
> > Sorry I don't like this.
> >
> > Walking a linked list unnecessary is going to add overhead to every
> > single packet transmission.  I think more people want our TCP stack to
> > be fast (everyone) than those who want option processing to be
> > abstracted enough to be modular (you).
> >
> > Just make the intrusive changes, they are necessary as they force you
> > to think fully about how one option might interact with another.
> >
> 
> Unfortunately skb_tcp_cb does not have enough space to hold
> information for new large options. To work around that, the MPTCP
> implementation is pushing the option data in the skb and then
> occasionally uses the following when the pskb_copy is used:
> 
> -               else
> +               if (unlikely(skb_cloned(skb))) {
> +                       struct sk_buff *newskb;
> +                       if (mptcp_is_data_seq(skb))
> +                               skb_push(skb, MPTCP_SUB_LEN_DSS_ALIGN +
> +                                             MPTCP_SUB_LEN_ACK_ALIGN +
> +                                             MPTCP_SUB_LEN_SEQ_ALIGN);
> +
> +                       newskb = pskb_copy(skb, gfp_mask);
> +
> +                       if (mptcp_is_data_seq(skb)) {
> +                               skb_pull(skb, MPTCP_SUB_LEN_DSS_ALIGN +
> +                                             MPTCP_SUB_LEN_ACK_ALIGN +
> +                                             MPTCP_SUB_LEN_SEQ_ALIGN);
> +                               if (newskb)
> +                                       skb_pull(newskb,
> MPTCP_SUB_LEN_DSS_ALIGN +
> +
> MPTCP_SUB_LEN_ACK_ALIGN +
> +
> MPTCP_SUB_LEN_SEQ_ALIGN);
> +                       }
> +                       skb = newskb;
> +               } else {
>                         skb = skb_clone(skb, gfp_mask);
> +               }
> 
> MPTCP has many other intrusive changes in the TCP stack. To avoid that
> complexity, we could do the bulk of the implementation in a separate
> layer, on top of TCP. But we would need a mechanism to pass the
> options down to the TCP layer somehow.

Why not extend the head-space of the linear data of the skb as we discussed
already previously on mptcp-dev? Just in a similar way as 'struct can_skb_priv'
is being used. This would avoid this expensive list-processing and clean up
the above example you give.

Or did something else prevented to do it in such a way?

> > I also disagree with the "if the option doesn't fit, send it in the
> > next packet" idea.  Where did that come from?
> >
> > For SACK for example, that doesn't make any sense, and it's SACK
> > that usually can put us past the amount of space available.  For
> > SACK the thing to do is send the SACK information for the area
> > closest to what we've fully ACKd and just forget about advertising
> > the rest of the SACK blocks.
> 
> It is useful for MPTCP. A few MPTCP options are 12-20 bytes but if you
> already have timestamps and SACK in the packet there is not enough
> space to include it. However, MPTCP does not bound you to sent the
> options immediately, they can be sent at a later time.

Be careful which options you defer to a later time.

For the DSS-option it is better to temporarily reduce the number of 
SACK-blocks to 2 instead of ommitting the DSS-option. Otherwise, you
need to support at the receiver late receival of DSS-options inside pure acks 
(probably out-of-order) - and thus be able to store a list of DSS-options.

Only ADD_ADDR/REMOVE_ADDR and MP_PRIO should be allowed to be sent at a
'later' time. But for them you can also explicitly schedule an ack.


Cheers,
Christoph

> The patch may look like an "abstraction bloat", but it is just a way
> of trying to to keep adding more complexity in the TCP stack when
> implementing MPTCP. If people have other ideas of how can we
> accomplish that, so that we can integrate MPTCP upstream, I am keenly
> looking for suggestions.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ