Message-id: <20181213170021.GZ41383@MacBook-Pro-19.local>
Date: Thu, 13 Dec 2018 09:00:21 -0800
From: Christoph Paasch <cpaasch@...le.com>
To: Florian Westphal <fw@...len.de>
Cc: Eric Dumazet <eric.dumazet@...il.com>, netdev@...r.kernel.org,
peter.krystad@...el.com, mathew.j.martineau@...ux.intel.com
Subject: Re: [PATCH net-next 02/13] sk_buff: add skb extension infrastructure
On 13/12/18 - 11:39:18, Florian Westphal wrote:
> Eric Dumazet <eric.dumazet@...il.com> wrote:
> > > If it's going to be used as I expect, then the extension could be
> > > discarded after the DSS mapping has been written to the tcp option
> > > space, i.e. before cloning occurs.
> >
> > I do not see how this would work, without also discarding on the master skb
> > the needed info.
>
> Ok, so let's assume this would result in one atomic_inc/dec due to clone
> for now for skbs coming from mptcp socket.
>
> But I don't see why this would have to be.
>
> > > For TCP, that's true. But there are other places that could clone, e.g.
> > > when bridge has to flood-forward.
> > >
> >
> > So you propose a mechanism that forces a preserve on clone, based on existing needs
> > for bridging.
>
> secpath does the same thing:
>
> static void __copy_skb_header(struct sk_buff *new, const struct sk_buff *old)
> {
> ...
> #ifdef CONFIG_XFRM
> new->sp = secpath_get(old->sp);
> #endif
> ...
>
> So I am not proposing anything new.
>
> > > At least in bridge case the 'preserve on clone' is needed, else required
> > > information is missing from the cloned skb.
> > >
> >
> > We need something where MPTCP info does not need to be propagated all the way to the NIC...
>
> That's what's done in the MPTCP out-of-tree implementation, but I don't
> think it's needed.

Yes, it indeed does not need to go all the way down to the NIC.

The info basically "just" needs to be propagated from the MPTCP-layer down
to the TCP-option space. Thus, it needs to remain on the skbs that are
sitting in the TCP-subflow's send-queue and rexmit tree, as we need it when
retransmitting.

In tcp_transmit_skb, the clone is done at the beginning. Thus, we could, for
example, not increment the refcount on the clone and simply pass a pointer to
the original skb to tcp_established_options.

That way the DSS option stays within the MPTCP/TCP layer and does not
make it down to the NIC.
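
To make the idea above concrete, here is a rough user-space sketch in plain C.
It is not kernel code: struct toy_skb, struct toy_ext and every toy_* function
are hypothetical stand-ins that I made up for illustration, and the 8-byte
"option" is only a placeholder for the real DSS encoding. It contrasts a clone
that takes a reference on the extension (the way secpath_get does in the
snippet quoted above) with a clone that carries no extension, where the
mapping is read from the original skb when the options are written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-ins only; these are NOT the kernel's sk_buff or skb extension. */
struct toy_ext {
	int refcnt;                 /* plain int here, atomic in real code */
	unsigned long long dss_seq; /* placeholder for the DSS mapping */
};

struct toy_skb {
	struct toy_ext *ext;        /* hypothetical extension pointer */
	unsigned char opts[8];      /* stand-in for the TCP option space */
	int opt_len;
};

static struct toy_ext *toy_ext_get(struct toy_ext *e)
{
	if (e)
		e->refcnt++;
	return e;
}

static void toy_ext_put(struct toy_ext *e)
{
	if (e && --e->refcnt == 0)
		free(e);
}

/* "Preserve on clone": the clone shares the extension and takes a reference. */
static struct toy_skb toy_clone_preserve(const struct toy_skb *orig)
{
	struct toy_skb c = { 0 };

	c.ext = toy_ext_get(orig->ext);
	return c;
}

/* Alternative: the clone carries no extension, so no refcount traffic. */
static struct toy_skb toy_clone_strip(const struct toy_skb *orig)
{
	struct toy_skb c = { 0 };

	(void)orig;
	return c;
}

/* Write the mapping into the clone's options, reading it from the original. */
static void toy_write_options(struct toy_skb *clone, const struct toy_skb *orig)
{
	unsigned long long seq;
	int i;

	if (!orig->ext)
		return;
	seq = orig->ext->dss_seq;
	for (i = 0; i < 8; i++)
		clone->opts[i] = (unsigned char)(seq >> (56 - 8 * i));
	clone->opt_len = 8;
}

/* Toy transmit path: clone first, then build options from the original. */
static struct toy_skb toy_transmit(const struct toy_skb *orig)
{
	struct toy_skb clone = toy_clone_strip(orig);

	toy_write_options(&clone, orig);
	assert(clone.ext == NULL); /* nothing sticky travels further down */
	return clone;
}
```

In this toy model the original keeps its extension (so a retransmit can read
the mapping again), while the clone handed to the lower layers only carries
the bytes already written into its option space.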
Christoph
>
> It could just delete the extension before ->queue_xmit() AFAIU.
>
> > This skb extension is an incentive for adding more sticky things in the skbs
> > to violate layering of networking stacks :/
>
> 8-(
>
> Where do you see "layering violations"?