[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20130424091236.GA22998@verge.net.au>
Date: Wed, 24 Apr 2013 18:12:36 +0900
From: Simon Horman <horms@...ge.net.au>
To: Jesse Gross <jesse@...ira.com>
Cc: "Rajahalme, Jarno (NSN - FI/Espoo)" <jarno.rajahalme@....com>,
"<dev@...nvswitch.org>" <dev@...nvswitch.org>,
"<netdev@...r.kernel.org>" <netdev@...r.kernel.org>,
Ravi K <rkerur@...il.com>,
Isaku Yamahata <yamahata@...inux.co.jp>,
Ben Pfaff <blp@...ira.com>
Subject: Re: [PATCH v2.26] datapath: Add basic MPLS support to kernel
On Tue, Apr 23, 2013 at 06:47:36PM -0700, Jesse Gross wrote:
> On Tue, Apr 23, 2013 at 12:58 AM, Simon Horman <horms@...ge.net.au> wrote:
> > On Tue, Apr 23, 2013 at 07:41:37AM +0000, Rajahalme, Jarno (NSN - FI/Espoo) wrote:
> >>
> >> On Apr 23, 2013, at 4:51 , ext Simon Horman wrote:
> >>
> >> > On Mon, Apr 22, 2013 at 01:55:43PM +0000, Rajahalme, Jarno (NSN - FI/Espoo) wrote:
> >> >>
> >> >> Here the skb_network_header is changed to point to the L3 header. Is it
> >> >> significant that in some cases (?) mpls_stack_depth may remain at zero,
> >> >> even when a MPLS header was in fact added? (See above).
> >> >
> >> > With the current code I believe there are the following cases:
> >> >
> >> > Input: non-MPLS skb: Output: network header and mac_len correspond to the
> >> > beginning of the L3 headers
> >> > Input: MPLS: Output: network header and mac_len correspond to the
> >> > end of the L2 headers.
> >> >
> >> > In the case of MPLS output the end of the L2 headers and the beginning
> >> > of the L3 headers will differ.
> >> >
> >> >
> >> > As far as I know the network header and mac_len only need to correspond to
> >> > the beginning of the L3 header if GSO segmentation will occur (actually,
> >> > some proposed changes to the network stack are required, see "[PATCH 0/2]
> >> > Small Modifications to GSO to allow segmentation of MPLS"). That only
> >> > occurs if the skb is GSO. Which in turn can only occur if the recieved
> >> > packet is non-MPLS. This is because the linux kernel doesn't support
> >> > MPLS offloads on receive (or anywhere else for that matter).
> >> >
> >> > In the case that we have a non-MPLS skb the stack depth starts at zero and
> >> > is tracked. This is used to update the network header and mac_len.
> >> > Otherwise the stack depth is unknown and the network header and mac_len are
> >> > left as-is, corresponding to the end of the L2 headers.
> >> >
> >> > Actually, it is possible to tighten up the if clause to be the following,
> >> > as it is only necessary to update the network header and mac_len for GSO skbs.
> >> >
> >> > if (mpls_stack_depth && skb_is_gso(skb)) {
> >> > ...
> >> > }
> >> >
> >> > It is possible for us to find and track the MPLS stack depth for all cases
> >> > and to update the network header and mac_len. However I don't think that
> >> > there is any run-time benefit and it seems expensive to find out what the
> >> > original stack depth was - I believe it would require parsing the MPLS
> >> > entire stack for each packet.
> >> >
> >>
> >> Thanks for explaining this.
> >>
> >> I think it would be better to keep updating the the network_header and mac_len for the Non-MPLS input packets regardless of the GSO status of the skb. It would be more consistent and less surprising.
> >
> > I agree entirely that it would be more consistent and less surprising.
> > But I'm not sure if the cost is worth it.
> >
> > Jesse, do you have an opinion on this?
>
> In general, I would tend to agree with Jarno that keeping this
> consistent would be significantly easier to understand. I think the
> cost is probably not particularly high.
I think it would be become high for large MPLS stack depths.
But I'm happy to wear that if you are.
> However, I also think that having different meanings for the layer
> pointers inside and outside of OVS is not particularly ideal since it
> makes the overall system harder to understand. Using network header
> for the start of the MPLS stack might not be great since it means that
> we couldn't really take advantage of any actual hardware offloading in
> the future. Maybe we could use mac_len for that purpose and that would
> keep things more consistent?
To clarify, your suggestion is:
mac_len: corresponds to the top of the MPLS stack
network_header: corresponds to the bottom of the MPLS stack
If so, yes I think that could work and I will see about making it so.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists