[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEP_g=8DkttniMXMP4N691CbGKo-ACqNsOJ+RwOAA8F101h9Uw@mail.gmail.com>
Date: Wed, 24 Apr 2013 10:51:37 -0700
From: Jesse Gross <jesse@...ira.com>
To: Simon Horman <horms@...ge.net.au>
Cc: "Rajahalme, Jarno (NSN - FI/Espoo)" <jarno.rajahalme@....com>,
"<dev@...nvswitch.org>" <dev@...nvswitch.org>,
"<netdev@...r.kernel.org>" <netdev@...r.kernel.org>,
Ravi K <rkerur@...il.com>,
Isaku Yamahata <yamahata@...inux.co.jp>,
Ben Pfaff <blp@...ira.com>
Subject: Re: [PATCH v2.26] datapath: Add basic MPLS support to kernel
On Wed, Apr 24, 2013 at 2:12 AM, Simon Horman <horms@...ge.net.au> wrote:
> On Tue, Apr 23, 2013 at 06:47:36PM -0700, Jesse Gross wrote:
>> On Tue, Apr 23, 2013 at 12:58 AM, Simon Horman <horms@...ge.net.au> wrote:
>> > On Tue, Apr 23, 2013 at 07:41:37AM +0000, Rajahalme, Jarno (NSN - FI/Espoo) wrote:
>> >>
>> >> On Apr 23, 2013, at 4:51 , ext Simon Horman wrote:
>> >>
>> >> > On Mon, Apr 22, 2013 at 01:55:43PM +0000, Rajahalme, Jarno (NSN - FI/Espoo) wrote:
>> >> >>
>> >> >> Here the skb_network_header is changed to point to the L3 header. Is it
>> >> >> significant that in some cases (?) mpls_stack_depth may remain at zero,
>> >> >> even when a MPLS header was in fact added? (See above).
>> >> >
>> >> > With the current code I believe there are the following cases:
>> >> >
>> >> > Input: non-MPLS skb: Output: network header and mac_len correspond to the
>> >> > beginning of the L3 headers
>> >> > Input: MPLS: Output: network header and mac_len correspond to the
>> >> > end of the L2 headers.
>> >> >
>> >> > In the case of MPLS output the end of the L2 headers and the beginning
>> >> > of the L3 headers will differ.
>> >> >
>> >> >
>> >> > As far as I know the network header and mac_len only need to correspond to
>> >> > the beginning of the L3 header if GSO segmentation will occur (actually,
>> >> > some proposed changes to the network stack are required, see "[PATCH 0/2]
>> >> > Small Modifications to GSO to allow segmentation of MPLS"). That only
>> >> > occurs if the skb is GSO. Which in turn can only occur if the recieved
>> >> > packet is non-MPLS. This is because the linux kernel doesn't support
>> >> > MPLS offloads on receive (or anywhere else for that matter).
>> >> >
>> >> > In the case that we have a non-MPLS skb the stack depth starts at zero and
>> >> > is tracked. This is used to update the network header and mac_len.
>> >> > Otherwise the stack depth is unknown and the network header and mac_len are
>> >> > left as-is, corresponding to the end of the L2 headers.
>> >> >
>> >> > Actually, it is possible to tighten up the if clause to be the following,
>> >> > as it is only necessary to update the network header and mac_len for GSO skbs.
>> >> >
>> >> > if (mpls_stack_depth && skb_is_gso(skb)) {
>> >> > ...
>> >> > }
>> >> >
>> >> > It is possible for us to find and track the MPLS stack depth for all cases
>> >> > and to update the network header and mac_len. However I don't think that
>> >> > there is any run-time benefit and it seems expensive to find out what the
>> >> > original stack depth was - I believe it would require parsing the MPLS
>> >> > entire stack for each packet.
>> >> >
>> >>
>> >> Thanks for explaining this.
>> >>
>> >> I think it would be better to keep updating the the network_header and mac_len for the Non-MPLS input packets regardless of the GSO status of the skb. It would be more consistent and less surprising.
>> >
>> > I agree entirely that it would be more consistent and less surprising.
>> > But I'm not sure if the cost is worth it.
>> >
>> > Jesse, do you have an opinion on this?
>>
>> In general, I would tend to agree with Jarno that keeping this
>> consistent would be significantly easier to understand. I think the
>> cost is probably not particularly high.
>
> I think it would be become high for large MPLS stack depths.
> But I'm happy to wear that if you are.
I guess, although in practice having more than 2 or 3 MPLS labels
would be extremely rare so I'm not sure that it's worth optimizing at
this point.
>> However, I also think that having different meanings for the layer
>> pointers inside and outside of OVS is not particularly ideal since it
>> makes the overall system harder to understand. Using network header
>> for the start of the MPLS stack might not be great since it means that
>> we couldn't really take advantage of any actual hardware offloading in
>> the future. Maybe we could use mac_len for that purpose and that would
>> keep things more consistent?
>
> To clarify, your suggestion is:
>
> mac_len: corresponds to the top of the MPLS stack
> network_header: corresponds to the bottom of the MPLS stack
>
> If so, yes I think that could work and I will see about making it so.
Yes, that's what I was thinking. (Although to be more precise,
network_header would point to the start of the L3 header, as it does
currently. In practice, that's the same as the bottom of the MPLS
stack in the vast majority of situations and in all cases that OVS
supports.)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists