[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKgT0Ud6NJdVTQM0AiHVbhZTko9fnzyvrZ1juGRPrRFGC-5O-g@mail.gmail.com>
Date: Wed, 17 Aug 2016 18:06:42 -0700
From: Alexander Duyck <alexander.duyck@...il.com>
To: David Ahern <dsa@...ulusnetworks.com>
Cc: Netdev <netdev@...r.kernel.org>,
David Miller <davem@...emloft.net>,
Lennert Buytenhek <buytenh@...tstofly.org>,
Simon Horman <simon.horman@...ronome.com>,
"Eric W. Biederman" <ebiederm@...ssion.com>, rshearma@...cade.com,
Tom Herbert <tom@...bertland.com>, Thomas Graf <tgraf@...g.ch>,
olivier.dugeon@...nge.com
Subject: Re: [PATCH net-next 2/3] net: mpls: Fixups for GSO
On Wed, Aug 17, 2016 at 4:23 PM, David Ahern <dsa@...ulusnetworks.com> wrote:
> On 8/17/16 5:16 PM, Alexander Duyck wrote:
>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>> index 1ecbd7715f6d..6d78f162a88b 100644
>>> --- a/net/openvswitch/actions.c
>>> +++ b/net/openvswitch/actions.c
>>> @@ -167,6 +167,12 @@ static int push_mpls(struct sk_buff *skb, struct sw_flow_key *key,
>>> skb->mac_len);
>>> skb_reset_mac_header(skb);
>>>
>>> + /* for GSO: set MPLS as network header and encapsulated protocol
>>> + * header as inner network header
>>> + */
>>> + skb_set_network_header(skb, skb->mac_len);
>>> + skb_set_inner_network_header(skb, skb->mac_len + MPLS_HLEN);
>>> +
>>> new_mpls_lse = (__be32 *)skb_mpls_header(skb);
>>> *new_mpls_lse = mpls->mpls_lse;
>>>
>>
>> So the one question I would have about this is how attached are you to
>> using the network_header to record the offset for the MPLS header? I
>> ask because I think from a hardware offloading perspective it would
>> make it much easier if instead you used the inner_mac_header to
>> represent the offset for the MPLS header. This way device drivers
>> could just skip over it like a VLAN and just use network and transport
>> header values like they would otherwise.
>>
>
> Where does the network_header relate to if I change the marker to inner_mac_header? Would it be skipped?
No, the network header would still be the network header.
> skb->protocol is set to MPLS.
> mac_header points to ethernet address
> network_header points to ???
The network_header would point to the IP header like it would be for a
non-MPLS frame.
> inner protocol is set to what is encapsulated (e.g., ipv4 or ipv6)
I am okay with this, but wonder if we actually need it. Do you know
of any protocols other than IPv4 or IPv6 that can be carried over MPLS
and would expect to be offloaded? If not we may be able to just get
away with recording the network header offset and then using the first
nibble of the network header to determine the IP version since the
value should be 4 or 6 for the two types we are offloading.
> inner_mac_header points to start of mpls label.
So this is what I would expect.
> inner_network points to start of network header.
The problem is that using inner_network_header to point to the network
header will require me to fork the path pretty significantly for most
of the Intel devices that would want to do MPLS GSO. The assumption
most drivers make is that if we are offloading things then
network_header and inner_network_header will point to either IPv4 or
IPv6 headers. Introducing MPLS as the network_header with IPv4 or
IPv6 as the inner_network_header throws a kink in the works because we
currently ignore inner_network_header for the devices that are doing
UDP or GRE tunnel GSO via GSO_PARTIAL with TSO_MANGLEID.
> Is that sufficient for h/w drivers?
I think of this as working like how we handle it for IP over IP
tunnels. In that case we are at L3 so the inner_network_header field
is populated, but the transport header stays the same. In the case of
MPLS it isn't really L3 it is more of an L2.5 so my preference would
be to treat it like it is an L2 tunnel or VLAN and just overwrite the
inner_mac_header with the MPLS header offset, and leave the network
and transport headers untouched.
One other bonus that also occurred to me is that you might be able to
get away with doing MPLS offloads for MPLS over IP or GRE tunnels. I
hadn't realized that MPLS inside of these tunnels was a thing, I had
just noticed it while looking over how the IP-in-IP tunnels are all
being handled. However if you move the header tracking to
inner_mac_header, and can avoid using skb->inner_protocol by instead
using the first nibble of the network_header value then you could
probably support segmenting those types of tunnels in hardware.
- Alex
Powered by blists - more mailing lists