netdev - Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3 flow/port support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 18 Jul 2016 13:50:27 +0900
From:	Simon Horman <simon.horman@...ronome.com>
To:	pravin shelar <pshelar@....org>
Cc:	Linux Kernel Network Developers <netdev@...r.kernel.org>,
	ovs dev <dev@...nvswitch.org>, Jiri Benc <jbenc@...hat.com>
Subject: Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3
 flow/port support

[CC Jiri Benc for portion regarding GRE]

Hi Pravin,

On Fri, Jul 15, 2016 at 02:07:37PM -0700, pravin shelar wrote:
> On Wed, Jul 13, 2016 at 12:31 AM, Simon Horman
> <simon.horman@...ronome.com> wrote:
> > Hi Pravin,
> >
> > On Thu, Jul 07, 2016 at 01:54:15PM -0700, pravin shelar wrote:
> >> On Wed, Jul 6, 2016 at 10:59 AM, Simon Horman
> >> <simon.horman@...ronome.com> wrote:
> >
> > ...
> 
> >
> >> > diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
> >> > index 0ea128eeeab2..86f2cfb19de3 100644
> >> > --- a/net/openvswitch/flow.c
> >> > +++ b/net/openvswitch/flow.c
> >> ...
> >>
> >> > @@ -723,9 +729,17 @@ int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
> >> >         key->phy.skb_mark = skb->mark;
> >> >         ovs_ct_fill_key(skb, key);
> >> >         key->ovs_flow_hash = 0;
> >> > +       key->phy.is_layer3 = skb->mac_len == 0;
> >>
> >> I do not think mac_len can be used. mac_header needs to be checked.
> >> ...
> >
> > Yes, indeed. The update to use skb_mac_header_was_set() here accidently
> > slipped into the following patch, sorry about that.
> >
> > With that change in place I believe that this patch is internally
> > consistent because mac_header and mac_len are set correctly by the
> > call to key_extract() which is called by ovs_flow_key_extract() just
> > after where the excerpt above ends.
> >
> > That said, I do think that it is possible to rely on skb_mac_header_was_set
> > throughout the datapath, including action processing etc... I have provided
> > an incremental patch - which I created on top of this entire series - at
> > the end of this email. If you prefer that approach I am happy to take it,
> > though I do feel that using mac_len leads to slightly cleaner code. Let me
> > know what you think.
> >
> 
> 
> I am not sure if you can use only mac_len to detect L3 packet. This
> does not work with MPLS packets, mac_len is used to account MPLS
> headers pushed on skb. Therefore in case of a MPLS header on L3
> packet, mac_len would be non zero and we have to look at either
> mac_header or some other metadata like is_layer3 flag from key to
> check for L3 packet.

At least within OvS mac_len does not include the length of the MPLS label
stack. Rather, the MPLS label stack length is the difference between the
end of (mac_header + mac_len) and network_header.

So I think that the scheme does work as mac_len is 0 if there is no L2
header regardless of if an MPLS label stack is present or not.

> >> > diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
> >> > index 4e3972344aa6..733e7914f6bd 100644
> >> > --- a/net/openvswitch/vport-netdev.c
> >> > +++ b/net/openvswitch/vport-netdev.c
> >> > @@ -57,8 +57,10 @@ static void netdev_port_receive(struct sk_buff *skb)
> >> >         if (unlikely(!skb))
> >> >                 return;
> >> >
> >> > -       skb_push(skb, ETH_HLEN);
> >> > -       skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
> >> > +       if (vport->dev->type == ARPHRD_ETHER) {
> >> > +               skb_push(skb, ETH_HLEN);
> >> > +               skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
> >> > +       }
> >> This is still required for tunnel device of ARPHRD_NONE which can
> >> handle l2 packets.
> >
> > That is not necessary given the current implementation (of ipgre) as it
> > supplies an skb with the mac header in place if the inner packet was an
> > Ethernet packet. This scheme could of course be adjusted.
> >
> > ...
> >
> 
> I think we should send L2 header with l2 header pushed on skb. This is
> what OVS expect. The skb-push should be done for all l2 packets rather
> than for particular type of device.

The following seems to achieve that.
Jiri, what do you think?

diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
index a20248355da0..edbc10690b60 100644
--- a/net/ipv4/ip_gre.c
+++ b/net/ipv4/ip_gre.c
@@ -281,10 +281,9 @@ static int __ipgre_rcv(struct sk_buff *skb, const struct tnl_ptk_info *tpi,
 					   raw_proto, false) < 0)
 			goto drop;
 
-		if (tunnel->dev->type != ARPHRD_NONE)
+		if (tunnel->dev->type != ARPHRD_NONE ||
+		    tpi->proto == htons(ETH_P_TEB))
 			skb_pop_mac_header(skb);
-		else if (tpi->proto != htons(ETH_P_TEB))
-			skb_unset_mac_header(skb);
 		else
 			skb_reset_mac_header(skb);
 		if (tunnel->collect_md) {
@@ -326,7 +325,7 @@ static int ipgre_rcv(struct sk_buff *skb, const struct tnl_ptk_info *tpi,
 		 * also ETH_P_TEB traffic.
 		 */
 		itn = net_generic(net, ipgre_net_id);
-		res = __ipgre_rcv(skb, tpi, itn, hdr_len, true);
+		res = __ipgre_rcv(skb, tpi, itn, hdr_len, false);
 	}
 	return res;
 }
diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
index 05985209f611..933ac46db53a 100644
--- a/net/openvswitch/vport-netdev.c
+++ b/net/openvswitch/vport-netdev.c
@@ -57,7 +57,8 @@ static void netdev_port_receive(struct sk_buff *skb)
 	if (unlikely(!skb))
 		return;
 
-	if (vport->dev->type == ARPHRD_ETHER) {
+	if (vport->dev->type != ARPHRD_NONE ||
+	    skb->protocol == htons(ETH_P_TEB)) {
 		skb_push(skb, ETH_HLEN);
 		skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
 	}

> > Update to use skb_mac_header_was_set() more as mentioned above.
> > Please let me know what you think about this approach.
> >
> >  include/net/mpls.h                   |    4 ++-
> >  net/openvswitch/actions.c            |   42 ++++++++++++++++++++---------------
> >  net/openvswitch/flow.c               |   23 +++++++++++--------
> >  net/openvswitch/vport-internal_dev.c |    2 -
> >  net/openvswitch/vport-netdev.c       |    4 +--
> >  5 files changed, 44 insertions(+), 31 deletions(-)
> >
> > diff --git a/include/net/mpls.h b/include/net/mpls.h
> > index 5b3b5addfb08..296b68661be0 100644
> > --- a/include/net/mpls.h
> > +++ b/include/net/mpls.h
> > @@ -34,6 +34,8 @@ static inline bool eth_p_mpls(__be16 eth_type)
> >   */
> >  static inline unsigned char *skb_mpls_header(struct sk_buff *skb)
> >  {
> > -       return skb_mac_header(skb) + skb->mac_len;
> > +       return skb_mac_header_was_set(skb) ?
> > +               skb_mac_header(skb) + skb->mac_len :
> > +               skb->data;
> >  }
> 
> This function is also called from GSO layer. issue is in GSO layer, it
> does reset mac header and mac length and then calls mpls-gso-handler.
> So all subsequent check for L3 packet fails.
> So far we have explored three different ways to detect L3 packet but
> each has its own issue.
> 1. skb mac header : GSO can reset mac header.
> 2. skb mac length : MPLS uses mac_len to account for MPLS header
> length along with L2 header
> 3. skb protocol: ETH_P_TEB is not set for all L2 frames, networking
> stack is not ready to handle this type for given skb.
> 
> So none of them works consistently. I think the only option to detect
> L3 packet reliably (and without adding field to skb) is to use
> skb-protocol along with ARPHRD_NONE device type. If ARPHRD_NONE type
> device generates L2 packet it needs to set protocol to ETH_P_TEB. Some
> networking stack function also needs to be fixed to handle this
> protocol type, e.g. vlan_get_protocol(), br_dev_queue_push_xmit(),
> etc.