lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160713073152.GC29661@penelope.isobedori.kobe.vergenet.net>
Date:	Wed, 13 Jul 2016 16:31:54 +0900
From:	Simon Horman <simon.horman@...ronome.com>
To:	pravin shelar <pshelar@....org>
Cc:	Linux Kernel Network Developers <netdev@...r.kernel.org>,
	ovs dev <dev@...nvswitch.org>
Subject: Re: [ovs-dev] [PATCH net-next v11 5/6] openvswitch: add layer 3
 flow/port support

Hi Pravin,

On Thu, Jul 07, 2016 at 01:54:15PM -0700, pravin shelar wrote:
> On Wed, Jul 6, 2016 at 10:59 AM, Simon Horman
> <simon.horman@...ronome.com> wrote:

...

> > diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> > index 12e8a8942a42..0001f651c934 100644
> > --- a/net/openvswitch/actions.c
> > +++ b/net/openvswitch/actions.c
> > @@ -301,6 +301,43 @@ static int set_eth_addr(struct sk_buff *skb, struct sw_flow_key *flow_key,
> >         return 0;
> >  }
> >
> > +/* pop_eth does not support VLAN packets as this action is never called
> > + * for them.
> > + */
> > +static int pop_eth(struct sk_buff *skb, struct sw_flow_key *key)
> > +{
> > +       skb_pull_rcsum(skb, ETH_HLEN);
> > +       skb_reset_mac_header(skb);
> > +       skb->mac_len -= ETH_HLEN;
> > +
> > +       invalidate_flow_key(key);
> > +       return 0;
> > +}
> This is changing l2 packet to l3 packet by reseting mac header. We
> need to unset mac header so that OVS key-extract can identify this
> packet later on, for example after recirc action.
> Other option would be keeping key is_layer3 consistent with packet.
> Push ethernet and pop ethernet action can unset and set the flag in
> flow key. So that OVS can keep track of packet headers by looking at
> packet key. When packet leaves OVS (in netdev-send) we can unset mac
> header according to this flag.

...

> > diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
> > index 0ea128eeeab2..86f2cfb19de3 100644
> > --- a/net/openvswitch/flow.c
> > +++ b/net/openvswitch/flow.c
> ...
> 
> > @@ -723,9 +729,17 @@ int ovs_flow_key_extract(const struct ip_tunnel_info *tun_info,
> >         key->phy.skb_mark = skb->mark;
> >         ovs_ct_fill_key(skb, key);
> >         key->ovs_flow_hash = 0;
> > +       key->phy.is_layer3 = skb->mac_len == 0;
> 
> I do not think mac_len can be used. mac_header needs to be checked.
> ...

Yes, indeed. The update to use skb_mac_header_was_set() here accidently
slipped into the following patch, sorry about that.

With that change in place I believe that this patch is internally
consistent because mac_header and mac_len are set correctly by the
call to key_extract() which is called by ovs_flow_key_extract() just
after where the excerpt above ends.

That said, I do think that it is possible to rely on skb_mac_header_was_set
throughout the datapath, including action processing etc... I have provided
an incremental patch - which I created on top of this entire series - at
the end of this email. If you prefer that approach I am happy to take it,
though I do feel that using mac_len leads to slightly cleaner code. Let me
know what you think.

> > diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
> > index c78a6a1476fb..fc94fbe1ddc3 100644
> > --- a/net/openvswitch/flow_netlink.c
> > +++ b/net/openvswitch/flow_netlink.c
> ...
> 
> > @@ -898,20 +922,33 @@ static int metadata_from_nlattrs(struct net *net, struct sw_flow_match *match,
> >                                    sizeof(*cl), is_mask);
> >                 *attrs &= ~(1ULL << OVS_KEY_ATTR_CT_LABELS);
> >         }
> > -       return 0;
> > -}
> >
> > -static int ovs_key_from_nlattrs(struct net *net, struct sw_flow_match *match,
> > -                               u64 attrs, const struct nlattr **a,
> > -                               bool is_mask, bool log)
> > -{
> > -       int err;
> > +       /* For layer 3 packets the ethernet type is provided
> > +        * and treated as metadata but no MAC addresses are provided.
> > +        */
> > +       if (*attrs & (1ULL << OVS_KEY_ATTR_ETHERTYPE) &&
> > +           !(*attrs & (1 << OVS_KEY_ATTR_ETHERNET))) {
> > +               int err;
> >
> Here attr_ethertype can be processed irrespective of attr_ethernet.
> is_layer3 can be derived independently. This way there is no need to
> again process attr_ethertyp in l2_from_nlattrs().

Thanks, I have reworked things as you suggest.

...

> > diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
> > index 4e3972344aa6..733e7914f6bd 100644
> > --- a/net/openvswitch/vport-netdev.c
> > +++ b/net/openvswitch/vport-netdev.c
> > @@ -57,8 +57,10 @@ static void netdev_port_receive(struct sk_buff *skb)
> >         if (unlikely(!skb))
> >                 return;
> >
> > -       skb_push(skb, ETH_HLEN);
> > -       skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
> > +       if (vport->dev->type == ARPHRD_ETHER) {
> > +               skb_push(skb, ETH_HLEN);
> > +               skb_postpush_rcsum(skb, skb->data, ETH_HLEN);
> > +       }
> This is still required for tunnel device of ARPHRD_NONE which can
> handle l2 packets.

That is not necessary given the current implementation (of ipgre) as it
supplies an skb with the mac header in place if the inner packet was an
Ethernet packet. This scheme could of course be adjusted.

...



Update to use skb_mac_header_was_set() more as mentioned above.
Please let me know what you think about this approach.

 include/net/mpls.h                   |    4 ++-
 net/openvswitch/actions.c            |   42 ++++++++++++++++++++---------------
 net/openvswitch/flow.c               |   23 +++++++++++--------
 net/openvswitch/vport-internal_dev.c |    2 -
 net/openvswitch/vport-netdev.c       |    4 +--
 5 files changed, 44 insertions(+), 31 deletions(-)

diff --git a/include/net/mpls.h b/include/net/mpls.h
index 5b3b5addfb08..296b68661be0 100644
--- a/include/net/mpls.h
+++ b/include/net/mpls.h
@@ -34,6 +34,8 @@ static inline bool eth_p_mpls(__be16 eth_type)
  */
 static inline unsigned char *skb_mpls_header(struct sk_buff *skb)
 {
-	return skb_mac_header(skb) + skb->mac_len;
+	return skb_mac_header_was_set(skb) ?
+		skb_mac_header(skb) + skb->mac_len :
+		skb->data;
 }
 #endif
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 0001f651c934..a18feccb2baa 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -163,18 +163,20 @@ static int push_mpls(struct sk_buff *skb, struct sw_flow_key *key,
 		return -ENOMEM;
 
 	skb_push(skb, MPLS_HLEN);
-	skb_reset_mac_header(skb);
 
 	new_mpls_lse = (__be32 *)skb_mpls_header(skb);
 	*new_mpls_lse = mpls->mpls_lse;
 
 	skb_postpush_rcsum(skb, new_mpls_lse, MPLS_HLEN);
 
-	if (skb->mac_len) {
+	if (skb_mac_header_was_set(skb)) {
+		skb_reset_mac_header(skb);
+
 		update_ethertype(skb, eth_hdr(skb), mpls->mpls_ethertype);
 		memmove(skb_mac_header(skb) - MPLS_HLEN, skb_mac_header(skb),
 			skb->mac_len);
 	}
+
 	if (!skb->inner_protocol)
 		skb_set_inner_protocol(skb, skb->protocol);
 	skb->protocol = mpls->mpls_ethertype;
@@ -186,22 +188,18 @@ static int push_mpls(struct sk_buff *skb, struct sw_flow_key *key,
 static int pop_mpls(struct sk_buff *skb, struct sw_flow_key *key,
 		    const __be16 ethertype)
 {
-	int err;
-
-	err = skb_ensure_writable(skb, skb->mac_len + MPLS_HLEN);
-	if (unlikely(err))
-		return err;
-
-	skb_postpull_rcsum(skb, skb_mpls_header(skb), MPLS_HLEN);
+	if (skb_mac_header_was_set(skb)) {
+		struct ethhdr *hdr;
+		int err;
 
-	memmove(skb_mac_header(skb) + MPLS_HLEN, skb_mac_header(skb),
-		skb->mac_len);
+		skb_postpull_rcsum(skb, skb_mpls_header(skb), MPLS_HLEN);
 
-	__skb_pull(skb, MPLS_HLEN);
-	skb_reset_mac_header(skb);
+		err = skb_ensure_writable(skb, skb->mac_len + MPLS_HLEN);
+		if (unlikely(err))
+			return err;
 
-	if (skb->mac_len) {
-		struct ethhdr *hdr;
+		memmove(skb_mac_header(skb) + MPLS_HLEN, skb_mac_header(skb),
+			skb->mac_len);
 
 		/* skb_mpls_header() is used to locate the ethertype
 		 * field correctly in the presence of VLAN tags.
@@ -210,6 +208,11 @@ static int pop_mpls(struct sk_buff *skb, struct sw_flow_key *key,
 		update_ethertype(skb, hdr, ethertype);
 	}
 
+	__skb_pull(skb, MPLS_HLEN);
+
+	if (skb_mac_header_was_set(skb))
+		skb_reset_mac_header(skb);
+
 	if (eth_p_mpls(skb->protocol))
 		skb->protocol = ethertype;
 
@@ -220,11 +223,14 @@ static int pop_mpls(struct sk_buff *skb, struct sw_flow_key *key,
 static int set_mpls(struct sk_buff *skb, struct sw_flow_key *flow_key,
 		    const __be32 *mpls_lse, const __be32 *mask)
 {
+	__u16 mac_len;
 	__be32 *stack;
 	__be32 lse;
 	int err;
 
-	err = skb_ensure_writable(skb, skb->mac_len + MPLS_HLEN);
+	mac_len = skb_mac_header_was_set(skb) ? skb->mac_len : 0;
+
+	err = skb_ensure_writable(skb, mac_len + MPLS_HLEN);
 	if (unlikely(err))
 		return err;
 
@@ -307,7 +313,7 @@ static int set_eth_addr(struct sk_buff *skb, struct sw_flow_key *flow_key,
 static int pop_eth(struct sk_buff *skb, struct sw_flow_key *key)
 {
 	skb_pull_rcsum(skb, ETH_HLEN);
-	skb_reset_mac_header(skb);
+	skb_unset_mac_header(skb);
 	skb->mac_len -= ETH_HLEN;
 
 	invalidate_flow_key(key);
@@ -325,7 +331,7 @@ static int push_eth(struct sk_buff *skb, struct sw_flow_key *key,
 
 	skb_push(skb, ETH_HLEN);
 	skb_reset_mac_header(skb);
-	skb->mac_len += ETH_HLEN;
+	skb->mac_len = ETH_HLEN;
 
 	hdr = eth_hdr(skb);
 	ether_addr_copy(hdr->h_source, ethh->addresses.eth_src);
diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index 42587d5bf894..837ea4f9a71d 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -465,17 +465,18 @@ static int key_extract(struct sk_buff *skb, struct sw_flow_key *key)
 	/* Flags are always used as part of stats */
 	key->tp.flags = 0;
 
-	skb_reset_mac_header(skb);
-
 	/* Link layer. */
 	key->eth.tci = 0;
 	if (key->phy.is_layer3) {
 		if (skb_vlan_tag_present(skb))
 			key->eth.tci = htons(skb->vlan_tci);
 		key->eth.type = skb->protocol;
+		skb_reset_network_header(skb);
 	} else {
 		struct ethhdr *eth = eth_hdr(skb);
 
+		skb_reset_mac_header(skb);
+
 		ether_addr_copy(key->eth.src, eth->h_source);
 		ether_addr_copy(key->eth.dst, eth->h_dest);
 
@@ -493,11 +494,11 @@ static int key_extract(struct sk_buff *skb, struct sw_flow_key *key)
 		key->eth.type = parse_ethertype(skb);
 		if (unlikely(key->eth.type == htons(0)))
 			return -ENOMEM;
-	}
 
-	skb_reset_network_header(skb);
-	skb_reset_mac_len(skb);
-	__skb_push(skb, skb->data - skb_mac_header(skb));
+		skb_reset_network_header(skb);
+		skb_reset_mac_len(skb);
+		__skb_push(skb, skb->data - skb_mac_header(skb));
+	}
 
 	/* Network layer. */
 	if (key->eth.type == htons(ETH_P_IP)) {
@@ -608,12 +609,16 @@ static int key_extract(struct sk_buff *skb, struct sw_flow_key *key)
 		 * header and the beginning of the L3 header differ.
 		 *
 		 * Advance network_header to the beginning of the L3
-		 * header. mac_len corresponds to the end of the L2 header.
+		 * header. For packets with an L2 header mac_len corresponds
+		 * to the end of the L2 header.
 		 */
 		while (1) {
+			__u16 mac_len;
 			__be32 lse;
 
-			error = check_header(skb, skb->mac_len + stack_len);
+			mac_len = key->phy.is_layer3 ? 0 : skb->mac_len;
+
+			error = check_header(skb, mac_len + stack_len);
 			if (unlikely(error))
 				return 0;
 
@@ -622,7 +627,7 @@ static int key_extract(struct sk_buff *skb, struct sw_flow_key *key)
 			if (stack_len == MPLS_HLEN)
 				memcpy(&key->mpls.top_lse, &lse, MPLS_HLEN);
 
-			skb_set_network_header(skb, skb->mac_len + stack_len);
+			skb_set_network_header(skb, mac_len + stack_len);
 			if (lse & htonl(MPLS_LS_S_MASK))
 				break;
 
diff --git a/net/openvswitch/vport-internal_dev.c b/net/openvswitch/vport-internal_dev.c
index 5ad184bd5802..a5ea0bcd310c 100644
--- a/net/openvswitch/vport-internal_dev.c
+++ b/net/openvswitch/vport-internal_dev.c
@@ -255,7 +255,7 @@ static netdev_tx_t internal_dev_recv(struct sk_buff *skb)
 	struct pcpu_sw_netstats *stats;
 
 	/* Only send/receive L2 packets */
-	if (!skb->mac_len) {
+	if (!skb_mac_header_was_set(skb)) {
 		kfree_skb(skb);
 		return -EINVAL;
 	}
diff --git a/net/openvswitch/vport-netdev.c b/net/openvswitch/vport-netdev.c
index 7d54414b35eb..05985209f611 100644
--- a/net/openvswitch/vport-netdev.c
+++ b/net/openvswitch/vport-netdev.c
@@ -201,9 +201,9 @@ int ovs_netdev_send(struct sk_buff *skb)
 {
 	struct net_device *dev = skb->dev;
 
-	if (dev->type != ARPHRD_ETHER && skb->mac_len) {
+	if (dev->type != ARPHRD_ETHER && skb_mac_header_was_set(skb)) {
 		skb->protocol = htons(ETH_P_TEB);
-	} else if (dev->type == ARPHRD_ETHER && !skb->mac_len) {
+	} else if (dev->type == ARPHRD_ETHER && !skb_mac_header_was_set(skb)) {
 		kfree_skb(skb);
 		return -EINVAL;
 	}

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ