lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1400533291.5367.67.camel@edumazet-glaptop2.roam.corp.google.com>
Date:	Mon, 19 May 2014 14:01:31 -0700
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Alexander Duyck <alexander.duyck@...il.com>
Cc:	David Miller <davem@...emloft.net>, amirv@...lanox.com,
	netdev@...r.kernel.org, idos@...lanox.com,
	jeffrey.t.kirsher@...el.com, jesse.brandeburg@...el.com,
	bruce.w.allan@...el.com, carolyn.wyborny@...el.com,
	donald.c.skidmore@...el.com, gregory.v.rose@...el.com,
	alexander.h.duyck@...el.com, john.ronciak@...el.com,
	mitch.a.williams@...el.com, yevgenyp@...lanox.com,
	ogerlitz@...lanox.com
Subject: Re: [PATCH net-next 1/2] net: Expose header length compution
 function

On Sat, 2014-05-10 at 14:53 -0700, Alexander Duyck wrote:

> I'm more of a fan of purpose built functions in hot-path.  In the case
> of skb_flow_dissect, it is meant to collect the inputs for a Jenkins
> hash.

Not really.

And having multiple flow dissectors is really a lot of trouble for us,
and contributes to code bloat.

>   If we also expand it to get the length my concern is that it may
> do both, but it won't be very efficient at doing either, and that
> doesn't even take into account that somebody at some point might want
> the flow dissector to not do things like coalesce IPv6 addresses to
> support things like a Toeplitz hash which would slow things down further.
> 
> I can wait for the patch. I don't really see what you're talking about
> since we are trying to linearize the header portion of the buffers and
> for jumbos frames all 2K of the buffer has been used so you can't do any
> tricks like use a paged frag for the head.


I've tested the following with a mlx4, and it indeed speeds up GRE
traffic, as GRO packets can now contain 17 MSS instead of 8.

(Pulling payload means GRO had to use 2 'frags' per MSS)

 drivers/net/ethernet/mellanox/mlx4/en_rx.c |   13 ++++++----
 include/linux/skbuff.h                     |    1 
 net/core/flow_dissector.c                  |   23 +++++++++++++++++++
 3 files changed, 32 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index e8c0d2b832b7..ff996b7c9a89 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -586,6 +586,7 @@ static struct sk_buff *mlx4_en_rx_skb(struct mlx4_en_priv *priv,
 		skb_copy_to_linear_data(skb, va, length);
 		skb->tail += length;
 	} else {
+		unsigned int hlen;
 		/* Move relevant fragments to skb */
 		used_frags = mlx4_en_complete_rx_desc(priv, rx_desc, frags,
 							skb, length);
@@ -595,16 +596,18 @@ static struct sk_buff *mlx4_en_rx_skb(struct mlx4_en_priv *priv,
 		}
 		skb_shinfo(skb)->nr_frags = used_frags;
 
+		hlen = eth_frame_headlen(va, length);
+
 		/* Copy headers into the skb linear buffer */
-		memcpy(skb->data, va, HEADER_COPY_SIZE);
-		skb->tail += HEADER_COPY_SIZE;
+		memcpy(skb->data, va, hlen);
+		skb->tail += hlen;
 
 		/* Skip headers in first fragment */
-		skb_shinfo(skb)->frags[0].page_offset += HEADER_COPY_SIZE;
+		skb_shinfo(skb)->frags[0].page_offset += hlen;
 
 		/* Adjust size of first fragment */
-		skb_frag_size_sub(&skb_shinfo(skb)->frags[0], HEADER_COPY_SIZE);
-		skb->data_len = length - HEADER_COPY_SIZE;
+		skb_frag_size_sub(&skb_shinfo(skb)->frags[0], hlen);
+		skb->data_len = length - hlen;
 	}
 	return skb;
 }
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 7a9beeb1c458..4b3d8999a11c 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3066,6 +3066,7 @@ bool skb_partial_csum_set(struct sk_buff *skb, u16 start, u16 off);
 int skb_checksum_setup(struct sk_buff *skb, bool recalculate);
 
 u32 __skb_get_poff(const struct sk_buff *skb);
+u32 eth_frame_headlen(void *data, unsigned int len);
 
 /**
  * skb_head_is_locked - Determine if the skb->head is locked down
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 107ed12a5323..55d4831493c1 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -323,6 +323,29 @@ u32 __skb_get_poff(const struct sk_buff *skb)
 	return poff;
 }
 
+
+/* Helper to find length of headers in an ethernet frame.
+ * This can help drivers to pull exact amount of bytes into
+ * skb->head to get optimal GRO performance.
+ * TODO: Could also return rxhash while we do a complete flow dissection.
+ */
+u32 eth_frame_headlen(void *data, unsigned int len)
+{
+	const struct ethhdr *eth = data;
+	struct sk_buff skb;
+
+	if (unlikely(len < ETH_HLEN))
+		return len;
+
+	skb.protocol = eth->h_proto;
+	skb.head = skb.data = data + ETH_HLEN;
+	skb_reset_network_header(&skb);
+	skb.len = len - ETH_HLEN;
+	skb.data_len = 0;
+	return __skb_get_poff(&skb) + ETH_HLEN;
+}
+EXPORT_SYMBOL(eth_frame_headlen);
+
 static inline int get_xps_queue(struct net_device *dev, struct sk_buff *skb)
 {
 #ifdef CONFIG_XPS


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ