lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1384305223.28458.58.camel@edumazet-glaptop2.roam.corp.google.com>
Date:	Tue, 12 Nov 2013 17:13:43 -0800
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Herbert Xu <herbert@...dor.apana.org.au>
Cc:	Ben Hutchings <bhutchings@...arflare.com>,
	David Miller <davem@...emloft.net>,
	christoph.paasch@...ouvain.be, netdev@...r.kernel.org,
	hkchu@...gle.com, mwdalton@...gle.com
Subject: Re: gso: Handle new frag_list of frags GRO packets

On Tue, 2013-11-12 at 02:52 +0800, Herbert Xu wrote:
> Recently GRO started generating packets with frag_lists of frags.
> This was not handled by GSO, thus leading to a crash.
> 
> Thankfully these packets are of a regular form and are easy to
> handle.  This patch handles them in two ways.  For completely
> non-linear frag_list entries, we simply continue to iterate over
> the frag_list frags once we exhaust the normal frags.  For frag_list
> entries with linear parts, we call pskb_trim on the first part
> of the frag_list skb, and then process the rest of the frags in
> the usual way.
> 
> This patch also kills a chunk of dead frag_list code that has
> obviously never ever been run since it ends up generating a bogus
> GSO-segmented packet with a frag_list entry.
> 
> Signed-off-by: Herbert Xu <herbert@...dor.apana.org.au>
> 
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 3735fad..557e1a5 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -2776,6 +2776,7 @@ struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features)
>  	struct sk_buff *segs = NULL;
>  	struct sk_buff *tail = NULL;
>  	struct sk_buff *fskb = skb_shinfo(skb)->frag_list;
> +	skb_frag_t *skb_frag = skb_shinfo(skb)->frags;
>  	unsigned int mss = skb_shinfo(skb)->gso_size;
>  	unsigned int doffset = skb->data - skb_mac_header(skb);
>  	unsigned int offset = doffset;
> @@ -2815,16 +2816,38 @@ struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features)
>  		if (hsize > len || !sg)
>  			hsize = len;
>  
> -		if (!hsize && i >= nfrags) {
> -			BUG_ON(fskb->len != len);
> +		if (!hsize && i >= nfrags && skb_headlen(fskb) &&
> +		    (skb_headlen(fskb) == len || sg)) {
> +			BUG_ON(skb_headlen(fskb) > len);
> +

Hmm, yet another BUG_ON() case...

> +			i = 0;
> +			nfrags = skb_shinfo(fskb)->nr_frags;
> +			skb_frag = skb_shinfo(fskb)->frags;
> +			pos += skb_headlen(fskb);
> +
> +			while (pos < offset + len) {
> +				BUG_ON(i >= nfrags);
> +
> +				size = skb_frag_size(skb_frag);
> +				if (pos + size > offset + len)
> +					break;
> +
> +				i++;
> +				pos += size;
> +				skb_frag++;
> +			}
>  
> -			pos += len;
>  			nskb = skb_clone(fskb, GFP_ATOMIC);
>  			fskb = fskb->next;
>  
>  			if (unlikely(!nskb))
>  				goto err;
>  
> +			if (unlikely(pskb_trim(nskb, len))) {
> +				kfree_skb(nskb);
> +				goto err;
> +			}
> +

Note this pskb_trim() will reallocate/copy nskb head completely, since
nskb is a clone. (And increment page counts of frags, then eventually
decrement them)

I tested this patch one 'router', and it seems fine, although we consume
~90% more cpu doing the skb_segment() than not doing it.


GRO not building frag_list skbs :

lpaa6:~# mpstat -P 8 1 10
05:09:47 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle    intr/s
05:09:48 PM    8    0.00    0.00    1.01    0.00    3.03   23.23    0.00    0.00   72.73  43462.63
05:09:49 PM    8    0.00    0.00    0.00    0.00    5.10   28.57    0.00    0.00   66.33  88079.59
05:09:50 PM    8    0.00    0.00    0.00    0.00    2.13   17.02    0.00    0.00   80.85  41297.87
05:09:51 PM    8    0.00    0.00    0.95    0.00    3.81   29.52    0.00    0.00   65.71  45741.90
05:09:52 PM    8    0.00    0.00    0.00    0.00    2.11   17.89    0.00    0.00   80.00  25413.68
05:09:53 PM    8    1.03    0.00    1.03    0.00    2.06   20.62    0.00    0.00   75.26  36131.96
05:09:54 PM    8    0.00    0.00    0.94    0.00    3.77   30.19    0.00    0.00   65.09  47100.00
05:09:55 PM    8    0.00    0.00    0.00    0.00    3.26   21.74    0.00    0.00   75.00  71805.43
05:09:56 PM    8    0.00    0.00    0.00    0.00    3.19   22.34    0.00    0.00   74.47  70672.34
05:09:57 PM    8    0.00    0.00    0.00    0.00    4.50   32.43    0.00    0.00   63.06  45919.82
Average:       8    0.10    0.00    0.40    0.00    3.33   24.62    0.00    0.00   71.54  51339.66


Current GRO (large skbs -> need to split them with skb_segment(), no TSO is used)

lpaa6:~# mpstat -P 8 1 10
05:10:05 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest   %idle    intr/s
05:10:06 PM    8    0.00    0.00    0.00    0.00    8.16   44.90    0.00    0.00   46.94  88982.65
05:10:07 PM    8    0.00    0.00    1.05    0.00    8.42   50.53    0.00    0.00   40.00  54796.84
05:10:08 PM    8    0.00    0.00    1.94    0.00    8.74   52.43    0.00    0.00   36.89  50163.11
05:10:09 PM    8    0.00    0.00    0.00    0.00    7.14   39.80    0.00    0.00   53.06  85137.76
05:10:10 PM    8    0.00    0.00    0.00    0.00    8.08   44.44    0.00    0.00   47.47  42262.63
05:10:11 PM    8    0.00    0.00    0.00    0.00    8.00   53.00    0.00    0.00   39.00  53444.00
05:10:12 PM    8    0.00    0.00    0.00    0.00    5.00   27.50    0.00    0.00   67.50  91098.75
05:10:13 PM    8    0.00    0.00    0.00    0.00    8.55   47.86    0.00    0.00   43.59  34316.24
05:10:14 PM    8    0.00    0.00    0.00    0.00    8.70   48.91    0.00    0.00   42.39  56921.74
05:10:15 PM    8    0.00    0.00    0.93    0.00    7.41   46.30    0.00    0.00   45.37  77129.63
Average:       8    0.00    0.00    0.40    0.00    7.88   45.96    0.00    0.00   45.76  62458.99



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ