lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <85e72428-5ac4-6727-ab92-828286e5c789@fb.com>
Date:   Fri, 16 Mar 2018 15:37:21 -0700
From:   Yonghong Song <yhs@...com>
To:     Eric Dumazet <eric.dumazet@...il.com>,
        Alexei Starovoitov <ast@...com>,
        Steffen Klassert <steffen.klassert@...unet.com>,
        Daniel Borkmann <daniel@...earbox.net>
CC:     netdev <netdev@...r.kernel.org>, Martin Lau <kafai@...com>,
        <saeedm@...lanox.com>, <diptanu@...com>
Subject: Re: BUG_ON triggered in skb_segment


Eric and Daniel,

I have tried to fix this issue but not really successful.
I tried two hacks:
   . if skb_headlen(list_skb) is not 0, we just pull
     skb_headlen(list_skb) from the skb to make skb_headlen(list_skb) = 
0, or
   . if skb_headlen(list_skb) is not 0, we go to the beginning of
     the outer loop which will allocate another nskb for this list_skb.

Both approaches removed the BUG and the packet is able to reach the
remote host. Upon receiving the packet, however, the remote host sends a
reset packet back so connection eventually closed. I did not debug
further on this.

Considering it is tricky to change skb_segment, I hacked test_bpf
kernel module to reproduce the issue. The change reflects the gso packet
structure I got from mlx5. Maybe you could take a look and suggest a fix 
or a direction of how to move forward.

Thanks!

============= PATCH  ===============

-bash-4.2$ git show
commit 41681ab51f85b4a0ea3416a0a62d6bde74f3af4b
Author: Yonghong Song <yhs@...com>
Date:   Fri Mar 16 15:10:02 2018 -0700

     [hack] hack test_bpf module to trigger BUG_ON in skb_segment.

     "modprobe test_bpf" will have the following errors:
     ...
     [   98.149165] ------------[ cut here ]------------
     [   98.159362] kernel BUG at net/core/skbuff.c:3667!
     [   98.169756] invalid opcode: 0000 [#1] SMP PTI
     [   98.179370] Modules linked in:
     [   98.179371]  test_bpf(+)
     ...

     The BUG happens in function skb_segment:
     ...
     3665                 while (pos < offset + len) {
     3666                         if (i >= nfrags) {
     3667                                 BUG_ON(skb_headlen(list_skb));
     3668
     3669                                 i = 0;
     3670                                 nfrags = 
skb_shinfo(list_skb)->nr_frags;
     3671                                 frag = 
skb_shinfo(list_skb)->frags;
     3672                                 frag_skb = list_skb;
     3673
     3674                                 BUG_ON(!nfrags);
     ...

     The skbs are constructed to mimic what mlx5 may generate.
     The packet size/header may not mimic real cases in production. But
     the processing flow is similar.

     Signed-off-by: Yonghong Song <yhs@...com>

diff --git a/lib/test_bpf.c b/lib/test_bpf.c
index 2efb213..d36a991 100644
--- a/lib/test_bpf.c
+++ b/lib/test_bpf.c
@@ -6574,6 +6574,67 @@ static bool exclude_test(int test_id)
         return test_id < test_range[0] || test_id > test_range[1];
  }

+static struct sk_buff *build_test_skb(void) {
+       u32 tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+       u32 headroom = NET_SKB_PAD + NET_IP_ALIGN + ETH_HLEN;
+       struct sk_buff *skb[2];
+       void *data[2], *page;
+       int i, data_size = 8;
+
+       page = (void *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
+       if (!page)
+               return NULL;
+
+       for (i = 0; i < 2; i++) {
+               data[i] = kzalloc(headroom + tailroom + data_size, 
GFP_KERNEL);
+               if (!data[i])
+                       return NULL;
+               skb[i] = build_skb(data[i], 0);
+               if (!skb[i]) {
+                       kfree(data[i]);
+                       return NULL;
+               }
+               skb_reserve(skb[i], headroom);
+               skb_put(skb[i], data_size);
+               skb[i]->protocol = htons(ETH_P_IP);
+               skb_reset_network_header(skb[i]);
+               skb_set_mac_header(skb[i], -ETH_HLEN);
+
+               skb_add_rx_frag(skb[i], skb_shinfo(skb[i])->nr_frags,
+                        page, 0, 64, 64);
+       }
+
+       /* setup shinfo */
+       skb_shinfo(skb[0])->gso_size = 1448;
+       skb_shinfo(skb[0])->gso_type = SKB_GSO_TCPV4;
+       skb_shinfo(skb[0])->gso_type |= SKB_GSO_DODGY;
+       skb_shinfo(skb[0])->gso_segs = 0;
+       skb_shinfo(skb[0])->frag_list = skb[1];
+
+       /* adjust skb[0]'s len */
+       skb[0]->len += skb[1]->len;
+       skb[0]->data_len += skb[1]->data_len;
+       skb[0]->truesize += skb[1]->truesize;
+
+       return skb[0];
+}
+
+static void test_skb_segment(void) {
+       netdev_features_t features;
+       struct sk_buff *skb;
+
+       features = NETIF_F_SG | NETIF_F_GSO_PARTIAL | NETIF_F_IP_CSUM | 
NETIF_F_IPV6_CSUM;
+       features |= NETIF_F_RXCSUM;
+       skb = build_test_skb();
+       if (!skb)
+               pr_info("Failed in test_skb_segment:build_test_skb!");
+
+       if (skb_segment(skb, features))
+               pr_info("Success in test_skb_segment!");
+       else
+               pr_info("Failed in test_skb_segment!");
+}
+
  static __init int test_bpf(void)
  {
         int i, err_cnt = 0, pass_cnt = 0;
@@ -6631,7 +6692,8 @@ static int __init test_bpf_init(void)
         if (ret < 0)
                 return ret;

-       ret = test_bpf();
+       // ret = test_bpf();
+       test_skb_segment();

         destroy_bpf_tests();
         return ret;


=============  END ================


On 3/13/18 6:15 PM, Eric Dumazet wrote:
> 
> 
> On 03/13/2018 05:35 PM, Eric Dumazet wrote:
>>
>>
>> On 03/13/2018 05:26 PM, Eric Dumazet wrote:
>>>
>>>
>>> On 03/13/2018 05:04 PM, Alexei Starovoitov wrote:
>>>> On 3/13/18 4:27 PM, Eric Dumazet wrote:
>>>>>
>>>>>
>>>>> On 03/13/2018 04:09 PM, Alexei Starovoitov wrote:
>>>>>
>>>>>> we have bpf_skb_proto_6_to_4() that was used by cilium for long time.
>>>>>> It's not clear why it's not crashing there, but we cannot just
>>>>>> reject changing proto in bpf programs now.
>>>>>> We have to fix whatever needs to be fixed in skb_segment
>>>>>> (if bug is there) or fix whatever necessary on mlx5 side.
>>>>>> In bpf helper we mark it as SKB_GSO_DODGY just like packets coming
>>>>>> through virtio would do, so if skb_segment() needs to do something
>>>>>> special with skb the SKB_GSO_DODGY flag is already there.
>>>>>
>>>>> 'Fixing' skb_segment(), I did that a long time ago and Herbert Xu was
>>>>> not happy with the fix and provided something else.
>>>>
>>>> any link to your old patches and discussion?
>>>>
>>>> I think since mlx4 can do tso on them and the packets came out
>>>> correct on the wire, there is nothing fundamentally wrong with
>>>> changing gso_size. Just tricky to teach skb_segment.
>>>>
>>>
>>> The world is not mlx4 only. Some NIC will ask skb_segment() fallback 
>>> segmentation for various reasons (like skb->len above a given limit 
>>> like 16KB)
>>>
>>> Maybe 
>>> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.spinics.net_lists_netdev_msg255549.html&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=x9lG2k53B7AFsnNx6t2DgbXt06J-sLznZIHk6YdGoGg&s=0Nxx2G8PtDAEMCFAvQ7kxYTXVr9aHdOolP1KB_lnmes&e= 
>>>
>>
>>
>> Herbert patch :
>>
>> commit 9d8506cc2d7ea1f911c72c100193a3677f6668c3
>> Author: Herbert Xu <herbert@...dor.apana.org.au>
>> Date:   Thu Nov 21 11:10:04 2013 -0800
>>
>>      gso: handle new frag_list of frags GRO packets
>>
> 
> I found my initial patch.
> 
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.spinics.net_lists_netdev_msg255452.html&d=DwIDaQ&c=5VD0RTtNlTh3ycd41b3MUw&r=DA8e1B5r073vIqRrFz7MRA&m=x9lG2k53B7AFsnNx6t2DgbXt06J-sLznZIHk6YdGoGg&s=VuWRpUdJwBwTxpnMNZYgKvQANLL5UA7hZnTFZsQlK6c&e= 
> 
> 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ