Date:	Wed, 5 Nov 2014 18:17:59 +0200
From:	Or Gerlitz <ogerlitz@...lanox.com>
To:	Florian Westphal <fw@...len.de>, <netdev@...r.kernel.org>,
	Tom Herbert <therbert@...gle.com>,
	Jesse Gross <jesse@...ira.com>
CC:	<amirv@...lanox.com>
Subject: Re: mlx4+vxlan offload breaks gre tunnels

On 11/5/2014 5:04 PM, Florian Westphal wrote:
> tl,dr: all tcp packets sent via gre tunnel have broken tcp csum if vxlan offload
> is enabled with mlx4 driver.
>
> Given following config on tx-side:
> dev=enp3s0
> ip addr add dev $dev 192.168.23.1/24
> ip link set $dev up
> ip link add mygre type gretap remote 192.168.23.2 local 192.168.23.1
> ip addr add dev mygre 192.168.42.1/24
> ip link set gre0 up
> ip link set mygre up
>
> and
>
> options mlx4_core log_num_mgm_entry_size=-1 debug_level=1
> port_type_array=2,2
>
> in
> /etc/modprobe.d/mlx4.conf
>
> all tcp packets sent to destinations over the gre tunnel have bogus tcp
> checksums (and are tossed on rx side when stack validates tcp checksum).
>
> net-next head is commit 30349bdbc4da5ecf0efa25556e3caff9c9b8c5f7 .
>
> What makes things work for me:
> either
>
> options mlx4_core 1 debug_level=1 port_type_array=2,2
>
> (ie. no MLX4_TUNNEL_OFFLOAD_MODE_VXLAN)
>
> or not setting NETIF_F_IP_CSUM in enc_features:
>
> --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> @@ -2579,10 +2579,12 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
>                  dev->priv_flags |= IFF_UNICAST_FLT;
>   
>          if (mdev->dev->caps.tunnel_offload_mode == MLX4_TUNNEL_OFFLOAD_MODE_VXLAN) {
> -               dev->hw_enc_features |= NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
> +               dev->hw_enc_features |= NETIF_F_RXCSUM |
>                                          NETIF_F_TSO | NETIF_F_GSO_UDP_TUNNEL;
>
> I am not sure if its right fix, but to my eyes this basically looks like
> mlx4 is telling stack that it can handle tcp checksum offload within
> tunnels, and that doesn't seem to be the case for all types (e.g. gre).
>
> Could someone who understand the enc_features specifics better confirm that
> above patch is correct (or provide a better/proper fix)?

Yep, I can see the problem now. It comes into play with ConnectX3-pro 
NICs, which support VXLAN offloads (but not with ConnectX3 NICs, which 
don't), when you enable offload support on the CX3-pro.

The problem originates from the fact that we can't advertise something 
like "the HW can offload the inner checksum of UDP/VXLAN encapsulated 
packets (but not of GRE ones)", e.g. in a manner similar to what exists 
in the GSO space, where you have NETIF_F_GSO_YYY for each YYY in 
{UDP, SIT, GRE, etc} tunneling scheme.
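
To make the gap concrete, here is a minimal user-space sketch. It is not kernel code: the SK_* flag values, the sk_tunnel enum and the helper names are all made up for illustration, and the "hardware" helper simply assumes a NIC that can only handle the inner checksum for VXLAN. It shows that a single checksum bit gives the same answer for every tunnel type, while the per-scheme GSO bits can tell GRE and VXLAN apart.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for netdev feature bits; values are illustrative. */
#define SK_F_IP_CSUM        (1u << 0)
#define SK_F_GSO_GRE        (1u << 1)
#define SK_F_GSO_UDP_TUNNEL (1u << 2)

enum sk_tunnel { SK_TUNNEL_GRE, SK_TUNNEL_VXLAN };

/* Checksum side: there is one bit only, so the tunnel type cannot
 * influence the answer the stack gets from the feature mask. */
static bool sk_stack_offloads_inner_csum(uint32_t enc_features,
                                         enum sk_tunnel t)
{
    (void)t; /* deliberately unused: no per-tunnel checksum bit exists */
    return (enc_features & SK_F_IP_CSUM) != 0;
}

/* What the assumed VXLAN-only hardware can really do. */
static bool sk_hw_offloads_inner_csum(enum sk_tunnel t)
{
    return t == SK_TUNNEL_VXLAN;
}

/* GSO side: one bit per tunneling scheme, so the answer depends on t. */
static bool sk_stack_offloads_gso(uint32_t enc_features, enum sk_tunnel t)
{
    uint32_t bit = (t == SK_TUNNEL_GRE) ? SK_F_GSO_GRE : SK_F_GSO_UDP_TUNNEL;

    return (enc_features & bit) != 0;
}
```

With a mask of SK_F_IP_CSUM | SK_F_GSO_UDP_TUNNEL, the checksum helper says "yes" for GRE even though the modelled hardware cannot do it, whereas the GSO helper correctly says "no" for GRE.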

I think the best we can do for now is:

1. come up with something such as the patch below for 3.18, which is 
backward-portable to -stable kernels; it will only arm the hw offloads 
if the OS tells us there's VXLAN in action

2. come up with proper kernel APIs to let NICs advertise for which encap 
schemes they can actually offload the inner checksum. Tom... that's your 
work which is now on netdev.
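
As a rough user-space model of item 1 (this is not the patch itself; the ARM_* flag values, struct arm_netdev and both function names are hypothetical), the idea is that the encapsulation feature mask starts out empty and is only filled in once the OS reports a VXLAN port, then cleared again when that port goes away:

```c
#include <stdint.h>

/* Hypothetical stand-ins for netdev feature bits; values are illustrative. */
#define ARM_F_IP_CSUM        (1u << 0)
#define ARM_F_RXCSUM         (1u << 1)
#define ARM_F_TSO            (1u << 2)
#define ARM_F_GSO_UDP_TUNNEL (1u << 3)

/* Simplified model of the three feature masks touched by the patch. */
struct arm_netdev {
    uint32_t features;
    uint32_t hw_features;
    uint32_t hw_enc_features;
};

/* Called when the OS tells the driver a VXLAN port is in use. */
static void arm_vxlan_port_added(struct arm_netdev *dev)
{
    dev->hw_enc_features |= ARM_F_IP_CSUM | ARM_F_RXCSUM |
                            ARM_F_TSO | ARM_F_GSO_UDP_TUNNEL;
    dev->hw_features     |= ARM_F_GSO_UDP_TUNNEL;
    dev->features        |= ARM_F_GSO_UDP_TUNNEL;
}

/* Called when the VXLAN port goes away: stop advertising encap offloads. */
static void arm_vxlan_port_removed(struct arm_netdev *dev)
{
    dev->hw_enc_features  = 0;
    dev->hw_features     &= ~ARM_F_GSO_UDP_TUNNEL;
    dev->features        &= ~ARM_F_GSO_UDP_TUNNEL;
}
```

In this model a GRE-only setup never sees ARM_F_IP_CSUM in hw_enc_features, so the stack would compute the inner checksum in software. Once a VXLAN port is added the bit is set for all tunnel types again, which is why item 2 is still needed.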

Tom/Jesse, thoughts? Are you +1-ing the approach below?

Or.

Tested to work with the following setup, which is a bit different from 
yours; tell me if it works for you:

# node A - with mlx4_en address 192.168.31.18
ip tunnel add gre1 mode gre local 192.168.31.18 remote 192.168.31.17 ttl 255
ifconfig gre1 10.10.10.18/24 up
ifconfig gre1 mtu 1450

# node B - with mlx4_en address 192.168.31.17
ip tunnel add gre1 mode gre local 192.168.31.17 remote 192.168.31.18 ttl 255
ifconfig gre1 10.10.10.17/24 up
ifconfig gre1 mtu 1450


diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 0efbae9..7753833 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -2292,6 +2292,12 @@ static void mlx4_en_add_vxlan_offloads(struct work_struct *work)
  out:
         if (ret)
                 en_err(priv, "failed setting L2 tunnel configuration ret %d\n", ret);
+
+       /* set offloads */
+       priv->dev->hw_enc_features |= NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
+                                     NETIF_F_TSO | NETIF_F_GSO_UDP_TUNNEL;
+       priv->dev->hw_features |= NETIF_F_GSO_UDP_TUNNEL;
+       priv->dev->features    |= NETIF_F_GSO_UDP_TUNNEL;
  }

  static void mlx4_en_del_vxlan_offloads(struct work_struct *work)
@@ -2299,6 +2305,10 @@ static void mlx4_en_del_vxlan_offloads(struct work_struct *work)
         int ret;
         struct mlx4_en_priv *priv = container_of(work, struct mlx4_en_priv,
                                                  vxlan_del_task);
+       /* unset offloads */
+       priv->dev->hw_enc_features = 0;
+       priv->dev->hw_features &= ~NETIF_F_GSO_UDP_TUNNEL;
+       priv->dev->features    &= ~NETIF_F_GSO_UDP_TUNNEL;

         ret = mlx4_SET_PORT_VXLAN(priv->mdev->dev, priv->port,
                                   VXLAN_STEER_BY_OUTER_MAC, 0);
@@ -2578,13 +2588,6 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
         if (mdev->dev->caps.steering_mode != MLX4_STEERING_MODE_A0)
                 dev->priv_flags |= IFF_UNICAST_FLT;

-       if (mdev->dev->caps.tunnel_offload_mode == MLX4_TUNNEL_OFFLOAD_MODE_VXLAN) {
-               dev->hw_enc_features |= NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
-                                       NETIF_F_TSO | NETIF_F_GSO_UDP_TUNNEL;
-               dev->hw_features |= NETIF_F_GSO_UDP_TUNNEL;
-               dev->features    |= NETIF_F_GSO_UDP_TUNNEL;
-       }
-
         mdev->pndev[port] = dev;

         netif_carrier_off(dev);
