Message-ID: <637ce8e1-89e5-4254-a928-4ab2a4eb8c29@CMEXHTCAS1.ad.emulex.com>
Date:	Thu, 6 Nov 2014 07:21:57 +0000
From:	Sathya Perla <Sathya.Perla@...lex.Com>
To:	Or Gerlitz <ogerlitz@...lanox.com>,
	Florian Westphal <fw@...len.de>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	Tom Herbert <therbert@...gle.com>,
	Jesse Gross <jesse@...ira.com>
CC:	"amirv@...lanox.com" <amirv@...lanox.com>,
	Sathya Perla <Sathya.Perla@...lex.Com>
Subject: RE: mlx4+vxlan offload breaks gre tunnels

> -----Original Message-----
> From: netdev-owner@...r.kernel.org [mailto:netdev-
> 
> On 11/5/2014 5:04 PM, Florian Westphal wrote:
> > tl;dr: all tcp packets sent via gre tunnel have broken tcp csum if vxlan offload
> > is enabled with mlx4 driver.
> >
> > Given following config on tx-side:
> > dev=enp3s0
> > ip addr add dev $dev 192.168.23.1/24
> > ip link set $dev up
> > ip link add mygre type gretap remote 192.168.23.2 local 192.168.23.1
> > ip addr add dev mygre 192.168.42.1/24
> > ip link set gre0 up
> > ip link set mygre up
> >
> > and
> >
> > options mlx4_core log_num_mgm_entry_size=-1 debug_level=1
> > port_type_array=2,2
> >
> > in
> > /etc/modprobe.d/mlx4.conf
> >
> > all tcp packets sent to destinations over the gre tunnel have bogus tcp
> > checksums (and are tossed on rx side when stack validates tcp checksum).
> >
> > net-next head is commit 30349bdbc4da5ecf0efa25556e3caff9c9b8c5f7 .
> >
> > What makes things work for me:
> > either
> >
> > options mlx4_core 1 debug_level=1 port_type_array=2,2
> >
> > (ie. no MLX4_TUNNEL_OFFLOAD_MODE_VXLAN)
> >
> > or not setting NETIF_F_IP_CSUM in enc_features:
> >
> > --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> > +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> > @@ -2579,10 +2579,12 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
> >                  dev->priv_flags |= IFF_UNICAST_FLT;
> >
> >          if (mdev->dev->caps.tunnel_offload_mode == MLX4_TUNNEL_OFFLOAD_MODE_VXLAN) {
> > -               dev->hw_enc_features |= NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
> > +               dev->hw_enc_features |= NETIF_F_RXCSUM |
> >                                          NETIF_F_TSO | NETIF_F_GSO_UDP_TUNNEL;
> >
> > I am not sure if its right fix, but to my eyes this basically looks like
> > mlx4 is telling stack that it can handle tcp checksum offload within
> > tunnels, and that doesn't seem to be the case for all types (e.g. gre).
> >
> > Could someone who understand the enc_features specifics better confirm that
> > above patch is correct (or provide a better/proper fix)?
> 
> Yep, I can see now the problem. It comes into play with ConnectX3-pro
> NICs that support VXLAN offloads (but not with ConnectX3 NIC which
> don't) when you enable the offloads support on the CX3-pro.
> 
> The problem originates from the fact that we can't advertize something
> like "the HW can offload the inner checksum of UDP/VXLAN encapsulated
> (but not for GRE)", e.g in a similar manner that exists in the GSO
> space, where you have NETIF_F_GSO _YYY for each yyy in {UDP, SIT, GRE,
> etc} tunneling scheme.
> 
> I think the best effort we can do now is
> 
> 1. come up with something such as the below patch for 3.18 which is
> back-ward portable for -stable kernels, it will only arm the hw offloads
> if the OS tells us there's VXLAN in action

Or, wouldn't the patch below fail to help (i.e., the same issue would persist)
when there is both a VXLAN tunnel and some other (say GRE) tunnel in the
system, and the NIC HW is capable of checksum offload only for VXLAN?

Do you expect a user who uses VXLAN not to use other kinds of tunnels?

> 
> 2. come  up with proper kernel APIs to let NICs advertize which encap
> schemes they can actually offload the inner checksum, Tom... your work
> which now runs over netdev.
> 
> Tom/Jesse- thoughts? are you +1-ing the below approach?
> 
> Or.
> 
> tested to work with the  following which is a bit different, tell me if
> it works for you
> 
> # node A - with mlx4_en address 192.168.31.18
> ip tunnel add gre1 mode gre local 192.168.31.18 remote 192.168.31.17 ttl 255
> ifconfig gre1 10.10.10.18/24 up
> ifconfig gre1 mtu 1450
> 
> # node B - with mlx4_en address 192.168.31.17
> ip tunnel add gre1 mode gre local 192.168.31.17 remote 192.168.31.18 ttl 255
> ifconfig gre1 10.10.10.17/24 up
> ifconfig gre1 mtu 1450
> 
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> index 0efbae9..7753833 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
> @@ -2292,6 +2292,12 @@ static void mlx4_en_add_vxlan_offloads(struct work_struct *work)
>   out:
>          if (ret)
>                  en_err(priv, "failed setting L2 tunnel configuration ret %d\n", ret);
> +
> +       /* set offloads */
> +       priv->dev->hw_enc_features |= NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
> +                                     NETIF_F_TSO | NETIF_F_GSO_UDP_TUNNEL;
> +       priv->dev->hw_features |= NETIF_F_GSO_UDP_TUNNEL;
> +       priv->dev->features    |= NETIF_F_GSO_UDP_TUNNEL;
>   }
> 
>   static void mlx4_en_del_vxlan_offloads(struct work_struct *work)
> @@ -2299,6 +2305,10 @@ static void mlx4_en_del_vxlan_offloads(struct work_struct *work)
>          int ret;
>          struct mlx4_en_priv *priv = container_of(work, struct mlx4_en_priv,
> vxlan_del_task);
> +       /* unset offloads */
> +       priv->dev->hw_enc_features = 0;
> +       priv->dev->hw_features &= ~NETIF_F_GSO_UDP_TUNNEL;
> +       priv->dev->features    &= ~NETIF_F_GSO_UDP_TUNNEL;
> 
>          ret = mlx4_SET_PORT_VXLAN(priv->mdev->dev, priv->port,
>                                    VXLAN_STEER_BY_OUTER_MAC, 0);
> @@ -2578,13 +2588,6 @@ int mlx4_en_init_netdev(struct mlx4_en_dev *mdev, int port,
>          if (mdev->dev->caps.steering_mode != MLX4_STEERING_MODE_A0)
>                  dev->priv_flags |= IFF_UNICAST_FLT;
> 
> -       if (mdev->dev->caps.tunnel_offload_mode == MLX4_TUNNEL_OFFLOAD_MODE_VXLAN) {
> -               dev->hw_enc_features |= NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
> -                                       NETIF_F_TSO | NETIF_F_GSO_UDP_TUNNEL;
> -               dev->hw_features |= NETIF_F_GSO_UDP_TUNNEL;
> -               dev->features    |= NETIF_F_GSO_UDP_TUNNEL;
> -       }
> -
>          mdev->pndev[port] = dev;
> 
>          netif_carrier_off(dev);
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html