Message-ID: <20151023141738.4db54324@griffin>
Date:	Fri, 23 Oct 2015 14:17:38 +0200
From:	Jiri Benc <jbenc@...hat.com>
To:	Pravin B Shelar <pshelar@...ira.com>
Cc:	netdev@...r.kernel.org
Subject: Re: [PATCH net v3] openvswitch: Fix egress tunnel info.

On Thu, 22 Oct 2015 18:17:16 -0700, Pravin B Shelar wrote:
> While transitioning to netdev based vport we broke OVS
> feature which allows user to retrieve tunnel packet egress
> information for lwtunnel devices.  Following patch fixes it
> by introducing ndo operation to get the tunnel egress info.
> Same ndo operation can be used for lwtunnel devices and compat
> ovs-tnl-vport devices. So after adding such device operation
> we can remove similar operation from ovs-vport.
> 
> Fixes: 614732eaa12d ("openvswitch: Use regular VXLAN net_device device").
> Signed-off-by: Pravin B Shelar <pshelar@...ira.com>
> --
> v2-v3:
> - Remove unused tun_info
> v1-v2:
> - changed ndo operation name to ndo_fill_metadata_dst()
> - Fix geneve stats update

This looks good overall, thanks. I see some issues with the patch, but
most of them can be fixed in net-next.git. See below.

[...]
> --- a/drivers/net/vxlan.c
> +++ b/drivers/net/vxlan.c
> @@ -2337,6 +2337,46 @@ static int vxlan_change_mtu(struct net_device *dev, int new_mtu)
>  	return 0;
>  }
>  
> +static int egress_ipv4_tun_info(struct net_device *dev, struct sk_buff *skb,
> +				struct ip_tunnel_info *info,
> +				__be16 sport, __be16 dport)
> +{
> +	struct vxlan_dev *vxlan = netdev_priv(dev);
> +	struct rtable *rt;
> +	struct flowi4 fl4;
> +
> +	memset(&fl4, 0, sizeof(fl4));
> +	fl4.flowi4_tos = RT_TOS(info->key.tos);
> +	fl4.flowi4_mark = skb->mark;
> +	fl4.flowi4_proto = IPPROTO_UDP;
> +	fl4.daddr = info->key.u.ipv4.dst;
> +
> +	rt = ip_route_output_key(vxlan->net, &fl4);
> +	if (IS_ERR(rt))
> +		return PTR_ERR(rt);
> +	ip_rt_put(rt);
> +
> +	info->key.u.ipv4.src = fl4.saddr;
> +	info->key.tp_src = sport;
> +	info->key.tp_dst = dport;
> +	return 0;
> +}

Do you plan to address the introduced code duplication for net-next.git?

> +
> +static int vxlan_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)
> +{
> +	struct vxlan_dev *vxlan = netdev_priv(dev);
> +	struct ip_tunnel_info *info = skb_tunnel_info(skb);
> +	__be16 sport, dport;
> +
> +	sport = udp_flow_src_port(dev_net(dev), skb, vxlan->cfg.port_min,
> +				  vxlan->cfg.port_max, true);
> +	dport = info->key.tp_dst ? : vxlan->cfg.dst_port;
> +
> +	if (ip_tunnel_info_af(info) == AF_INET)
> +		return egress_ipv4_tun_info(dev, skb, info, sport, dport);
> +	return -EINVAL;

What about IPv6? Metadata based vxlan already supports IPv6 in net.git,
so this should support it, too. That said, the only current user is
ovs, which got IPv6 support only in net-next.git, so it may be enough
to fix it there.

[...]
> --- a/include/net/dst_metadata.h
> +++ b/include/net/dst_metadata.h
[...]
> +static inline struct ip_tunnel_info *skb_tunnel_info_unclone(struct sk_buff *skb)
> +{
> +	struct metadata_dst *dst;
> +
> +	dst = tun_dst_unclone(skb);
> +	if (IS_ERR(dst))
> +		return NULL;
> +
> +	return &dst->u.tun_info;
> +}

This doesn't do what the name suggests and is actually ovs specific.
The ip_tunnel_info can also be provided as part of lwtstate, and this
function should handle that case, too. This is not a problem for
net.git, as the function just returns EINVAL in such a case, but it
should be addressed for net-next.git. As ovs is currently the only
user, I'd also be fine with just a comment stating that, so it's clear
to future users of this function that it needs to be extended before it
can be used outside of ovs.

[...]
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -99,6 +99,7 @@
>  #include <linux/rtnetlink.h>
>  #include <linux/stat.h>
>  #include <net/dst.h>
> +#include <net/dst_metadata.h>
>  #include <net/pkt_sched.h>
>  #include <net/checksum.h>
>  #include <net/xfrm.h>
> @@ -682,6 +683,32 @@ int dev_get_iflink(const struct net_device *dev)
>  EXPORT_SYMBOL(dev_get_iflink);
>  
>  /**
> + *	dev_fill_metadata_dst - Retrieve tunnel egress information.
> + *	@dev: targeted interface
> + *	@skb: The packet.
> + *
> + *	For better visibility of tunnel traffic OVS needs to retrieve
> + *	egress tunnel information for a packet. Following API allows
> + *	user to get this info.
> + */
> +int dev_fill_metadata_dst(struct net_device *dev, struct sk_buff *skb)
> +{
> +	struct ip_tunnel_info *info;
> +
> +	if (!dev->netdev_ops  || !dev->netdev_ops->ndo_fill_metadata_dst)
> +		return -EINVAL;
> +
> +	info = skb_tunnel_info_unclone(skb);
> +	if (!info)
> +		return -ENOMEM;

ENOMEM is the wrong error code to return here. skb_tunnel_info_unclone
should pass on the error code returned by tun_dst_unclone, in particular
the EINVAL case, which will be much more common than the ENOMEM case.
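Something along these lines is what I mean. This is a userspace sketch
only: the ERR_PTR helpers below are simplified stand-ins for the kernel
ones, and fake_unclone(), info_unclone() and fill_metadata() are made-up
names standing in for tun_dst_unclone(), skb_tunnel_info_unclone() and
dev_fill_metadata_dst().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline void *ERR_CAST(const void *ptr)
{
	return (void *)ptr;
}

/* Hypothetical, stripped-down types. */
struct tun_info { unsigned int mode; };
struct metadata_dst { struct tun_info tun_info; };

/* Stand-in for the unclone step: fails with the errno the caller picks. */
static struct metadata_dst *fake_unclone(struct metadata_dst *md, int fail_with)
{
	assert(md != NULL);
	if (fail_with)
		return ERR_PTR(-fail_with);
	return md;
}

/* Pass the encoded error through instead of flattening it to NULL. */
static struct tun_info *info_unclone(struct metadata_dst *md, int fail_with)
{
	struct metadata_dst *dst = fake_unclone(md, fail_with);

	if (IS_ERR(dst))
		return ERR_CAST(dst);
	return &dst->tun_info;
}

/* The caller reports whatever error the lower layer produced. */
static int fill_metadata(struct metadata_dst *md, int fail_with)
{
	struct tun_info *info = info_unclone(md, fail_with);

	if (IS_ERR(info))
		return (int)PTR_ERR(info);
	return 0;
}
```

This way the caller sees EINVAL and ENOMEM as distinct errors without
having to guess.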

> +	if (unlikely(!(info->mode & IP_TUNNEL_INFO_TX)))
> +		return -EINVAL;

It would be much better to check the mode before copying the metadata.
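As a rough userspace illustration of the ordering (not the actual
kernel code): test the mode on the shared info first and only copy when
the check passes. TUNNEL_INFO_TX, info_copy() and fill_checked_first()
are made-up stand-ins, and copies_made exists only to make the effect
visible.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for IP_TUNNEL_INFO_TX; the value is arbitrary here. */
#define TUNNEL_INFO_TX 0x01

/* Hypothetical, stripped-down tunnel info. */
struct tun_info {
	unsigned int mode;
	unsigned char key[16];
};

/* Counts how many copies were made, for illustration only. */
static int copies_made;

/* Stand-in for the (comparatively expensive) unclone/copy step. */
static struct tun_info *info_copy(const struct tun_info *src)
{
	struct tun_info *dst = malloc(sizeof(*dst));

	if (!dst)
		return NULL;
	memcpy(dst, src, sizeof(*dst));
	copies_made++;
	return dst;
}

/* Reject non-TX metadata before paying for the copy. */
static int fill_checked_first(const struct tun_info *shared,
			      struct tun_info **out)
{
	assert(shared != NULL && out != NULL);

	if (!(shared->mode & TUNNEL_INFO_TX))
		return -EINVAL;

	*out = info_copy(shared);
	if (!*out)
		return -ENOMEM;
	return 0;
}
```

The rejected packet then costs one flag test and no allocation.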

[...]
> --- a/net/openvswitch/flow_netlink.c
> +++ b/net/openvswitch/flow_netlink.c
[...]
> @@ -749,13 +749,12 @@ static int ipv4_tun_to_nlattr(struct sk_buff *skb,
>  	return 0;
>  }
>  
> -int ovs_nla_put_egress_tunnel_key(struct sk_buff *skb,
> -				  const struct ip_tunnel_info *egress_tun_info,
> -				  const void *egress_tun_opts)
> +int ovs_nla_put_tunnel_info(struct sk_buff *skb,
> +			    struct ip_tunnel_info *tun_info)
>  {
> -	return __ipv4_tun_to_nlattr(skb, &egress_tun_info->key,
> -				    egress_tun_opts,
> -				    egress_tun_info->options_len);
> +	return __ipv4_tun_to_nlattr(skb, &tun_info->key,
> +				    ip_tunnel_info_opts(tun_info),
> +				    tun_info->options_len);
>  }

This should at least check whether the tun_info is indeed IPv4. Actual
IPv6 support for this function can be added to net-next.git.

 Jiri

-- 
Jiri Benc