lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 22 Apr 2014 19:14:04 +0000
From:	"Christian Benvenuti (benve)" <benve@...co.com>
To:	David Gibson <dgibson@...hat.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
CC:	"Sujith Sankar (ssujith)" <ssujith@...co.com>,
	Govindarajulu Varadarajan <govindarajulu90@...il.com>,
	"Neel Patel (neepatel)" <neepatel@...co.com>,
	Nishank Trivedi <nistrive@...co.com>
Subject: RE: rtnetlink problems with Cisco enic and VFs

David,

> -----Original Message-----
> From: David Gibson [mailto:dgibson@...hat.com]
> Sent: Monday, April 21, 2014 9:14 PM
> To: netdev@...r.kernel.org
> Cc: Christian Benvenuti (benve); Sujith Sankar (ssujith); Govindarajulu
> Varadarajan; Neel Patel (neepatel); Nishank Trivedi
> Subject: RFC: rtnetlink problems with Cisco enic and VFs
> 
> I believe I've found a problem with netlink handling which can be triggered
> on Cisco enic devices with a large number (30-40) of virtual functions.  I
> believe this is the cause of a real customer problem we've seen.
> 
>  * When requesting a list of interfaces with RTM_GETLINK, enic devices
>    (and currently, _only_ enic devices) report IFLA_VF_PORTS
>    information

Is the fact that Enic is the only driver implementing ndo_get_vf_port [1]
the root cause of the problem and the reason why this happens only with Enic?

/Chris

[1]
This is what makes rtnl_port_size to account for vf_port_size*dev_num_vf(...)
and rtnl_port_fill to add IFLA_VF_PORTS.
 
>  * IFLA_VF_PORTS information has at least 90 bytes ber virtual function
> 
>  * Unlike IFLA_VFINFO_LIST, the ports information is always reported,
>    regardless of the setting of the IFLA_EXT_MASK parameter
> 
>  * When IFLA_EXT_MASK is not specified, the reply packets have maximum
>    size NLMSG_GOODSIZE (4k - overheads)
> 
>  * If there are enough virtual functions the IFLA_VF_PORTS information
>    can cause a single interface's info to exceed NLMSG_GOODSIZE
> 
>  * The number of interfaces necessary to trigger this is reduced
>    substantially if both IPv4 and IPv6 IFLA_AF_SPEC information is
>    present (~972 bytes)
> 
>  * If the dump function returns -EMSGSIZE on the first message in a
>    packet, netlink_dump() incorrectly assumes the listing is done,
>    omitting information for that interface and any later ones.
> 
>  * This can cause getifaddrs(3) to go into an infinite loop
> 
>  * 'ip link' is not affected, because it supplies IFLA_EXT_MASK which
>    triggers rtnl_calcit() to recalculate the required packet size to
>    greater than NLMSG_GOODSIZE.
> 
> 
> I can see several possible ways to fix this, but they all have possible
> problems.  I'm hoping someone here can determine which, if any, are real
> problems, and therefore what's the right approach to fix this.
> 
>     A) Always calculate the RTM_NEWLINK packet size, rather than
>        assuming NLMSG_GOODSIZE.
>     Problem: The NLMSG_GOODSIZE limit was introduced to stop broken
>              user tools with limited buffers encountering problems
>              (see 115c9b81928360d769a76c632bae62d15206a94a). This
>              approach might mean that such tools break again.
> 
>     B) Don't issue the VF port information when RTEXT_FILTER_VF isn't
>        set
>     Problem: Do tools using the port information already set this flag?
> 
>     C) Don't include the VF port info when listing interfaces, only
>        when doing GETLINK on a specific interface.
>     Problem: As (B), plus it's ugly.
> 
>     D) Detect the case when the first interface in a packet doesn't fit
>        reallocate the packet buffer
>     Problem: As (A), plus more complicated.
> 
> As an interim band-aid, here's a patch which adds a WARN_ON() in this
> situation, which will at least make the problem easier to locate for the next
> person to encounter it.
> 
> From: David Gibson <david@...son.dropbear.id.au>
> Subject: [PATCH] rtnetlink: Warn when interface's information won't fit
>          in our packet
> 
> Without IFLA_EXT_MASK specified, the information reported for a single
> interface in response to RTM_GETLINK is expected to fit within a netlink
> packet of NLMSG_GOODSIZE.
> 
> If it doesn't, however, things will go badly wrong,  When listing all
> interfaces, netlink_dump() will incorrectly treat -EMSGSIZE on the first
> message in a packet as the end of the listing and omit information for that
> interface and all subsequent ones.  This can cause getifaddrs(3) to enter an
> infinite loop.
> 
> This patch won't fix the problem, but it will WARN_ON() making it easier to
> track down what's going wrong.
> 
> 
> Signed-off-by: David Gibson <david@...son.dropbear.id.au>
> ---
>  net/core/rtnetlink.c | 16 +++++++++++-----
>  1 file changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index
> d4ff417..5331db2 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -1198,6 +1198,7 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb,
> struct netlink_callback *cb) struct hlist_head *head;
>  	struct nlattr *tb[IFLA_MAX+1];
>  	u32 ext_filter_mask = 0;
> +	int err;
> 
>  	s_h = cb->args[0];
>  	s_idx = cb->args[1];
> @@ -1218,11 +1219,16 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb,
> struct netlink_callback *cb) hlist_for_each_entry_rcu(dev, head,
> index_hlist) { if (idx < s_idx)
>  				goto cont;
> -			if (rtnl_fill_ifinfo(skb, dev, RTM_NEWLINK,
> -					     NETLINK_CB
> (cb->skb).portid,
> -					     cb->nlh->nlmsg_seq, 0,
> -					     NLM_F_MULTI,
> -					     ext_filter_mask) <= 0)
> +			err = rtnl_fill_ifinfo(skb, dev, RTM_NEWLINK,
> +					       NETLINK_CB
> (cb->skb).portid,
> +					       cb->nlh->nlmsg_seq, 0,
> +					       NLM_F_MULTI,
> +					       ext_filter_mask);
> +			/* If we ran out of room on the first message,
> +			 * we're in trouble */
> +			WARN_ON((err == -EMSGSIZE) && (skb->len == 0));
> +
> +			if (err <= 0)
>  				goto out;
> 
>  			nl_dump_check_consistent(cb, nlmsg_hdr(skb));
> --
> 1.9.0
> 
> 
> 
> --
> David Gibson <dgibson@...hat.com>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ