lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <547DE418.9000309@cumulusnetworks.com>
Date:	Tue, 02 Dec 2014 08:08:56 -0800
From:	Roopa Prabhu <roopa@...ulusnetworks.com>
To:	Mahesh Bandewar <maheshb@...gle.com>
CC:	netdev <netdev@...r.kernel.org>,
	David Miller <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Toshiaki Makita <makita.toshiaki@....ntt.co.jp>
Subject: Re: [PATCH net-next] rtnetlink: delay RTM_DELLINK notification until
 after ndo_uninit()

On 12/1/14, 9:54 PM, Mahesh Bandewar wrote:
> The commit 56bfa7ee7c ("unregister_netdevice : move RTM_DELLINK to
> until after ndo_uninit") tried to do this ealier but while doing so
> it created a problem. Unfortunately the delayed rtmsg_ifinfo() also
> delayed call to fill_info(). So this translated into asking driver
> to remove private state and then quert it's private state. This
> could have catastropic consequences.
>
> This change breaks the rtmsg_ifinfo() into two parts - one fills the
> skb by calling fill_info() prior to calling ndo_uninit() and the second
> part just sends the message using the skb filled earlier.
>
> It was brought to notice when last link is deleted from an ipvlan device
> when it has free-ed the port and the subsequent .fill_info() call is
> trying to get the info from the port.
>
> kernel: [  255.139429] ------------[ cut here ]------------
> kernel: [  255.139439] WARNING: CPU: 12 PID: 11173 at net/core/rtnetlink.c:2238 rtmsg_ifinfo+0x100/0x110()
> kernel: [  255.139493] Modules linked in: ipvlan bonding w1_therm ds2482 wire cdc_acm ehci_pci ehci_hcd i2c_dev i2c_i801 i2c_core msr cpuid bnx2x ptp pps_core mdio libcrc32c
> kernel: [  255.139513] CPU: 12 PID: 11173 Comm: ip Not tainted 3.18.0-smp-DEV #167
> kernel: [  255.139514] Hardware name: Intel RML,PCH/Ibis_QC_18, BIOS 1.0.10 05/15/2012
> kernel: [  255.139515]  0000000000000009 ffff880851b6b828 ffffffff815d87f4 00000000000000e0
> kernel: [  255.139516]  0000000000000000 ffff880851b6b868 ffffffff8109c29c 0000000000000000
> kernel: [  255.139518]  00000000ffffffa6 00000000000000d0 ffffffff81aaf580 0000000000000011
> kernel: [  255.139520] Call Trace:
> kernel: [  255.139527]  [<ffffffff815d87f4>] dump_stack+0x46/0x58
> kernel: [  255.139531]  [<ffffffff8109c29c>] warn_slowpath_common+0x8c/0xc0
> kernel: [  255.139540]  [<ffffffff8109c2ea>] warn_slowpath_null+0x1a/0x20
> kernel: [  255.139544]  [<ffffffff8150d570>] rtmsg_ifinfo+0x100/0x110
> kernel: [  255.139547]  [<ffffffff814f78b5>] rollback_registered_many+0x1d5/0x2d0
> kernel: [  255.139549]  [<ffffffff814f79cf>] unregister_netdevice_many+0x1f/0xb0
> kernel: [  255.139551]  [<ffffffff8150acab>] rtnl_dellink+0xbb/0x110
> kernel: [  255.139553]  [<ffffffff8150da90>] rtnetlink_rcv_msg+0xa0/0x240
> kernel: [  255.139557]  [<ffffffff81329283>] ? rhashtable_lookup_compare+0x43/0x80
> kernel: [  255.139558]  [<ffffffff8150d9f0>] ? __rtnl_unlock+0x20/0x20
> kernel: [  255.139562]  [<ffffffff8152cb11>] netlink_rcv_skb+0xb1/0xc0
> kernel: [  255.139563]  [<ffffffff8150a495>] rtnetlink_rcv+0x25/0x40
> kernel: [  255.139565]  [<ffffffff8152c398>] netlink_unicast+0x178/0x230
> kernel: [  255.139567]  [<ffffffff8152c75f>] netlink_sendmsg+0x30f/0x420
> kernel: [  255.139571]  [<ffffffff814e0b0c>] sock_sendmsg+0x9c/0xd0
> kernel: [  255.139575]  [<ffffffff811d1d7f>] ? rw_copy_check_uvector+0x6f/0x130
> kernel: [  255.139577]  [<ffffffff814e11c9>] ? copy_msghdr_from_user+0x139/0x1b0
> kernel: [  255.139578]  [<ffffffff814e1774>] ___sys_sendmsg+0x304/0x310
> kernel: [  255.139581]  [<ffffffff81198723>] ? handle_mm_fault+0xca3/0xde0
> kernel: [  255.139585]  [<ffffffff811ebc4c>] ? destroy_inode+0x3c/0x70
> kernel: [  255.139589]  [<ffffffff8108e6ec>] ? __do_page_fault+0x20c/0x500
> kernel: [  255.139597]  [<ffffffff811e8336>] ? dput+0xb6/0x190
> kernel: [  255.139606]  [<ffffffff811f05f6>] ? mntput+0x26/0x40
> kernel: [  255.139611]  [<ffffffff811d2b94>] ? __fput+0x174/0x1e0
> kernel: [  255.139613]  [<ffffffff814e2129>] __sys_sendmsg+0x49/0x90
> kernel: [  255.139615]  [<ffffffff814e2182>] SyS_sendmsg+0x12/0x20
> kernel: [  255.139617]  [<ffffffff815df092>] system_call_fastpath+0x12/0x17
> kernel: [  255.139619] ---[ end trace 5e6703e87d984f6b ]---

interestingly I have never seen this. We use this heavily with most 
other logical devices.
Which tells me most logical devices do have checks in their fill_info.
The patch idea is good. My only concern is stale information
in the DELLINK notification. Because,  ndo_uninit() does do a lot of 
cleanup, sending
newlink's for some of these cleanup changes. And now with your patch the 
dellink notification
skb probably  contains information that has been already deleted by 
ndo_uninit ?





>
> Signed-off-by: Mahesh Bandewar <maheshb@...gle.com>
> Report-by: Toshiaki Makita <makita.toshiaki@....ntt.co.jp>
> Cc: Eric Dumazet <edumazet@...gle.com>
> Cc: Roopa Prabhu <roopa@...ulusnetworks.com>
> Cc: David S. Miller <davem@...emloft.net>
> ---
>   drivers/net/bonding/bond_main.c |  4 ++--
>   include/linux/rtnetlink.h       |  6 +++++-
>   include/net/bonding.h           |  8 ++++----
>   net/core/dev.c                  | 26 ++++++++++++++++----------
>   net/core/rtnetlink.c            | 20 ++++++++++++++++----
>   5 files changed, 43 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
> index 184c434ae305..06206a1439a4 100644
> --- a/drivers/net/bonding/bond_main.c
> +++ b/drivers/net/bonding/bond_main.c
> @@ -1135,7 +1135,7 @@ static int bond_master_upper_dev_link(struct net_device *bond_dev,
>   	if (err)
>   		return err;
>   	slave_dev->flags |= IFF_SLAVE;
> -	rtmsg_ifinfo(RTM_NEWLINK, slave_dev, IFF_SLAVE, GFP_KERNEL);
> +	rtmsg_ifinfo(RTM_NEWLINK, slave_dev, IFF_SLAVE, GFP_KERNEL, false);
>   	return 0;
>   }
>   
> @@ -1144,7 +1144,7 @@ static void bond_upper_dev_unlink(struct net_device *bond_dev,
>   {
>   	netdev_upper_dev_unlink(slave_dev, bond_dev);
>   	slave_dev->flags &= ~IFF_SLAVE;
> -	rtmsg_ifinfo(RTM_NEWLINK, slave_dev, IFF_SLAVE, GFP_KERNEL);
> +	rtmsg_ifinfo(RTM_NEWLINK, slave_dev, IFF_SLAVE, GFP_KERNEL, false);
>   }
>   
>   static struct slave *bond_alloc_slave(struct bonding *bond)
> diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
> index 6cacbce1a06c..545dd0b8c83d 100644
> --- a/include/linux/rtnetlink.h
> +++ b/include/linux/rtnetlink.h
> @@ -16,7 +16,11 @@ extern int rtnetlink_put_metrics(struct sk_buff *skb, u32 *metrics);
>   extern int rtnl_put_cacheinfo(struct sk_buff *skb, struct dst_entry *dst,
>   			      u32 id, long expires, u32 error);
>   
> -void rtmsg_ifinfo(int type, struct net_device *dev, unsigned change, gfp_t flags);
> +struct sk_buff *rtmsg_ifinfo(int type, struct net_device *dev, unsigned change,
> +			     gfp_t flags, bool fill_only);
> +void rtmsg_ifinfo_send(struct sk_buff *skb, struct net_device *dev,
> +		       gfp_t flags);
> +
>   
>   /* RTNL is used as a global lock for all changes to network configuration  */
>   extern void rtnl_lock(void);
> diff --git a/include/net/bonding.h b/include/net/bonding.h
> index 983a94b86b95..ea09f6c5af51 100644
> --- a/include/net/bonding.h
> +++ b/include/net/bonding.h
> @@ -315,7 +315,7 @@ static inline void bond_set_active_slave(struct slave *slave)
>   {
>   	if (slave->backup) {
>   		slave->backup = 0;
> -		rtmsg_ifinfo(RTM_NEWLINK, slave->dev, 0, GFP_ATOMIC);
> +		rtmsg_ifinfo(RTM_NEWLINK, slave->dev, 0, GFP_ATOMIC, false);
>   	}
>   }
>   
> @@ -323,7 +323,7 @@ static inline void bond_set_backup_slave(struct slave *slave)
>   {
>   	if (!slave->backup) {
>   		slave->backup = 1;
> -		rtmsg_ifinfo(RTM_NEWLINK, slave->dev, 0, GFP_ATOMIC);
> +		rtmsg_ifinfo(RTM_NEWLINK, slave->dev, 0, GFP_ATOMIC, false);
>   	}
>   }
>   
> @@ -335,7 +335,7 @@ static inline void bond_set_slave_state(struct slave *slave,
>   
>   	slave->backup = slave_state;
>   	if (notify) {
> -		rtmsg_ifinfo(RTM_NEWLINK, slave->dev, 0, GFP_ATOMIC);
> +		rtmsg_ifinfo(RTM_NEWLINK, slave->dev, 0, GFP_ATOMIC, false);
>   		slave->should_notify = 0;
>   	} else {
>   		if (slave->should_notify)
> @@ -365,7 +365,7 @@ static inline void bond_slave_state_notify(struct bonding *bond)
>   
>   	bond_for_each_slave(bond, tmp, iter) {
>   		if (tmp->should_notify) {
> -			rtmsg_ifinfo(RTM_NEWLINK, tmp->dev, 0, GFP_ATOMIC);
> +			rtmsg_ifinfo(RTM_NEWLINK, tmp->dev, 0, GFP_ATOMIC, false);
>   			tmp->should_notify = 0;
>   		}
>   	}
> diff --git a/net/core/dev.c b/net/core/dev.c
> index ac4836241a96..29bc78d5e6cb 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1230,7 +1230,7 @@ void netdev_state_change(struct net_device *dev)
>   		change_info.flags_changed = 0;
>   		call_netdevice_notifiers_info(NETDEV_CHANGE, dev,
>   					      &change_info.info);
> -		rtmsg_ifinfo(RTM_NEWLINK, dev, 0, GFP_KERNEL);
> +		rtmsg_ifinfo(RTM_NEWLINK, dev, 0, GFP_KERNEL, false);
>   	}
>   }
>   EXPORT_SYMBOL(netdev_state_change);
> @@ -1319,7 +1319,7 @@ int dev_open(struct net_device *dev)
>   	if (ret < 0)
>   		return ret;
>   
> -	rtmsg_ifinfo(RTM_NEWLINK, dev, IFF_UP|IFF_RUNNING, GFP_KERNEL);
> +	rtmsg_ifinfo(RTM_NEWLINK, dev, IFF_UP|IFF_RUNNING, GFP_KERNEL, false);
>   	call_netdevice_notifiers(NETDEV_UP, dev);
>   
>   	return ret;
> @@ -1396,7 +1396,8 @@ static int dev_close_many(struct list_head *head)
>   	__dev_close_many(head);
>   
>   	list_for_each_entry_safe(dev, tmp, head, close_list) {
> -		rtmsg_ifinfo(RTM_NEWLINK, dev, IFF_UP|IFF_RUNNING, GFP_KERNEL);
> +		rtmsg_ifinfo(RTM_NEWLINK, dev, IFF_UP|IFF_RUNNING,
> +			     GFP_KERNEL, false);
>   		call_netdevice_notifiers(NETDEV_DOWN, dev);
>   		list_del_init(&dev->close_list);
>   	}
> @@ -5683,7 +5684,7 @@ void __dev_notify_flags(struct net_device *dev, unsigned int old_flags,
>   	unsigned int changes = dev->flags ^ old_flags;
>   
>   	if (gchanges)
> -		rtmsg_ifinfo(RTM_NEWLINK, dev, gchanges, GFP_ATOMIC);
> +		rtmsg_ifinfo(RTM_NEWLINK, dev, gchanges, GFP_ATOMIC, false);
>   
>   	if (changes & IFF_UP) {
>   		if (dev->flags & IFF_UP)
> @@ -5925,6 +5926,8 @@ static void rollback_registered_many(struct list_head *head)
>   	synchronize_net();
>   
>   	list_for_each_entry(dev, head, unreg_list) {
> +		struct sk_buff *skb = NULL;
> +
>   		/* Shutdown queueing discipline. */
>   		dev_shutdown(dev);
>   
> @@ -5934,6 +5937,10 @@ static void rollback_registered_many(struct list_head *head)
>   		*/
>   		call_netdevice_notifiers(NETDEV_UNREGISTER, dev);
>   
> +		if (!dev->rtnl_link_ops ||
> +		    dev->rtnl_link_state == RTNL_LINK_INITIALIZED)
> +			skb = rtmsg_ifinfo(RTM_DELLINK, dev, ~0U, GFP_KERNEL, true);
> +
>   		/*
>   		 *	Flush the unicast and multicast chains
>   		 */
> @@ -5943,9 +5950,8 @@ static void rollback_registered_many(struct list_head *head)
>   		if (dev->netdev_ops->ndo_uninit)
>   			dev->netdev_ops->ndo_uninit(dev);
>   
> -		if (!dev->rtnl_link_ops ||
> -		    dev->rtnl_link_state == RTNL_LINK_INITIALIZED)
> -			rtmsg_ifinfo(RTM_DELLINK, dev, ~0U, GFP_KERNEL);
> +		if (skb)
> +			rtmsg_ifinfo_send(skb, dev, GFP_KERNEL);
>   
>   		/* Notifier chain MUST detach us all upper devices. */
>   		WARN_ON(netdev_has_any_upper_dev(dev));
> @@ -6334,7 +6340,7 @@ int register_netdevice(struct net_device *dev)
>   	 */
>   	if (!dev->rtnl_link_ops ||
>   	    dev->rtnl_link_state == RTNL_LINK_INITIALIZED)
> -		rtmsg_ifinfo(RTM_NEWLINK, dev, ~0U, GFP_KERNEL);
> +		rtmsg_ifinfo(RTM_NEWLINK, dev, ~0U, GFP_KERNEL, false);
>   
>   out:
>   	return ret;
> @@ -6959,7 +6965,7 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
>   	call_netdevice_notifiers(NETDEV_UNREGISTER, dev);
>   	rcu_barrier();
>   	call_netdevice_notifiers(NETDEV_UNREGISTER_FINAL, dev);
> -	rtmsg_ifinfo(RTM_DELLINK, dev, ~0U, GFP_KERNEL);
> +	rtmsg_ifinfo(RTM_DELLINK, dev, ~0U, GFP_KERNEL, false);
>   
>   	/*
>   	 *	Flush the unicast and multicast chains
> @@ -7000,7 +7006,7 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char
>   	 *	Prevent userspace races by waiting until the network
>   	 *	device is fully setup before sending notifications.
>   	 */
> -	rtmsg_ifinfo(RTM_NEWLINK, dev, ~0U, GFP_KERNEL);
> +	rtmsg_ifinfo(RTM_NEWLINK, dev, ~0U, GFP_KERNEL, false);
>   
>   	synchronize_net();
>   	err = 0;
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index b9b7dfaf202b..1035d8cdbc08 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -2220,8 +2220,16 @@ static int rtnl_dump_all(struct sk_buff *skb, struct netlink_callback *cb)
>   	return skb->len;
>   }
>   
> -void rtmsg_ifinfo(int type, struct net_device *dev, unsigned int change,
> -		  gfp_t flags)
> +void rtmsg_ifinfo_send(struct sk_buff *skb, struct net_device *dev, gfp_t flags)
> +{
> +	struct net *net = dev_net(dev);
> +
> +	rtnl_notify(skb, net, 0, RTNLGRP_LINK, NULL, flags);
> +}
> +EXPORT_SYMBOL(rtmsg_ifinfo_send);
> +
> +struct sk_buff *rtmsg_ifinfo(int type, struct net_device *dev,
> +			     unsigned int change, gfp_t flags, bool fill_only)
>   {
>   	struct net *net = dev_net(dev);
>   	struct sk_buff *skb;
> @@ -2239,11 +2247,15 @@ void rtmsg_ifinfo(int type, struct net_device *dev, unsigned int change,
>   		kfree_skb(skb);
>   		goto errout;
>   	}
> +	if (fill_only)
> +	    return skb;
> +
>   	rtnl_notify(skb, net, 0, RTNLGRP_LINK, NULL, flags);
> -	return;
> +	return NULL;
>   errout:
>   	if (err < 0)
>   		rtnl_set_sk_err(net, RTNLGRP_LINK, err);
> +	return NULL;
>   }
>   EXPORT_SYMBOL(rtmsg_ifinfo);
>   
> @@ -3011,7 +3023,7 @@ static int rtnetlink_event(struct notifier_block *this, unsigned long event, voi
>   	case NETDEV_JOIN:
>   		break;
>   	default:
> -		rtmsg_ifinfo(RTM_NEWLINK, dev, 0, GFP_KERNEL);
> +		rtmsg_ifinfo(RTM_NEWLINK, dev, 0, GFP_KERNEL, false);
>   		break;
>   	}
>   	return NOTIFY_DONE;

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ