lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <DM5PR21MB01379C31FC628F4666413804D7A40@DM5PR21MB0137.namprd21.prod.outlook.com>
Date:   Fri, 23 Aug 2019 20:16:45 +0000
From:   Michael Kelley <mikelley@...rosoft.com>
To:     Dexuan Cui <decui@...rosoft.com>,
        "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        Sasha Levin <Alexander.Levin@...rosoft.com>,
        "sashal@...nel.org" <sashal@...nel.org>,
        Haiyang Zhang <haiyangz@...rosoft.com>,
        KY Srinivasan <kys@...rosoft.com>,
        "tglx@...utronix.de" <tglx@...utronix.de>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v3 11/12] Drivers: hv: vmbus: Suspend after cleaning up
 hv_sock and sub channels

From: Dexuan Cui <decui@...rosoft.com> Sent: Monday, August 19, 2019 6:52 PM
> 
> Before suspend, Linux must make sure all the hv_sock channels have been
> properly cleaned up, because a hv_sock connection can not persist across
> hibernation, and the user-space app must be properly notified of the
> state change of the connection.
> 
> Before suspend, Linux also must make sure all the sub-channels have been
> destroyed, i.e. the related channel structs of the sub-channels must be
> properly removed, otherwise they would cause a conflict when the
> sub-channels are recreated upon resume.
> 
> Add a counter to track such channels, and vmbus_bus_suspend() should wait
> for the counter to drop to zero.
> 
> Signed-off-by: Dexuan Cui <decui@...rosoft.com>
> ---
>  drivers/hv/channel_mgmt.c | 28 ++++++++++++++++++++++++++++
>  drivers/hv/connection.c   |  3 +++
>  drivers/hv/hyperv_vmbus.h | 12 ++++++++++++
>  drivers/hv/vmbus_drv.c    | 41 ++++++++++++++++++++++++++++++++++++++++-
>  4 files changed, 83 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> index f7a1184..8491d1b 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -499,6 +499,8 @@ static void vmbus_add_channel_work(struct work_struct *work)
>  	return;
> 
>  err_deq_chan:
> +	WARN_ON_ONCE(1);
> +

Why this change?  I was thinking maybe it's a debug statement that got
left in.

>  	mutex_lock(&vmbus_connection.channel_mutex);
> 
>  	/*
> @@ -545,6 +547,10 @@ static void vmbus_process_offer(struct vmbus_channel
> *newchannel)
> 
>  	mutex_lock(&vmbus_connection.channel_mutex);
> 
> +	/* Remember the channels that should be cleaned up upon suspend. */
> +	if (is_hvsock_channel(newchannel) || is_sub_channel(newchannel))
> +		atomic_inc(&vmbus_connection.nr_chan_close_on_suspend);
> +
>  	/*
>  	 * Now that we have acquired the channel_mutex,
>  	 * we can release the potentially racing rescind thread.
> @@ -915,6 +921,16 @@ static void vmbus_onoffer(struct
> vmbus_channel_message_header *hdr)
>  	vmbus_process_offer(newchannel);
>  }
> 
> +static void check_ready_for_suspend_event(void)
> +{
> +	/*
> +	 * If all the sub-channels or hv_sock channels have been cleaned up,
> +	 * then it's safe to suspend.
> +	 */
> +	if (atomic_dec_and_test(&vmbus_connection.nr_chan_close_on_suspend))
> +		complete(&vmbus_connection.ready_for_suspend_event);
> +}
> +
>  /*
>   * vmbus_onoffer_rescind - Rescind offer handler.
>   *
> @@ -925,6 +941,7 @@ static void vmbus_onoffer_rescind(struct
> vmbus_channel_message_header *hdr)
>  	struct vmbus_channel_rescind_offer *rescind;
>  	struct vmbus_channel *channel;
>  	struct device *dev;
> +	bool clean_up_chan_for_suspend;
> 
>  	rescind = (struct vmbus_channel_rescind_offer *)hdr;
> 
> @@ -964,6 +981,8 @@ static void vmbus_onoffer_rescind(struct
> vmbus_channel_message_header *hdr)
>  		return;
>  	}
> 
> +	clean_up_chan_for_suspend = is_hvsock_channel(channel) ||
> +				    is_sub_channel(channel);
>  	/*
>  	 * Before setting channel->rescind in vmbus_rescind_cleanup(), we
>  	 * should make sure the channel callback is not running any more.
> @@ -989,6 +1008,10 @@ static void vmbus_onoffer_rescind(struct
> vmbus_channel_message_header *hdr)
>  	if (channel->device_obj) {
>  		if (channel->chn_rescind_callback) {
>  			channel->chn_rescind_callback(channel);
> +
> +			if (clean_up_chan_for_suspend)
> +				check_ready_for_suspend_event();
> +
>  			return;
>  		}
>  		/*
> @@ -1021,6 +1044,11 @@ static void vmbus_onoffer_rescind(struct
> vmbus_channel_message_header *hdr)
>  		}
>  		mutex_unlock(&vmbus_connection.channel_mutex);
>  	}
> +
> +	/* The "channel" may have been freed. Do not access it any longer. */
> +
> +	if (clean_up_chan_for_suspend)
> +		check_ready_for_suspend_event();
>  }

Having to add the above lines twice is a bit clumsy, but the problem is
the overall structure of the vmbus_onoffer_rescind.  The early return in
the case of a rescind_callback function is a bit weird, but I guess it makes
sense since from what I can tell, only uio and hv_sock have rescind callback
functions.   Some minor restructuring might be warranted, but I don't feel
strongly about it.

> 
>  void vmbus_hvsock_device_unregister(struct vmbus_channel *channel)
> diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
> index 701d9a8..f15d3115 100644
> --- a/drivers/hv/connection.c
> +++ b/drivers/hv/connection.c
> @@ -26,6 +26,9 @@
>  struct vmbus_connection vmbus_connection = {
>  	.conn_state		= DISCONNECTED,
>  	.next_gpadl_handle	= ATOMIC_INIT(0xE1E10),
> +
> +	.ready_for_suspend_event= COMPLETION_INITIALIZER(
> +				  vmbus_connection.ready_for_suspend_event),
>  };
>  EXPORT_SYMBOL_GPL(vmbus_connection);
> 
> diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
> index 4610277..9f96e23 100644
> --- a/drivers/hv/hyperv_vmbus.h
> +++ b/drivers/hv/hyperv_vmbus.h
> @@ -258,6 +258,18 @@ struct vmbus_connection {
>  	struct workqueue_struct *work_queue;
>  	struct workqueue_struct *handle_primary_chan_wq;
>  	struct workqueue_struct *handle_sub_chan_wq;
> +
> +	/*
> +	 * The number of sub-channels and hv_sock channels that should be
> +	 * cleaned up upon suspend: sub-channels will be re-created upon
> +	 * resume, and hv_sock channels should not survive suspend.
> +	 */
> +	atomic_t nr_chan_close_on_suspend;
> +	/*
> +	 * vmbus_bus_suspend() waits for "nr_chan_close_on_suspend" to
> +	 * drop to zero.
> +	 */
> +	struct completion ready_for_suspend_event;
>  };
> 
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 2bea669..0507157 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -2127,7 +2127,8 @@ static int vmbus_acpi_add(struct acpi_device *device)
> 
>  static int vmbus_bus_suspend(struct device *dev)
>  {
> -	struct vmbus_channel *channel;
> +	struct vmbus_channel *channel, *sc;
> +	unsigned long flags;
> 
>  	while (atomic_read(&vmbus_connection.offer_in_progress) != 0) {
>  		/*
> @@ -2146,6 +2147,41 @@ static int vmbus_bus_suspend(struct device *dev)
>  	}
>  	mutex_unlock(&vmbus_connection.channel_mutex);
> 
> +	/*
> +	 * Wait until all the sub-channels and hv_sock channels have been
> +	 * cleaned up. Sub-channels should be destroyed upon suspend, otherwise
> +	 * they would conflict with the new sub-channels that will be created
> +	 * in the resume path. hv_sock channels should also be destroyed, but
> +	 * a hv_sock channel of an established hv_sock connection can not be
> +	 * really destroyed since it may still be referenced by the userspace
> +	 * application, so we just force the hv_sock channel to be rescinded
> +	 * by vmbus_force_channel_rescinded(), and the userspace application
> +	 * will thoroughly destroy the channel after hibernation.
> +	 */
> +	if (atomic_read(&vmbus_connection.nr_chan_close_on_suspend) > 0)

At first glance, the above line seemed useless to me.  You could just do the
wait_for_completion() unconditionally.  But is the intent to handle the case where
the VM never had any sub-channels or hv_sock channels, and so
nr_chan_close_on_suspend never went above 0?  If so, a comment might
be helpful.

> +		wait_for_completion(&vmbus_connection.ready_for_suspend_event);
> +
> +	mutex_lock(&vmbus_connection.channel_mutex);
> +
> +	list_for_each_entry(channel, &vmbus_connection.chn_list, listentry) {
> +		if (is_hvsock_channel(channel)) {
> +			if (!channel->rescind) {
> +				pr_err("hv_sock channel not rescinded!\n");
> +				WARN_ON_ONCE(1);
> +			}
> +			continue;
> +		}
> +
> +		spin_lock_irqsave(&channel->lock, flags);
> +		list_for_each_entry(sc, &channel->sc_list, sc_list) {
> +			pr_err("Sub-channel not deleted!\n");
> +			WARN_ON_ONCE(1);
> +		}
> +		spin_unlock_irqrestore(&channel->lock, flags);
> +	}
> +
> +	mutex_unlock(&vmbus_connection.channel_mutex);
> +
>  	vmbus_initiate_unload(false);
> 
>  	vmbus_connection.conn_state = DISCONNECTED;
> @@ -2186,6 +2222,9 @@ static int vmbus_bus_resume(struct device *dev)
> 
>  	vmbus_request_offers();
> 
> +	/* Reset the event for the next suspend. */
> +	reinit_completion(&vmbus_connection.ready_for_suspend_event);
> +
>  	return 0;
>  }
> 
> --
> 1.8.3.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ