[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <PU1P153MB01699145D8BDE54ABAFF7EF5BFEE0@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>
Date: Fri, 14 Jun 2019 23:08:39 +0000
From: Dexuan Cui <decui@...rosoft.com>
To: Michael Kelley <mikelley@...rosoft.com>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
Stephen Hemminger <sthemmin@...rosoft.com>,
Sasha Levin <Alexander.Levin@...rosoft.com>,
Haiyang Zhang <haiyangz@...rosoft.com>,
KY Srinivasan <kys@...rosoft.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Tianyu Lan <Tianyu.Lan@...rosoft.com>
CC: "olaf@...fle.de" <olaf@...fle.de>,
"apw@...onical.com" <apw@...onical.com>,
"jasowang@...hat.com" <jasowang@...hat.com>,
vkuznets <vkuznets@...hat.com>,
"marcelo.cerri@...onical.com" <marcelo.cerri@...onical.com>
Subject: RE: [PATCH 2/2] hv_balloon: Reorganize the probe function
> From: Michael Kelley <mikelley@...rosoft.com>
> Sent: Friday, June 14, 2019 2:56 PM
> > ...
> > + ret = balloon_connect_vsp(dev);
> > + if (ret != 0)
> > + return ret;
> > +
> > dm_device.state = DM_INITIALIZED;
> > - last_post_time = jiffies;
>
> I was curious about the above deletion. But I guess the line
> is not needed as the time_after() check in post_status() should
> handle an initial value of 0 for last_post_time just fine.
In a 32-bit kernel, sizeof(unsigned long) is 4, and the global 32-bit
varilable "jiffies" can overflow in 49.7 days if HZ is defined as 1000;
so in theory there is a tiny chance time_after() can not work as
expected here (i.e. we're loading hv_balloon driver when the
"jiffies" is just about to overflow, which is highly unlikely in practice);
even if that happens, we do not care, since the consequence is
just that the memory pressure reporting is delayed by 1 second. :-)
> > +
> > + dm_device.thread =
> > + kthread_run(dm_thread_func, &dm_device, "hv_balloon");
> > + if (IS_ERR(dm_device.thread)) {
> > + ret = PTR_ERR(dm_device.thread);
> > + goto probe_error;
> > + }
>
> Just an observation: this thread creation now happens at the end of the
> probing process. But that's good, because in the old code, the thread
> was started and could run before the protocol version had been
> negotiated. So I'll assume your change here is intentional.
Yes, this is intentional.
> >
> > return 0;
> >
> > -probe_error2:
> > +probe_error:
> > + vmbus_close(dev->channel);
> > #ifdef CONFIG_MEMORY_HOTPLUG
> > + unregister_memory_notifier(&hv_memory_nb);
>
> Hmmm. Evidently the above cleanup was missing in the
> old code.
Yes.
> > restore_online_page_callback(&hv_online_page);
> > #endif
> > - kthread_stop(dm_device.thread);
> > -
> > -probe_error1:
> > - vmbus_close(dev->channel);
> > return ret;
> > }
> >
> > @@ -1734,11 +1742,11 @@ static int balloon_remove(struct hv_device
> *dev)
> > cancel_work_sync(&dm->balloon_wrk.wrk);
> > cancel_work_sync(&dm->ha_wrk.wrk);
> >
> > - vmbus_close(dev->channel);
> > kthread_stop(dm->thread);
> > + vmbus_close(dev->channel);
>
> Presumably this is an intentional ordering change as well.
> The kthread should be stopped before closing the channel.
Yes. The old code is buggy: after the vmbus_close(), there is
a small window in which the old code can still try to send
messages to the host via a freed ringbuffer, causing panic.
> > #ifdef CONFIG_MEMORY_HOTPLUG
> > - restore_online_page_callback(&hv_online_page);
> > unregister_memory_notifier(&hv_memory_nb);
> > + restore_online_page_callback(&hv_online_page);
>
> And you've changed the ordering of these steps so they are
> the inverse of when they are set up. Also a good cleanup ....
Yes. The change is not really necessary, but let's just do it
in a better manner.
>
> Reviewed-by: Michael Kelley <mikelley@...rosoft.com>
Thaks for the detailed comments!
Thanks,
-- Dexuan
Powered by blists - more mailing lists