Message-ID: <877dzj3pyi.fsf@vitty.brq.redhat.com>
Date: Tue, 17 Mar 2020 17:29:09 +0100
From: Vitaly Kuznetsov <vkuznets@...hat.com>
To: David Hildenbrand <david@...hat.com>, linux-kernel@...r.kernel.org
Cc: linux-mm@...ck.org, linuxppc-dev@...ts.ozlabs.org,
linux-hyperv@...r.kernel.org, David Hildenbrand <david@...hat.com>,
"K. Y. Srinivasan" <kys@...rosoft.com>,
Haiyang Zhang <haiyangz@...rosoft.com>,
Stephen Hemminger <sthemmin@...rosoft.com>,
Wei Liu <wei.liu@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Michal Hocko <mhocko@...nel.org>,
Oscar Salvador <osalvador@...e.de>,
"Rafael J. Wysocki" <rafael@...nel.org>,
Baoquan He <bhe@...hat.com>,
Wei Yang <richard.weiyang@...il.com>
Subject: Re: [PATCH v2 5/8] hv_balloon: don't check for memhp_auto_online manually
David Hildenbrand <david@...hat.com> writes:
> We get the MEM_ONLINE notifier call if memory is added right from the
> kernel via add_memory() or later from user space.
>
> Let's get rid of the "ha_waiting" flag - the wait event has an inbuilt
> mechanism (->done) for that. Initialize the wait event only once and
> reinitialize before adding memory. Unconditionally call complete() and
> wait_for_completion_timeout().
>
> If there are no waiters, complete() will only increment ->done - which
> will be reset by reinit_completion(). If complete() has already been
> called, wait_for_completion_timeout() will not wait.
>
> There is still the chance for a small race between concurrent
> reinit_completion() and complete(). If complete() wins, we would not
> wait - which is tolerable (and the race exists in current code as
> well).
How can we see concurrent reinit_completion() and complete()? Obviously,
we are not onlining new memory in the kernel and hv_mem_hot_add() calls
are serialized: we wait up to 5*HZ for the added block to come online
before proceeding to the next one. Or do you mean that we actually hit
this 5*HZ timeout, proceed to the next block, and immediately after
reinit_completion() see complete() for the previously added block?
This is tolerable indeed, we're making forward progress (and this is all
'best effort' anyway).
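
The late-complete case above can be sketched with a toy, single-threaded
model of the ->done counter. This is only an illustration, not the kernel
implementation: the toy_* names and both scenario functions are
hypothetical, and the model ignores locking, sleeping and the timeout.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a completion: only the ->done counter, no waitqueue. */
struct toy_completion {
	unsigned int done;
};

static void toy_init(struct toy_completion *c)     { c->done = 0; }
static void toy_reinit(struct toy_completion *c)   { c->done = 0; }
static void toy_complete(struct toy_completion *c) { c->done++; }

/*
 * Models the start of a wait: returns true (and consumes one count) if
 * the wait would return immediately, false if it would have to sleep.
 */
static bool toy_try_wait(struct toy_completion *c)
{
	if (!c->done)
		return false;
	c->done--;
	return true;
}

/*
 * Assumed scenario: the wait for block A timed out, we reinitialize for
 * block B, and only then the MEM_ONLINE notification for A arrives.
 */
bool late_complete_skips_wait(void)
{
	struct toy_completion c;

	toy_init(&c);
	toy_reinit(&c);          /* about to add block B */
	toy_complete(&c);        /* late notification for block A */
	return toy_try_wait(&c); /* the wait for B does not sleep */
}

/* The two properties the patch description relies on. */
void toy_demo(void)
{
	struct toy_completion c;

	toy_init(&c);
	toy_complete(&c);          /* no waiter: only bumps done */
	toy_reinit(&c);            /* the stale count is discarded */
	assert(!toy_try_wait(&c)); /* a real wait would sleep here */

	toy_complete(&c);          /* complete() before the wait */
	assert(toy_try_wait(&c));  /* the wait returns immediately */
}
```

In this model the only effect of the late complete() is that the wait for
the next block is skipped once, which matches the "tolerable" outcome.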
>
> Note: We only wait for "some" memory to get onlined, which seems to be
> good enough for now.
>
> Cc: "K. Y. Srinivasan" <kys@...rosoft.com>
> Cc: Haiyang Zhang <haiyangz@...rosoft.com>
> Cc: Stephen Hemminger <sthemmin@...rosoft.com>
> Cc: Wei Liu <wei.liu@...nel.org>
> Cc: Andrew Morton <akpm@...ux-foundation.org>
> Cc: Michal Hocko <mhocko@...nel.org>
> Cc: Oscar Salvador <osalvador@...e.de>
> Cc: "Rafael J. Wysocki" <rafael@...nel.org>
> Cc: Baoquan He <bhe@...hat.com>
> Cc: Wei Yang <richard.weiyang@...il.com>
> Cc: Vitaly Kuznetsov <vkuznets@...hat.com>
> Cc: linux-hyperv@...r.kernel.org
> Signed-off-by: David Hildenbrand <david@...hat.com>
> ---
> drivers/hv/hv_balloon.c | 25 ++++++++++---------------
> 1 file changed, 10 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
> index a02ce43d778d..af5e09f08130 100644
> --- a/drivers/hv/hv_balloon.c
> +++ b/drivers/hv/hv_balloon.c
> @@ -533,7 +533,6 @@ struct hv_dynmem_device {
> * State to synchronize hot-add.
> */
> struct completion ol_waitevent;
> - bool ha_waiting;
> /*
> * This thread handles hot-add
> * requests from the host as well as notifying
> @@ -634,10 +633,7 @@ static int hv_memory_notifier(struct notifier_block *nb, unsigned long val,
> switch (val) {
> case MEM_ONLINE:
> case MEM_CANCEL_ONLINE:
> - if (dm_device.ha_waiting) {
> - dm_device.ha_waiting = false;
> - complete(&dm_device.ol_waitevent);
> - }
> + complete(&dm_device.ol_waitevent);
> break;
>
> case MEM_OFFLINE:
> @@ -726,8 +722,7 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size,
> has->covered_end_pfn += processed_pfn;
> spin_unlock_irqrestore(&dm_device.ha_lock, flags);
>
> - init_completion(&dm_device.ol_waitevent);
> - dm_device.ha_waiting = !memhp_auto_online;
> + reinit_completion(&dm_device.ol_waitevent);
>
> nid = memory_add_physaddr_to_nid(PFN_PHYS(start_pfn));
> ret = add_memory(nid, PFN_PHYS((start_pfn)),
> @@ -753,15 +748,14 @@ static void hv_mem_hot_add(unsigned long start, unsigned long size,
> }
>
> /*
> - * Wait for the memory block to be onlined when memory onlining
> - * is done outside of kernel (memhp_auto_online). Since the hot
> - * add has succeeded, it is ok to proceed even if the pages in
> - * the hot added region have not been "onlined" within the
> - * allowed time.
> + * Wait for memory to get onlined. If the kernel onlined the
> + * memory when adding it, this will return directly. Otherwise,
> + * it will wait for user space to online the memory. This helps
> + * to avoid adding memory faster than it is getting onlined. As
> + * adding succeeded, it is ok to proceed even if the memory was
> + * not onlined in time.
> */
> - if (dm_device.ha_waiting)
> - wait_for_completion_timeout(&dm_device.ol_waitevent,
> - 5*HZ);
> + wait_for_completion_timeout(&dm_device.ol_waitevent, 5 * HZ);
> post_status(&dm_device);
> }
> }
> @@ -1707,6 +1701,7 @@ static int balloon_probe(struct hv_device *dev,
> #ifdef CONFIG_MEMORY_HOTPLUG
> set_online_page_callback(&hv_online_page);
> register_memory_notifier(&hv_memory_nb);
> + init_completion(&dm_device.ol_waitevent);
> #endif
>
> hv_set_drvdata(dev, &dm_device);
Reviewed-by: Vitaly Kuznetsov <vkuznets@...hat.com>
--
Vitaly