Message-ID: <86b114ef-41ea-04b6-327c-4a036f784fad@redhat.com>
Date: Fri, 6 Aug 2021 09:10:28 +0200
From: David Hildenbrand <david@...hat.com>
To: Claudio Imbrenda <imbrenda@...ux.ibm.com>, kvm@...r.kernel.org
Cc: cohuck@...hat.com, borntraeger@...ibm.com, frankja@...ux.ibm.com,
thuth@...hat.com, pasic@...ux.ibm.com, linux-s390@...r.kernel.org,
linux-kernel@...r.kernel.org, Ulrich.Weigand@...ibm.com,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Michal Hocko <mhocko@...nel.org>
Subject: Re: [PATCH v3 00/14] KVM: s390: pv: implement lazy destroy
On 04.08.21 17:40, Claudio Imbrenda wrote:
> Previously, when a protected VM was rebooted or shut down, its memory
> was made unprotected, and then the protected VM itself was destroyed.
> Looping over the whole address space can take some time, considering
> the overhead of the various Ultravisor Calls (UVCs). This means that a
> reboot or a shutdown could take a potentially long time, depending on
> the amount of memory used.
>
> This patch series implements a deferred destroy mechanism for
> protected guests. When a protected guest is destroyed, its memory is
> cleaned up in the background, allowing the guest to restart or
> terminate significantly faster than before.
>
> There are 2 possibilities when a protected VM is torn down:
> * it still has an address space associated (reboot case)
> * it does not have an address space anymore (shutdown case)
>
> For the reboot case, the reference count of the mm is increased, and
> then a background thread is started to clean up. Once the thread has
> gone through the whole address space, the protected VM is actually
> destroyed.
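To make that concrete, the flow is roughly the following (all helper
names here are invented for illustration; this is not the actual patch
code):

#include <linux/kthread.h>
#include <linux/sched/mm.h>
#include <linux/slab.h>

struct pv_destroy_work {
        struct mm_struct *mm;
        struct kvm *kvm;
};

static int pv_async_destroy(void *data)
{
        struct pv_destroy_work *w = data;

        /* Walk the whole address space, one destroy-page UVC at a time. */
        pv_destroy_range(w->mm, 0, TASK_SIZE);
        /* Only now tear down the protected VM itself. */
        pv_deinit_vm(w->kvm);
        mmput(w->mm);           /* drop the reference taken below */
        kfree(w);
        return 0;
}

static void pv_schedule_async_destroy(struct kvm *kvm, struct mm_struct *mm)
{
        struct pv_destroy_work *w = kmalloc(sizeof(*w), GFP_KERNEL);

        if (!w)
                return;         /* fall back to synchronous teardown */
        w->mm = mm;
        w->kvm = kvm;
        mmget(mm);              /* keep the address space alive */
        kthread_run(pv_async_destroy, w, "pv_destroy");
}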
That reboot-case flow doesn't sound too hacky to me, and actually
sounds like a good idea: doing what the guest would do either way, just
speeding it up asynchronously, but ...
>
> For the shutdown case, a list of pages to be destroyed is built up as
> the mm is torn down: instead of just being unmapped, the pages are
> also set aside. Later, when KVM cleans up the VM, a thread is started
> to clean up the pages on the list.
... this ...
>
> This means that the same address space can have memory belonging to
> more than one protected guest, although only one will be running; the
> others will in fact not even have any CPUs.
... this ...
>
> When a guest is destroyed, its memory still counts towards its memory
> control group until it is actually freed (I tested this
> experimentally).
>
> When the system runs out of memory, if a guest has terminated and its
> memory is being cleaned asynchronously, the OOM killer will wait a
> little and then check whether memory has been freed. This has the
> practical effect of slowing down memory allocations when the system is
> out of memory, giving the cleanup thread time to clean up and free
> memory and avoiding an actual OOM situation.
... and this sound like the kind of arch MM hacks that will bite us in
the long run. Of course, I might be wrong, but already doing excessive
GFP_ATOMIC allocations or messing with the OOM killer that way for a
pure (shutdown) optimization is an alarm signal.
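For reference, the shutdown-case "set aside and clean up later" idea
above amounts to something like the following (invented names and
simplified locking; not the actual patch code):

#include <linux/list.h>
#include <linux/mm.h>
#include <linux/spinlock.h>

static LIST_HEAD(pv_leftover_pages);
static DEFINE_SPINLOCK(pv_leftover_lock);

/* Called from the mm teardown path instead of freeing the page. */
static void pv_set_aside_page(struct page *page)
{
        spin_lock(&pv_leftover_lock);
        list_add_tail(&page->lru, &pv_leftover_pages); /* lru is unused here */
        spin_unlock(&pv_leftover_lock);
}

/* Run from a kthread once KVM cleans up the VM. */
static void pv_cleanup_leftovers(void)
{
        struct page *page;

        for (;;) {
                spin_lock(&pv_leftover_lock);
                page = list_first_entry_or_null(&pv_leftover_pages,
                                                struct page, lru);
                if (page)
                        list_del(&page->lru);
                spin_unlock(&pv_leftover_lock);
                if (!page)
                        break;
                pv_destroy_page_uvc(page);      /* make it unprotected again */
                __free_page(page);              /* back to the buddy allocator */
        }
}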
You should at least CC linux-mm. I'll do that right now and also CC
Michal. He might have time to take a quick look at patches #11 and #13.
https://lkml.kernel.org/r/20210804154046.88552-12-imbrenda@linux.ibm.com
https://lkml.kernel.org/r/20210804154046.88552-14-imbrenda@linux.ibm.com
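For context, the kernel already has a hook for this kind of OOM
interaction, the OOM notifier chain (register_oom_notifier()). Whether
or not patch #13 is actually built on it, a hook of that shape would
look roughly like this (pv_cleanup_pending() and pv_reclaimed_pages()
are invented helpers):

#include <linux/delay.h>
#include <linux/notifier.h>
#include <linux/oom.h>

static int pv_oom_notify(struct notifier_block *nb, unsigned long unused,
                         void *parm)
{
        unsigned long *freed = parm;

        if (pv_cleanup_pending()) {
                msleep(100);                    /* let the cleanup thread run */
                *freed += pv_reclaimed_pages(); /* report any progress made */
        }
        return NOTIFY_OK;
}

static struct notifier_block pv_oom_nb = {
        .notifier_call = pv_oom_notify,
};

/* registered once at init time: register_oom_notifier(&pv_oom_nb); */

If the notifier reports progress, the OOM killer retries the allocation
instead of killing a task, which matches the behavior described above.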
IMHO, we should proceed with patches 1-10, as they solve a really
important problem ("slow reboots") in a nice way, whereas patch 11
handles a case that can be worked around comparatively easily by
management tools -- my 2 cents.
--
Thanks,
David / dhildenb