Message-ID: <YXsFO2TMRiJTQM2q@mail-itl>
Date: Thu, 28 Oct 2021 22:16:59 +0200
From: Marek Marczykowski-Górecki <marmarek@...isiblethingslab.com>
To: Juergen Gross <jgross@...e.com>
Cc: xen-devel@...ts.xenproject.org, linux-kernel@...r.kernel.org,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Stefano Stabellini <sstabellini@...nel.org>,
stable@...r.kernel.org
Subject: Re: [PATCH] xen/balloon: add late_initcall_sync() for initial ballooning done
On Thu, Oct 28, 2021 at 12:59:52PM +0200, Juergen Gross wrote:
> When running as PVH or HVM guest with actual memory < max memory the
> hypervisor is using "populate on demand" in order to allow the guest
> to balloon down from its maximum memory size. For this to work
> correctly the guest must not touch more memory pages than its target
> memory size as otherwise the PoD cache will be exhausted and the guest
> is crashed as a result of that.
>
> In extreme cases ballooning down might not be finished today before
> the init process is started, which can consume lots of memory.
>
> In order to avoid random boot crashes in such cases, add a late init
> call to wait for ballooning down having finished for PVH/HVM guests.
>
> Cc: <stable@...r.kernel.org>
> Reported-by: Marek Marczykowski-Górecki <marmarek@...isiblethingslab.com>
> Signed-off-by: Juergen Gross <jgross@...e.com>
The initial balloon down can fail (state == BP_ECANCELED). In that
case this loop never terminates, because current_credit() stays
non-zero. I think it should report the failure instead of hanging (and
perhaps panic? It is similar to OOM before PID 1 starts, so rather hard
to recover from).
Anyway, it does fix the boot crashes.
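To illustrate the kind of exit condition I mean, here is a rough userspace
sketch, not a patch against balloon.c. Everything in it is a hypothetical
stand-in: struct balloon_sim, sim_poll(), balloon_wait_finish_sim(), the
wait_result enum and the max_polls limit are invented for this example,
and the enum only mirrors the BP_ECANCELED state mentioned above. The
point is only that the wait loop checks for the cancelled state (and,
optionally, a poll limit) instead of spinning on the credit alone.

```c
#include <stdio.h>

/* Assumed mirror of the balloon worker states; only BP_ECANCELED matters here. */
enum bp_state { BP_DONE, BP_WAIT, BP_EAGAIN, BP_ECANCELED };

/* Hypothetical simulation of the balloon worker, for illustration only. */
struct balloon_sim {
    long credit;          /* pages still to release; 0 means finished */
    enum bp_state state;  /* last state reported by the simulated worker */
    long step;            /* pages released per poll */
    int cancel_after;     /* polls until the worker gives up; negative = never */
};

enum wait_result { WAIT_OK, WAIT_CANCELED, WAIT_TIMEOUT };

/* One polling interval: the simulated worker either progresses or cancels. */
static void sim_poll(struct balloon_sim *b)
{
    if (b->cancel_after == 0) {
        b->state = BP_ECANCELED;
        return;
    }
    if (b->cancel_after > 0)
        b->cancel_after--;
    if (b->credit > b->step)
        b->credit -= b->step;
    else
        b->credit = 0;
}

/*
 * Wait for the credit to reach zero, but stop when the worker has
 * cancelled or when max_polls intervals have passed without finishing.
 */
static enum wait_result balloon_wait_finish_sim(struct balloon_sim *b,
                                                int max_polls)
{
    int polls;

    for (polls = 0; b->credit != 0; polls++) {
        if (b->state == BP_ECANCELED)
            return WAIT_CANCELED;
        if (polls >= max_polls)
            return WAIT_TIMEOUT;
        /* Stands in for schedule_timeout_interruptible(HZ / 10). */
        sim_poll(b);
    }
    return WAIT_OK;
}
```

In the real initcall, the WAIT_CANCELED and WAIT_TIMEOUT cases would be
where a failure gets reported; whether that should be a pr_err() or a
panic is the open question above.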
> ---
> drivers/xen/balloon.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 3a50f097ed3e..d19b851c3d3b 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -765,3 +765,23 @@ static int __init balloon_init(void)
> return 0;
> }
> subsys_initcall(balloon_init);
> +
> +static int __init balloon_wait_finish(void)
> +{
> + if (!xen_domain())
> + return -ENODEV;
> +
> + /* PV guests don't need to wait. */
> + if (xen_pv_domain() || !current_credit())
> + return 0;
> +
> + pr_info("Waiting for initial ballooning down having finished.\n");
> +
> + while (current_credit())
> + schedule_timeout_interruptible(HZ / 10);
> +
> + pr_info("Initial ballooning down finished.\n");
> +
> + return 0;
> +}
> +late_initcall_sync(balloon_wait_finish);
> --
> 2.26.2
>
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab