Date:   Thu, 28 Oct 2021 22:16:59 +0200
From:   Marek Marczykowski-Górecki 
        <marmarek@...isiblethingslab.com>
To:     Juergen Gross <jgross@...e.com>
Cc:     xen-devel@...ts.xenproject.org, linux-kernel@...r.kernel.org,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Stefano Stabellini <sstabellini@...nel.org>,
        stable@...r.kernel.org
Subject: Re: [PATCH] xen/balloon: add late_initcall_sync() for initial
 ballooning done

On Thu, Oct 28, 2021 at 12:59:52PM +0200, Juergen Gross wrote:
> When running as PVH or HVM guest with actual memory < max memory the
> hypervisor is using "populate on demand" in order to allow the guest
> to balloon down from its maximum memory size. For this to work
> correctly the guest must not touch more memory pages than its target
> memory size as otherwise the PoD cache will be exhausted and the guest
> is crashed as a result of that.
> 
> In extreme cases ballooning down might not be finished today before
> the init process is started, which can consume lots of memory.
> 
> In order to avoid random boot crashes in such cases, add a late init
> call to wait for ballooning down having finished for PVH/HVM guests.
> 
> Cc: <stable@...r.kernel.org>
> Reported-by: Marek Marczykowski-Górecki <marmarek@...isiblethingslab.com>
> Signed-off-by: Juergen Gross <jgross@...e.com>

The initial balloon down may fail (state == BP_ECANCELED). In that
case current_credit() never drops to zero and balloon_wait_finish()
waits indefinitely. I think it should report a failure instead of
hanging (and panic? It is similar to OOM before PID 1 starts, so
rather hard to recover from).
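
To illustrate what I mean, here is a rough userspace model of a wait loop
that gives up once the balloon state becomes BP_ECANCELED. It is only a
sketch, not kernel code: struct balloon_model, balloon_model_poll() and
balloon_wait_finish_model() are made-up names, the enum is assumed to
mirror the states in drivers/xen/balloon.c, and a real fix would still
have to decide between returning an error and calling panic().

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to mirror the balloon states in drivers/xen/balloon.c. */
enum bp_state { BP_DONE, BP_WAIT, BP_EAGAIN, BP_ECANCELED };

/* Hypothetical stand-in for the balloon worker's bookkeeping. */
struct balloon_model {
	long credit;		/* pages still to be ballooned out */
	enum bp_state state;	/* result of the last balloon step */
	long step;		/* pages released per poll */
	int cancel_after;	/* polls until failure, negative = never */
};

/* Simulate one pass of the balloon worker. */
static void balloon_model_poll(struct balloon_model *b)
{
	if (b->cancel_after == 0) {
		b->state = BP_ECANCELED;	/* ballooning gave up */
		return;
	}
	if (b->cancel_after > 0)
		b->cancel_after--;

	if (b->credit > b->step) {
		b->credit -= b->step;
		b->state = BP_EAGAIN;
	} else {
		b->credit = 0;
		b->state = BP_DONE;
	}
}

/*
 * Wait until the credit reaches zero, but stop as soon as ballooning
 * has been cancelled. Returns 0 on success, -1 on failure.
 */
static int balloon_wait_finish_model(struct balloon_model *b)
{
	assert(b != NULL && b->step > 0);

	while (b->credit > 0) {
		if (b->state == BP_ECANCELED)
			return -1;	/* report instead of spinning forever */
		balloon_model_poll(b);
	}
	return 0;
}
```

The only point is the extra state check inside the loop: without it, a
cancelled balloon with non-zero credit keeps the loop running forever.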

Anyway, it does fix the boot crashes.

> ---
>  drivers/xen/balloon.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index 3a50f097ed3e..d19b851c3d3b 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -765,3 +765,23 @@ static int __init balloon_init(void)
>  	return 0;
>  }
>  subsys_initcall(balloon_init);
> +
> +static int __init balloon_wait_finish(void)
> +{
> +	if (!xen_domain())
> +		return -ENODEV;
> +
> +	/* PV guests don't need to wait. */
> +	if (xen_pv_domain() || !current_credit())
> +		return 0;
> +
> +	pr_info("Waiting for initial ballooning down having finished.\n");
> +
> +	while (current_credit())
> +		schedule_timeout_interruptible(HZ / 10);
> +
> +	pr_info("Initial ballooning down finished.\n");
> +
> +	return 0;
> +}
> +late_initcall_sync(balloon_wait_finish);
> -- 
> 2.26.2
> 

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

