Date:   Thu, 4 Nov 2021 17:21:53 +0100
From:   Juergen Gross <jgross@...e.com>
To:     Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        xen-devel@...ts.xenproject.org, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org
Cc:     Jonathan Corbet <corbet@....net>,
        Stefano Stabellini <sstabellini@...nel.org>,
        stable@...r.kernel.org,
        Marek Marczykowski-Górecki 
        <marmarek@...isiblethingslab.com>
Subject: Re: [PATCH v4] xen/balloon: add late_initcall_sync() for initial
 ballooning done

On 04.11.21 16:55, Boris Ostrovsky wrote:
> 
> On 11/3/21 9:55 PM, Boris Ostrovsky wrote:
>>
>> On 11/2/21 5:19 AM, Juergen Gross wrote:
>>> When running as PVH or HVM guest with actual memory < max memory the
>>> hypervisor is using "populate on demand" in order to allow the guest
>>> to balloon down from its maximum memory size. For this to work
>>> correctly the guest must not touch more memory pages than its target
>>> memory size, as otherwise the PoD cache will be exhausted and the guest
>>> will crash as a result.
>>>
>>> In extreme cases ballooning down might currently not be finished before
>>> the init process is started, which can consume lots of memory.
>>>
>>> In order to avoid random boot crashes in such cases, add a late init
>>> call to wait for ballooning down having finished for PVH/HVM guests.
>>>
>>> Warn on console if initial ballooning fails, panic() after stalling
>>> for more than 3 minutes per default. Add a module parameter for
>>> changing this timeout.
>>>
>>> Cc: <stable@...r.kernel.org>
>>> Reported-by: Marek Marczykowski-Górecki <marmarek@...isiblethingslab.com>
>>> Signed-off-by: Juergen Gross <jgross@...e.com>
>>
>>
>>
>> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@...cle.com>
> 
> 
> This appears to have noticeable effect on boot time (and boot experience 
> in general).
> 
> 
> I have
> 
> 
>    memory=1024
>    maxmem=8192
> 
> 
> And my boot time (on an admittedly slow box) went from 33 to 45 seconds. 
> And boot pauses in the middle while it is waiting for ballooning to 
> complete.
> 
> 
> [    5.062714] xen:balloon: Waiting for initial ballooning down having finished.
> [    5.449696] random: crng init done
> [   34.613050] xen:balloon: Initial ballooning down finished.

This shows that, before the patch, it was only by chance that the PoD
cache wasn't exhausted.
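
For illustration only, the wait described in the quoted commit message
can be modelled roughly as below. This is a simulation sketch in Python,
not the kernel code from the patch: the function name, the
BalloonStalled exception (standing in for panic()), the one-second poll
interval, and the rule that any progress resets the stall timer are all
assumptions. Only the 3-minute default and the memory=1024/maxmem=8192
values come from this thread.

```python
class BalloonStalled(RuntimeError):
    """Stands in for the kernel's panic() in this simulation (assumption)."""


def wait_for_initial_ballooning(sizes, target, stall_timeout=180, poll_interval=1):
    """Consume one memory-size reading per poll until the target is reached.

    sizes: iterable of simulated current memory sizes, one per poll.
    target: size the guest must balloon down to.
    stall_timeout: seconds without progress before giving up (180 s = 3 min).
    Returns the simulated number of seconds spent waiting.
    """
    waited = 0
    stalled = 0
    previous = None
    for current in sizes:
        if current <= target:
            return waited  # ballooning down has finished
        if previous is not None and current >= previous:
            # No progress since the last poll: count it as stall time.
            stalled += poll_interval
            if stalled > stall_timeout:
                raise BalloonStalled(
                    f"no progress for more than {stall_timeout}s at size {current}"
                )
        else:
            stalled = 0  # progress was made, so restart the stall timer
        previous = current
        waited += poll_interval
    raise ValueError("ran out of readings before reaching the target")


if __name__ == "__main__":
    # Hypothetical readings shrinking from maxmem=8192 to memory=1024.
    print(wait_for_initial_ballooning([8192, 4096, 2048, 1024], target=1024))
```

In this model a slow but steadily shrinking guest never hits the
timeout; only a guest that makes no progress at all for longer than
stall_timeout does.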

> So at least I think we should consider bumping log level down from info.

Which level would you prefer? warn?

And if so, would you mind doing this while committing (I have one day
off tomorrow)?


Juergen

