Date:   Mon, 27 Mar 2017 21:57:58 -0400
From:   Boris Ostrovsky <boris.ostrovsky@...cle.com>
To:     Dan Streetman <dan.streetman@...onical.com>
Cc:     Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
        Juergen Gross <jgross@...e.com>,
        xen-devel@...ts.xenproject.org, linux-kernel@...r.kernel.org
Subject: Re: maybe revert commit c275a57f5ec3 "xen/balloon: Set balloon's
 initial state to number of existing RAM pages"



On 03/27/2017 03:57 PM, Dan Streetman wrote:
> On Fri, Mar 24, 2017 at 9:33 PM, Boris Ostrovsky
> <boris.ostrovsky@...cle.com> wrote:
>>
>>>
>>> I think we can all agree that the *ideal* situation would be for the
>>> balloon driver not to immediately hotplug memory just to add 11 more
>>> pages, so maybe I just need to figure out why the balloon driver
>>> thinks it needs 11 more pages, and fix that.
>>
>>
>>
>> How does the new memory appear in the guest? Via online_pages()?
>>
>> Or is ballooning triggered from watch_target()?
>
> Yes, it's triggered from watch_target(), which then calls
> online_pages() with the new memory.  I added some debug output (all
> numbers are in hex):
>
> [    0.500080] xen:balloon: Initialising balloon driver
> [    0.503027] xen:balloon: balloon_init: current/target pages 1fff9d
> [    0.504044] xen_balloon: Initialising balloon driver
> [    0.508046] xen_balloon: watch_target: new target 800000 kb
> [    0.508046] xen:balloon: balloon_set_new_target: target 200000
> [    0.524024] xen:balloon: current_credit: target pages 200000 current pages 1fff9d credit 63
> [    0.567055] xen:balloon: balloon_process: current_credit 63
> [    0.568005] xen:balloon: reserve_additional_memory: adding memory resource for 8000 pages
> [    3.694443] online_pages: pfn 210000 nr_pages 8000 type 0
> [    3.701072] xen:balloon: current_credit: target pages 200000 current pages 1fff9d credit 63
> [    3.701074] xen:balloon: balloon_process: current_credit 63
> [    3.701075] xen:balloon: increase_reservation: nr_pages 63
> [    3.701170] xen:balloon: increase_reservation: done, current_pages 1fffa8
> [    3.701172] xen:balloon: current_credit: target pages 200000 current pages 1fffa8 credit 58
> [    3.701173] xen:balloon: balloon_process: current_credit 58
> [    3.701173] xen:balloon: increase_reservation: nr_pages 58
> [    3.701180] xen:balloon: increase_reservation: XENMEM_populate_physmap err 0
> [    5.708085] xen:balloon: current_credit: target pages 200000 current pages 1fffa8 credit 58
> [    5.708088] xen:balloon: balloon_process: current_credit 58
> [    5.708089] xen:balloon: increase_reservation: nr_pages 58
> [    5.708106] xen:balloon: increase_reservation: XENMEM_populate_physmap err 0
> [    9.716065] xen:balloon: current_credit: target pages 200000 current pages 1fffa8 credit 58
> [    9.716068] xen:balloon: balloon_process: current_credit 58
> [    9.716069] xen:balloon: increase_reservation: nr_pages 58
> [    9.716087] xen:balloon: increase_reservation: XENMEM_populate_physmap err 0
>
>
> And that continues forever at the max interval (32 seconds), since
> max_retry_count is unlimited.  So I think I understand things now.
> First, current_pages is set properly based on the e820 map:
>
> $ dmesg|grep -i e820
> [    0.000000] e820: BIOS-provided physical RAM map:
> [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009dfff] usable
> [    0.000000] BIOS-e820: [mem 0x000000000009e000-0x000000000009ffff] reserved
> [    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000efffffff] usable
> [    0.000000] BIOS-e820: [mem 0x00000000fc000000-0x00000000ffffffff] reserved
> [    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000020fffffff] usable
> [    0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
> [    0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
> [    0.000000] e820: last_pfn = 0x210000 max_arch_pfn = 0x400000000
> [    0.000000] e820: last_pfn = 0xf0000 max_arch_pfn = 0x400000000
> [    0.000000] e820: [mem 0xf0000000-0xfbffffff] available for PCI devices
> [    0.528007] e820: reserve RAM buffer [mem 0x0009e000-0x0009ffff]
> ubuntu@...172-31-60-112:~$ printf "%x\n" $[ 0x210000 - 0x100000 + 0xf0000 - 0x100 + 0x9e - 1 ]
> 1fff9d
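> 
> To make that arithmetic explicit, here is the same check as a
> throwaway userspace program (hypothetical, not driver code), with
> each term mapped to its e820 range:
> 
> #include <stdio.h>
> 
> int main(void)
> {
>         unsigned long pages = 0;
> 
>         pages += 0x210000 - 0x100000; /* usable 0x100000000-0x20fffffff */
>         pages += 0xf0000 - 0x100;     /* usable 0x00100000-0xefffffff   */
>         pages += 0x9e;                /* usable 0x00000000-0x0009dfff   */
>         pages -= 1;                   /* page 0, reserved by the kernel */
> 
>         printf("%lx\n", pages);       /* prints 1fff9d */
>         return 0;
> }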
>
>
> Then the xen balloon notices its target has been set to 200000 by the
> hypervisor.  That target does account for the hole at 0xf0000 to
> 0x100000, but it doesn't account for the hole at 0xe0 to 0x100 (0x20
> pages), nor the hole at 0x9e to 0xa0 (2 pages), nor the unlisted hole
> (which the kernel removes) at 0xa0 to 0xe0 (0x40 pages).  That's 0x62
> pages; add the 1-page hole at addr 0 that the kernel always reserves
> and you get 0x63 pages of holes, which aren't accounted for in the
> hypervisor's target.
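> 
> A quick sanity check of that accounting (again hypothetical userspace
> code, not from the driver):
> 
> #include <stdio.h>
> 
> int main(void)
> {
>         unsigned long holes = 0;
> 
>         holes += 0x100 - 0xe0; /* e820-reserved 0xe0000-0xfffff: 0x20 pages */
>         holes += 0xa0 - 0x9e;  /* e820-reserved 0x9e000-0x9ffff: 2 pages    */
>         holes += 0xe0 - 0xa0;  /* unlisted 0xa0000-0xdffff: 0x40 pages      */
>         holes += 1;            /* page 0, always reserved by the kernel     */
> 
>         printf("holes  %lx\n", holes);                   /* 63 */
>         printf("credit %lx\n", 0x200000UL - 0x1fff9dUL); /* 63 */
>         return 0;
> }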
>
> So the balloon driver hotplugs the memory and tries to increase its
> reservation to provide the needed pages to get current_pages up to
> the target.  However, when it calls the hypervisor to populate the
> physmap, the hypervisor only allows 11 (0xb) pages to be populated;
> all calls after that get back 0 from the hypervisor.
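> 
> A toy model of that loop (not the real drivers/xen/balloon.c code,
> just the behavior visible in the log: the hypervisor grants 0xb pages
> once, then 0 forever, so the credit sticks at 0x58):
> 
> #include <stdio.h>
> 
> static long grant = 0xb; /* pages the hypervisor will still populate */
> 
> static long populate_physmap(long nr_pages)
> {
>         long done = nr_pages < grant ? nr_pages : grant;
> 
>         grant -= done;
>         return done; /* 0 once the grant is exhausted */
> }
> 
> int main(void)
> {
>         long current = 0x1fff9d, target = 0x200000;
> 
>         for (int i = 0; i < 4; i++) {
>                 current += populate_physmap(target - current);
>                 printf("current %lx credit %lx\n",
>                        current, target - current);
>         }
>         return 0;
> }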
>
> Do you think the hypervisor's balloon target should account for the
> e820 holes (and for the kernel's added hole at addr 0)?
> Alternatively (or additionally), if the hypervisor doesn't want to
> support ballooning, should it just return an error from the call to
> populate the physmap, rather than allowing those 11 pages?
>
> At this point, it doesn't seem to me like the kernel is doing anything
> wrong, correct?
>


I think there is indeed a disconnect between target memory (provided by
the toolstack) and current memory (i.e., the actual pages available to
the guest).

For example:

[    0.000000] BIOS-e820: [mem 0x000000000009e000-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved

are missed in the target calculation.  hvmloader marks them as RESERVED
(in build_e820_table()), but the target value is not aware of this.

And then the same problem repeats when the kernel removes the
0x000a0000-0x000fffff chunk.
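
As I read it (a hedged sketch, not the literal code), the target is
just a blind unit conversion of the toolstack's memory/target value
from xenstore, with no knowledge of the e820 layout:

#include <stdio.h>

int main(void)
{
        /* "watch_target: new target 800000 kb" from the log above */
        unsigned long new_target_kb = 0x800000;

        /* KiB -> 4 KiB pages; nothing here knows about e820 holes */
        printf("target %lx pages\n", new_target_kb >> 2); /* 200000 */
        return 0;
}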

(BTW, this is all happening before the new 0x8000 pages are onlined,
which takes place much later and looks to me like a separate, unrelated
event.)

-boris
