lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 12 Dec 2019 09:10:26 -0500
From:   Nicholas Tsirakis <niko.tsirakis@...il.com>
To:     Jürgen Groß <jgross@...e.com>,
        boris.ostrovsky@...cle.com
Cc:     xen-devel <xen-devel@...ts.xenproject.org>,
        linux-kernel@...r.kernel.org
Subject: Re: [BUG] Xen-ballooned memory never returned to domain after partial-free

> And I think this is the problem. We want here:
>
>     balloon_stats.target_pages = balloon_stats.current_pages +
>                                  balloon_stats.target_unpopulated;

Ahh I knew I was missing something. Tested the patch, works great! "Reported by"
is fine with me.

Do you happen to know the answer to my second question? It's not as important,
but it does confuse me as I wouldn't expect the total memory to be
balloon-able at
all with the hotplugging configs disabled.

--Niko

On Thu, Dec 12, 2019 at 2:18 AM Jürgen Groß <jgross@...e.com> wrote:
>
> On 11.12.19 23:08, Nicholas Tsirakis wrote:
> > Hello,
> >
> > The issue I'm seeing is that pages of previously-xenballooned memory are getting
> > trapped in the balloon on free, specifically when they are free'd in batches
> > (i.e. not all at once). The first batch is restored to the domain properly, but
> > subsequent frees are not.
> >
> > Truthfully I'm not sure if this is a bug or not, but the behavior I'm seeing
> > doesn't seem to make sense. Note that this "bug" is in the balloon driver, but
> > the behavior is seen when using the gnttab API, which utilizes the balloon in
> > the background.
> >
> > ------------------------------------------------------------------------------
> >
> > This issue is better illustrated as an example, seen below. Note that the file
> > in question is drivers/xen/balloon.c:
> >
> > Kernel version: 4.19.*, code seems consistent on master as well
> > Relevant configs:
> >      - CONFIG_MEMORY_HOTPLUG not set
> >      - CONFIG_XEN_BALLOON_MEMORY_HOTPLUG not set
> >
> > * current_pages = # of pages assigned to domain
> > * target_pages = # of pages we want assigned to domain
> > * credit = target - current
> >
> > Start with current_pages/target_pages = 20 pages
> >
> > 1. alloc 5 pages with gnttab_alloc_pages(). current_pages = 15, credit = 5.
> > 2. alloc 3 pages with gnttab_alloc_pages(). current_pages = 12, credit = 8.
> > 3. some time later, free the last 3 pages with gnttab_free_pages().
> > 4. 3 pages go back to balloon and balloon worker is scheduled since credit > 0.
> >      * Relevant part of balloon worker shown below:
> >
> >      do {
> >          ...
> >
> >          credit = current_credit();
> >
> >          if (credit > 0) {
> >              if (balloon_is_inflated())
> >                  state = increase_reservation(credit);
> >              else
> >                  state = reserve_additional_memory();
> >          }
> >
> >          ...
> >
> >      } while (credit && state == BP_DONE);
> >
> > 5. credit > 0 and the balloon contains 3 pages, so run increase_reservation. 3
> >     pages are restored to domain, correctly. current_pages = 15, credit = 5.
> > 6. at this point credit is still > 0, so we loop again.
> > 7. this time, the balloon has 0 pages, so we call reserve_additional_memory,
> >     seen below. note that CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is disabled, so this
> >     funciton is very sparse.
> >
> >      static enum bp_state reserve_additional_memory(void)
> >      {
> >          balloon_stats.target_pages = balloon_stats.current_pages;
> >          return BP_ECANCELED;
> >      }
> >
> > 8. now target = current = 15, which drops our credit down to 0.
>
> And I think this is the problem. We want here:
>
>      balloon_stats.target_pages = balloon_stats.current_pages +
>                                   balloon_stats.target_unpopulated;
>
> This should fix it. Thanks for the detailed analysis!
>
> Does the attached patch work for you?
>
> And are you fine with the "Reported-by:" added?
>
>
> Juergen

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ