Message-ID: <c43723f2acdf257309dca55eac900dc71bca31c3.camel@linux.intel.com>
Date: Fri, 02 Aug 2019 16:15:23 -0700
From: Alexander Duyck <alexander.h.duyck@...ux.intel.com>
To: Nitesh Narayan Lal <nitesh@...hat.com>,
Alexander Duyck <alexander.duyck@...il.com>,
kvm@...r.kernel.org, david@...hat.com, mst@...hat.com,
dave.hansen@...el.com, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, akpm@...ux-foundation.org
Cc: yang.zhang.wz@...il.com, pagupta@...hat.com, riel@...riel.com,
konrad.wilk@...cle.com, willy@...radead.org,
lcapitulino@...hat.com, wei.w.wang@...el.com, aarcange@...hat.com,
pbonzini@...hat.com, dan.j.williams@...el.com
Subject: Re: [PATCH v3 0/6] mm / virtio: Provide support for unused page
reporting
On Fri, 2019-08-02 at 10:28 -0700, Alexander Duyck wrote:
> On Fri, 2019-08-02 at 12:19 -0400, Nitesh Narayan Lal wrote:
> > On 8/2/19 11:13 AM, Alexander Duyck wrote:
> > > On Fri, 2019-08-02 at 10:41 -0400, Nitesh Narayan Lal wrote:
> > > > On 8/1/19 6:24 PM, Alexander Duyck wrote:
> > > > >
<snip>
> > > > > One side effect of these patches is that the guest becomes much more
> > > > > resilient in terms of NUMA locality. With the pages being freed and then
> > > > > reallocated when used it allows for the pages to be much closer to the
> > > > > active thread, and as a result there can be situations where this patch
> > > > > set will out-perform the stock kernel when the guest memory is not local
> > > > > to the guest vCPUs.
> > > > Was this the reason you were seeing better results for page_fault1
> > > > earlier?
> > > Yes I am thinking so. What I have found is that in the case where the
> > > patches are not applied on the guest it takes a few runs for the numbers
> > > to stabilize. What I think was going on is that I was running memhog to
> > > initially fill the guest and that was placing all the pages on one node or
> > > the other and as such was causing additional variability as the pages were
> > > slowly being migrated over to the other node to rebalance the workload.
> > > One way I tested it was by trying the unpatched case with a direct-
> > > assigned device since that forces it to pin the memory. In that case I was
> > > getting bad results consistently as all the memory was forced to come from
> > > one node during the pre-allocation process.
> > >
> >
> > I have also seen that the page_fault1 values take some time to stabilize on
> > an unmodified kernel.
> > What I am wondering is whether doing the following on a single-NUMA-node
> > guest would give a better idea:
> >
> > 1. Pin the guest to a single NUMA node.
> > 2. Run memhog so that it touches all the guest memory.
> > 3. Run will-it-scale/page_fault1.
> >
> > Compare/observe the values for the last core (this assumes the values for
> > the other cores don't differ drastically).
>
> I'll rerun the test with qemu affinitized to one specific socket. It will
> cut the core/thread count down to 8/16 on my test system. Also I will try
> with THP and page shuffling enabled.
Okay, so here are the results with 8 cores/16 threads all affinitized to one
socket, running page_fault1 with THP and page shuffling enabled:

With page reporting disabled in the hypervisor there wasn't much
difference. I saw a range of 0.69% to -1.35% versus baseline, with an
average improvement of 0.16%. So effectively no change.

With page reporting enabled I saw a range of -2.10% to -4.50% versus
baseline, with an average of -3.05%, i.e. a regression of about 3%. This is
much closer to what I would expect for this patch set, as the page faulting,
double zeroing (once in host, and once in guest), and hinting process
itself should have some overhead.
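
For anyone who wants to reproduce this kind of comparison, here is a minimal
sketch of the arithmetic, assuming each figure is the percentage change in
page_fault1 throughput relative to a baseline measurement of the same
configuration. The function names and the sample numbers are hypothetical
placeholders chosen for illustration, not my measured data.

```python
from statistics import mean


def percent_change(baseline, patched):
    """Per-sample change relative to baseline, in percent.

    Negative values mean the patched kernel was slower (a regression).
    """
    return [(p - b) / b * 100.0 for b, p in zip(baseline, patched)]


def summarize(deltas):
    """Best case, worst case, and average of the percentage deltas."""
    return {"best": max(deltas), "worst": min(deltas), "average": mean(deltas)}


# Hypothetical throughput samples (operations/sec), NOT real results.
baseline = [1000.0, 1000.0, 1000.0]
patched = [979.0, 955.0, 970.0]

summary = summarize(percent_change(baseline, patched))
print("range: {best:.2f}% to {worst:.2f}%, average: {average:.2f}%".format(**summary))
```

Averaging the per-sample deltas, rather than comparing two overall averages,
keeps each sample paired with its own baseline, which is one reasonable choice
when results vary with thread count.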