Message-ID: <47B6AB14.5090408@qumranet.com>
Date: Sat, 16 Feb 2008 11:21:24 +0200
From: Avi Kivity <avi@...ranet.com>
To: Andrew Morton <akpm@...ux-foundation.org>
CC: Christoph Lameter <clameter@....com>,
Andrea Arcangeli <andrea@...ranet.com>,
Robin Holt <holt@....com>, Izik Eidus <izike@...ranet.com>,
kvm-devel@...ts.sourceforge.net,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
general@...ts.openfabrics.org,
Steve Wise <swise@...ngridcomputing.com>,
Roland Dreier <rdreier@...co.com>,
Kanoj Sarcar <kanojsarcar@...oo.com>, steiner@....com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
daniel.blueman@...drics.com
Subject: Re: [patch 1/6] mmu_notifier: Core code
Andrew Morton wrote:
>> Very. kvm pins pages that are referenced by the guest;
>>
>
> hm. Why does it do that?
>
>
It was deemed best not to allow the guest to write to a page that has
been swapped out and assigned to an unrelated host process.
One way to view the kvm shadow page tables is as hardware dma
descriptors: kvm pins pages for the same reason that drivers pin pages
that are being dma'ed. That's also why mmu notifiers are useful for
such a wide range of dma-capable hardware.
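To make the analogy concrete, here's a rough sketch of the pinning step
(illustrative only, not the actual kvm code; the get_user_pages()
arguments shown are the old-style ones and vary by kernel version):

#include <linux/mm.h>
#include <linux/sched.h>

/*
 * Illustrative sketch only -- not the real kvm code.  Pin the host
 * page backing a guest frame so it cannot be swapped out while a
 * shadow pte (or a hardware dma descriptor) still points at it.
 */
static long pin_guest_page(unsigned long hva, struct page **page)
{
	int ret;

	down_read(&current->mm->mmap_sem);
	/* old-style signature; newer kernels take different arguments */
	ret = get_user_pages(current, current->mm, hva, 1,
			     1 /* write */, 0 /* force */, page, NULL);
	up_read(&current->mm->mmap_sem);

	if (ret != 1)
		return -EFAULT;

	/* install page_to_pfn(*page) into the shadow pte... */
	return 0;
}

The pin stays until kvm is told the mapping is going away, which is
exactly what the notifier provides.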
>> a 64-bit guest
>> will easily pin its entire memory with the kernel map.
>>
>
>
>> So this is
>> critical for guest swapping to actually work.
>>
>
> Curious. If KVM can release guest pages at the request of this notifier so
> that they can be swapped out, why can't it release them by default, and
> allow swapping to proceed?
>
>
If kvm releases a page, it must also zap any shadow ptes pointing at the
page and flush the tlb. If you do that for all of memory, the guest can't
reference any of it.
Releasing a page has costs, both at the time of the release and when the
guest eventually refers to the page again.
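For concreteness, the kvm side of that looks roughly like the sketch
below (illustrative only; the callback name and helpers such as
kvm_unmap_hva() are assumptions based on the general shape of the posted
patch, not its exact interface):

#include <linux/mmu_notifier.h>

/*
 * Illustrative sketch only.  When the core mm is about to unmap a
 * host page, kvm zaps the shadow ptes pointing at it, flushes the
 * guest tlb and drops its pin, so the page can be swapped out.
 */
static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
					     struct mm_struct *mm,
					     unsigned long address)
{
	struct kvm *kvm = container_of(mn, struct kvm, mmu_notifier);

	spin_lock(&kvm->mmu_lock);
	kvm_unmap_hva(kvm, address);	/* zap sptes, unpin the page */
	kvm_flush_remote_tlbs(kvm);	/* make the guest forget it */
	spin_unlock(&kvm->mmu_lock);
}

static const struct mmu_notifier_ops kvm_mmu_notifier_ops = {
	.invalidate_page = kvm_mmu_notifier_invalidate_page,
};

Doing that eagerly for every page would mean zapping and re-faulting all
of guest memory, which is exactly the cost described above.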
>> Other nice features like page migration are also enabled by this patch.
>>
>>
>
> We already have page migration. Do you mean page-migration-when-using-kvm?
>
Yes, I'm obviously writing from a kvm-centric point of view. This is an
important feature, as the virtualization future seems to be NUMA hosts
(2- or 4-way, 4 cores per socket) running moderately sized guests. The
ability to load-balance guests among the NUMA nodes is important for
performance.
(btw, I'm also looking forward to memory defragmentation. Large pages
are important for virtualization workloads, and mmu notifiers are again
critical to getting that to work while running kvm).
--
Any sufficiently difficult bug is indistinguishable from a feature.