Message-ID: <4A1C45CE.3010807@redhat.com>
Date: Tue, 26 May 2009 22:41:02 +0300
From: Avi Kivity <avi@...hat.com>
To: Dan Magenheimer <dan.magenheimer@...cle.com>
CC: George Dunlap <George.Dunlap@...citrix.com>,
Jeremy Fitzhardinge <jeremy@...p.org>,
Xen-devel <xen-devel@...ts.xensource.com>,
the arch/x86 maintainers <x86@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Keir Fraser <keir.fraser@...citrix.com>,
Ingo Molnar <mingo@...e.hu>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: [Xen-devel] Re: [GIT PULL] Xen APIC hooks (with io_apic_ops)
Dan Magenheimer wrote:
>> It will also be
>> interesting to see how far Xen can get along without real memory
>> management (overcommit).
>>
>
> Several implementations of "classic" memory overcommit have been
> done for Xen, most recently the Difference Engine work at UCSD.
> It is true that none have been merged yet, in part because,
> in many real world environments, "generalized" overcommit
> often leads to hypervisor swapping, and performance becomes
> unacceptable. (In other words, except in certain limited customer
> use models, memory overcommit is a "marketing feature".)
>
Swapping indeed drags performance down horribly. I regard it as a
last-resort solution, used only when everything else (page sharing,
compression, ballooning, live migration) has failed. Having that last
resort is what lets you actually use the other methods without fearing
an eventual out-of-memory condition.
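To make the ballooning part concrete, here is a rough sketch of the
core of an inflate path; hv_return_page() and the rest of the names
are made up for illustration, this is not the virtio or Xen balloon
driver code:

/*
 * Illustrative sketch only.  hv_return_page() is a hypothetical
 * hypercall standing in for whatever mechanism a real balloon driver
 * uses to hand a guest frame back to the host.
 */
#include <linux/errno.h>
#include <linux/gfp.h>
#include <linux/list.h>
#include <linux/mm.h>

static LIST_HEAD(ballooned_pages);

/* hypothetical hypercall: the host may now reuse the frame behind @pfn */
extern int hv_return_page(unsigned long pfn);

static int balloon_inflate(unsigned long nr_pages)
{
	unsigned long i;

	for (i = 0; i < nr_pages; i++) {
		/* allocate a page the guest promises not to touch */
		struct page *page = alloc_page(GFP_HIGHUSER | __GFP_NOWARN);

		if (!page)
			return -ENOMEM;	/* guest is under pressure, stop */

		/* remember it so a later deflate can hand it back */
		list_add(&page->lru, &ballooned_pages);

		/* tell the hypervisor the underlying frame is free to reuse */
		hv_return_page(page_to_pfn(page));
	}

	return 0;
}

The point being that the guest gives memory back voluntarily, which is
why ballooning composes well with the other techniques.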
Note that with SSDs, disks have started to narrow the gap between
memory and secondary storage access times, so swapping will actually
start improving rather than regressing as it has in recent years.
> There's also a novel approach, Transcendent Memory (aka "tmem";
> see http://oss.oracle.com/projects/tmem). Though tmem requires the
> guest to participate in memory management decisions (thus requiring
> a Linux patch), system-wide physical memory efficiency may
> improve vs memory deduplication, and hypervisor-based swapping
> is not necessary.
>
Yes, I've seen that. Another tool in the memory management arsenal.
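For those who haven't looked at it, the basic shape is a put/get
interface where the guest offers page contents to the hypervisor
instead of holding on to them or simply dropping them. The sketch
below is only my rough paraphrase of that idea; the names and
signatures are illustrative, not the actual interface from the
project page:

/*
 * Illustrative only: a put/get style interface in the spirit of tmem.
 * Names and signatures are made up, not the project's.
 */
#include <linux/errno.h>
#include <linux/mm.h>
#include <linux/types.h>

/* offer a copy of a page to hypervisor-managed memory (may be refused) */
extern int tmem_put_page(u32 pool_id, u64 object_id, u32 index,
			 unsigned long pfn);

/* ask for it back later; fails if the hypervisor dropped it meanwhile */
extern int tmem_get_page(u32 pool_id, u64 object_id, u32 index,
			 unsigned long pfn);

/* when evicting a clean page cache page, offer it instead of dropping it */
static void offer_evicted_page(u32 pool, u64 inode_no, u32 page_index,
			       struct page *page)
{
	/* the hypervisor decides whether keeping a copy is worth it */
	tmem_put_page(pool, inode_no, page_index, page_to_pfn(page));
}

/* on a page cache miss, check the hypervisor before doing real I/O */
static int maybe_refill_page(u32 pool, u64 inode_no, u32 page_index,
			     struct page *page)
{
	if (tmem_get_page(pool, inode_no, page_index, page_to_pfn(page)) == 0)
		return 0;	/* filled from hypervisor memory */

	return -ENOENT;		/* caller falls back to the disk read */
}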
>
>> The Linux scheduler already supports multiple scheduling
>> classes. If we
>> find that none of them will fit our needs, we'll propose a new one.
>> When the need can be demonstrated to be real, and the
>> implementation can
>> be clean, Linux can usually be adapted.
>>
>
> But that's exactly George and Jeremy's point. KVM will
> eventually require changes that clutter Linux for purposes
> that are relevant only to a hypervisor.
>
kvm has already made changes to Linux. Preemption notifiers give us a
lightweight exit path, and mmu notifiers let the Linux mmu control the
kvm mmu. In fact, mmu notifiers have proven useful to device drivers
as well.
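For reference, the hooks themselves are small. A minimal sketch of
registering them looks roughly like this (the callback bodies are
placeholders; kvm's real ones live under virt/kvm/):

#include <linux/mmu_notifier.h>
#include <linux/preempt.h>	/* needs CONFIG_PREEMPT_NOTIFIERS */
#include <linux/sched.h>

/* called when our (vcpu) task is scheduled back in on @cpu */
static void my_sched_in(struct preempt_notifier *pn, int cpu)
{
	/* e.g. reload guest state that was dropped on sched_out */
}

/* called when our task is about to be preempted in favour of @next */
static void my_sched_out(struct preempt_notifier *pn,
			 struct task_struct *next)
{
	/* e.g. save guest state here instead of on every exit */
}

static struct preempt_ops my_preempt_ops = {
	.sched_in	= my_sched_in,
	.sched_out	= my_sched_out,
};

static struct preempt_notifier my_preempt_notifier;

/* the Linux mmu telling the secondary (kvm-style) mmu to let go of a page */
static void my_invalidate_page(struct mmu_notifier *mn, struct mm_struct *mm,
			       unsigned long address)
{
	/* drop any shadow/secondary mapping of the page at @address */
}

static const struct mmu_notifier_ops my_mmu_ops = {
	.invalidate_page = my_invalidate_page,
};

static struct mmu_notifier my_mmu_notifier = {
	.ops = &my_mmu_ops,
};

static int hook_current_task(void)
{
	preempt_notifier_init(&my_preempt_notifier, &my_preempt_ops);
	preempt_notifier_register(&my_preempt_notifier);

	/* ties the notifier to this process' address space */
	return mmu_notifier_register(&my_mmu_notifier, current->mm);
}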
It also works the other way around; for example, work on cpu
controllers will benefit kvm, and the real-time scheduler applies to
kvm guests as well. In fact, many scheduler and memory management
features apply to kvm immediately, usually without any integration
work at all.
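As a trivial example: a vcpu is just a host thread, so something as
generic as sched_setscheduler() can move it into the real-time class
without a line of kvm-specific code (vcpu_tid below is assumed to be
the thread id of a vcpu thread, found e.g. under /proc/<pid>/task/):

#include <sched.h>
#include <stdio.h>
#include <sys/types.h>

/* Put a kvm vcpu thread into SCHED_FIFO; needs the usual RT privileges. */
int make_vcpu_realtime(pid_t vcpu_tid)
{
	struct sched_param param = { .sched_priority = 10 };

	if (sched_setscheduler(vcpu_tid, SCHED_FIFO, &param) < 0) {
		perror("sched_setscheduler");
		return -1;
	}

	return 0;
}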
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.