Message-ID: <022609e4-9f30-4e8b-b26b-023cf58adf21@default>
Date: Mon, 21 Dec 2009 15:46:28 -0800 (PST)
From: Dan Magenheimer <dan.magenheimer@...cle.com>
To: ngupta@...are.org
Cc: Nick Piggin <npiggin@...e.de>,
Andrew Morton <akpm@...ux-foundation.org>, jeremy@...p.org,
xen-devel@...ts.xensource.com, tmem-devel@....oracle.com,
Rusty Russell <rusty@...tcorp.com.au>,
Rik van Riel <riel@...hat.com>, dave.mccracken@...cle.com,
Rusty@...inet15.oracle.com, sunil.mushran@...cle.com,
Avi Kivity <avi@...hat.com>,
Schwidefsky <schwidefsky@...ibm.com>,
Balbir Singh <balbir@...ux.vnet.ibm.com>,
Marcelo Tosatti <mtosatti@...hat.com>,
Alan Cox <alan@...rguk.ukuu.org.uk>, chris.mason@...cle.com,
Pavel Machek <pavel@....cz>, linux-mm <linux-mm@...ck.org>,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: RE: Tmem [PATCH 0/5] (Take 3): Transcendent memory
> From: Nitin Gupta [mailto:ngupta@...are.org]
> Hi Dan,
Hi Nitin --
Thanks for your review!
> (I'm not sure if the gmane.org interface sends mail to everyone
> in the CC list, so sending again. Sorry if you are getting
> duplicate mail).
FWIW, I only got this one copy (at least so far)!
> I really like the idea of allocating cache memory from the
> hypervisor directly. This is much more flexible than assigning
> fixed-size memory to guests.
Thanks!
> I think the 'frontswap' part seriously overlaps the
> functionality provided by 'ramzswap'.
Could be, but I suspect there's a subtle difference.
A key part of the tmem frontswap API is that any
"put" at any time can be rejected. There's no way
for the kernel to know a priori whether a put
will be rejected, so the kernel must be able to
react by writing the page to a "true" swap device,
and it must keep track of which pages were put
to tmem frontswap and which were written to disk.
As a result, tmem frontswap cannot be configured or
used as a true swap "device".
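To make the ordering concrete, the swap-out path ends up
looking roughly like this (a sketch only -- all names here
are illustrative, not the actual patch code):

    /* a frontswap put may be rejected at any time */
    int swap_out(unsigned long offset, struct page *page)
    {
        if (frontswap_put_page(offset, page) == 0) {
            /* hypervisor accepted it; remember it lives in tmem */
            set_frontswap_bit(offset);
            return 0;
        }
        /* rejected: fall back to the real swap device */
        clear_frontswap_bit(offset);
        return write_to_swap_device(offset, page);
    }
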
This is critical to achieving the flexibility that
you said you like above. Only the hypervisor
knows if a free page is available "now" because
it is flexibly managing tmem requests from multiple
guest kernels.
If my understanding of ramzswap is incorrect or you
have some clever solution that I misunderstood,
please let me know.
> > Cleancache is "ephemeral" so whether a page is kept in
> > cleancache (between the "put" and the "get") is dependent
> > on a number of factors that are invisible to the kernel.
>
> Just an idea: as an alternate approach, we can create an
> 'in-memory compressed storage' backend for FS-Cache. This way,
> all filesystems modified to use fs-cache can benefit from this
> backend. To make it virtualization friendly like tmem, we can
> again provide a (per-cache?) option to allocate from the
> hypervisor, i.e. tmem_{put,get}_page(), or use [compress]+alloc
> natively.
I looked at FS-Cache and cachefiles, and my understanding is
that FS-Cache is not restricted to clean pages only, so it is
not a good match for tmem cleancache.
Again, if I'm wrong (or if it is easy to tell FS-Cache that
pages may "disappear" underneath it), let me know.
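Put differently, a filesystem using cleancache has to treat
every "get" as a hint that may miss (again just a sketch,
with made-up names):

    /* a cleancache get may miss even after a successful put */
    int read_page(struct inode *inode, unsigned long index,
                  struct page *page)
    {
        if (cleancache_get_page(inode, index, page) == 0)
            return 0;   /* still in tmem; no disk I/O needed */
        /* the page "disappeared" underneath us; go to disk */
        return read_from_filesystem(inode, index, page);
    }
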
BTW, pages put to tmem (both frontswap and cleancache) can
be optionally compressed.
> For the guest<-->hypervisor interface, maybe we can use virtio
> so that all hypervisors can benefit? Not quite sure about this
> one.
I'm not very familiar with virtio, but the existence of "I/O"
in the name concerns me because tmem is entirely synchronous.
Also, tmem is well-layered so very little work needs to be
done on the Linux side for other hypervisors to benefit.
Of course these other hypervisors would need to implement
the hypervisor-side of tmem as well, but there is a well-defined
API to guide other hypervisor-side implementations... and the
opensource tmem code in Xen has a clear split between the
hypervisor-dependent and hypervisor-independent code, which
should simplify implementation for other opensource hypervisors.
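For reference, the hypervisor-side API boils down to a handful
of synchronous operations, roughly the following (my shorthand
here, not the actual Xen code):

    /* every call completes before returning -- no async I/O */
    struct tmem_ops {
        int (*new_pool)(u32 flags);
        int (*destroy_pool)(u32 pool_id);
        int (*put_page)(u32 pool_id, u64 object, u32 index,
                        void *page);
        int (*get_page)(u32 pool_id, u64 object, u32 index,
                        void *page);
        int (*flush_page)(u32 pool_id, u64 object, u32 index);
        int (*flush_object)(u32 pool_id, u64 object);
    };
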
I realize in "Take 3" I didn't provide the URL for more information:
http://oss.oracle.com/projects/tmem