Date:	Mon, 21 Dec 2009 15:46:28 -0800 (PST)
From:	Dan Magenheimer <dan.magenheimer@...cle.com>
To:	ngupta@...are.org
Cc:	Nick Piggin <npiggin@...e.de>,
	Andrew Morton <akpm@...ux-foundation.org>, jeremy@...p.org,
	xen-devel@...ts.xensource.com, tmem-devel@....oracle.com,
	Rusty Russell <rusty@...tcorp.com.au>,
	Rik van Riel <riel@...hat.com>, dave.mccracken@...cle.com,
	Rusty@...inet15.oracle.com, sunil.mushran@...cle.com,
	Avi Kivity <avi@...hat.com>,
	Schwidefsky <schwidefsky@...ibm.com>,
	Balbir Singh <balbir@...ux.vnet.ibm.com>,
	Marcelo Tosatti <mtosatti@...hat.com>,
	Alan Cox <alan@...rguk.ukuu.org.uk>, chris.mason@...cle.com,
	Pavel Machek <pavel@....cz>, linux-mm <linux-mm@...ck.org>,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: RE: Tmem [PATCH 0/5] (Take 3): Transcendent memory

> From: Nitin Gupta [mailto:ngupta@...are.org]

> Hi Dan,

Hi Nitin --

Thanks for your review!

> (I'm not sure if the gmane.org interface sends mail to everyone
> in the CC list, so I'm sending again.  Sorry if you are getting
> duplicate mail.)

FWIW, I only got this one copy (at least so far)!
 
> I really like the idea of allocating cache memory from the
> hypervisor directly.  This is much more flexible than assigning
> fixed-size memory to guests.

Thanks!

> I think the 'frontswap' part seriously overlaps the functionality
> provided by 'ramzswap'.

Could be, but I suspect there's a subtle difference.
A key part of the tmem frontswap API is that any
"put" at any time can be rejected.  The kernel has no
way to know a priori whether a put will be rejected,
so it must be able to react by writing the page to a
"true" swap device, and it must keep track of which
pages were put to tmem frontswap and which were
written to disk.  As a result, tmem frontswap cannot
be configured or used as a true swap "device".

This is critical to achieving the flexibility you
said above that you like.  Only the hypervisor knows
whether a free page is available "now", because it is
flexibly managing tmem requests from multiple guest
kernels.

If my understanding of ramzswap is incorrect or you
have some clever solution that I misunderstood,
please let me know.

> > Cleancache is "ephemeral", so whether a page is kept in
> > cleancache (between the "put" and the "get") depends on a
> > number of factors that are invisible to the kernel.
> 
> Just an idea: as an alternate approach, we can create an
> 'in-memory compressed storage' backend for FS-Cache.  This way,
> all filesystems modified to use FS-Cache can benefit from this
> backend.  To make it virtualization-friendly like tmem, we can
> again provide a (per-cache?) option to allocate from the
> hypervisor, i.e. tmem_{put,get}_page(), or use [compress]+alloc
> natively.

I looked at FS-Cache and cachefiles and thought I understood
that they are not restricted to clean pages only, and thus
are not a good match for tmem cleancache.

Again, if I'm wrong (or if it is easy to tell FS-Cache that
pages may "disappear" underneath it), let me know.
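
In sketch form, the "ephemeral" semantics I mean look roughly
like this (cleancache_get and read_page are illustrative names,
not the real interface): a get can fail at any time, and the
caller must fall back to real filesystem I/O.

/*
 * Hypothetical sketch of ephemeral cleancache semantics: a page
 * put earlier may have vanished, so a get can fail.  Names are
 * illustrative only.
 */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define PAGESIZE 4096

/* The hypervisor may have reclaimed the page any time since the put. */
static bool cleancache_get(unsigned long key, char *dest)
{
	(void)key;
	(void)dest;
	return false;              /* pretend the page disappeared underneath us */
}

static void read_page(unsigned long key, char *dest)
{
	if (cleancache_get(key, dest))
		return;            /* fast path: page was still cached */
	/* slow path: the page is gone; go back to the filesystem */
	memset(dest, 0, PAGESIZE); /* ... issue the real disk I/O here ... */
}

int main(void)
{
	char buf[PAGESIZE];

	read_page(42, buf);
	printf("page read (from cleancache or from disk)\n");
	return 0;
}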

BTW, pages put to tmem (both frontswap and cleancache) can
be optionally compressed.

> For guest<-->hypervisor interface, maybe we can use virtio so that all
> hypervisors can benefit? Not quite sure about this one.

I'm not very familiar with virtio, but the existence of "I/O"
in the name concerns me because tmem is entirely synchronous.

Also, tmem is well-layered, so very little work needs to be
done on the Linux side for other hypervisors to benefit.
Of course, these other hypervisors would need to implement
the hypervisor side of tmem as well, but there is a well-defined
API to guide other hypervisor-side implementations... and the
open-source tmem code in Xen has a clear split between the
hypervisor-dependent and hypervisor-independent code, which
should simplify implementation for other open-source hypervisors.
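
To sketch what I mean by "well-layered" (all names here, such as
tmem_ops and xen_put_page, are illustrative, not the actual Xen
or Linux symbols): the guest side calls through a small
synchronous ops table, and each hypervisor supplies its own
backend behind it.

/*
 * Hypothetical sketch of the guest/hypervisor split.  Names are
 * illustrative only; not taken from the real sources.
 */
#include <stdio.h>

struct tmem_ops {
	/* All calls are synchronous: they return success/failure immediately. */
	int (*put_page)(unsigned pool, unsigned long key, const void *page);
	int (*get_page)(unsigned pool, unsigned long key, void *page);
};

/* A Xen-flavored backend would issue hypercalls here. */
static int xen_put_page(unsigned pool, unsigned long key, const void *page)
{
	(void)pool; (void)key; (void)page;
	return -1;                 /* stub: pretend the hypervisor rejected it */
}

static int xen_get_page(unsigned pool, unsigned long key, void *page)
{
	(void)pool; (void)key; (void)page;
	return -1;                 /* stub: pretend the page is gone */
}

static const struct tmem_ops xen_tmem_ops = {
	.put_page = xen_put_page,
	.get_page = xen_get_page,
};

/* Hypervisor-independent caller: another hypervisor just plugs in its own ops. */
int main(void)
{
	const struct tmem_ops *ops = &xen_tmem_ops;
	char page[4096] = {0};
	int rc = ops->put_page(0, 123, page);

	printf("put %s\n", rc == 0 ? "accepted" : "rejected");
	return 0;
}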

I realize in "Take 3" I didn't provide the URL for more information:
http://oss.oracle.com/projects/tmem