Message-ID: <4BD52C4F.40505@redhat.com>
Date: Mon, 26 Apr 2010 09:01:51 +0300
From: Avi Kivity <avi@...hat.com>
To: Dan Magenheimer <dan.magenheimer@...cle.com>
CC: linux-kernel@...r.kernel.org, linux-mm@...ck.org, jeremy@...p.org,
hugh.dickins@...cali.co.uk, ngupta@...are.org, JBeulich@...ell.com,
chris.mason@...cle.com, kurt.hackel@...cle.com,
dave.mccracken@...cle.com, npiggin@...e.de,
akpm@...ux-foundation.org, riel@...hat.com
Subject: Re: Frontswap [PATCH 0/4] (was Transcendent Memory): overview
On 04/25/2010 06:29 PM, Dan Magenheimer wrote:
>>> While I admit that I started this whole discussion by implying
>>> that frontswap (and cleancache) might be useful for SSDs, I think
>>> we are going far astray here. Frontswap is synchronous for a
>>> reason: It uses real RAM, but RAM that is not directly addressable
>>> by a (guest) kernel. SSD's (at least today) are still I/O devices;
>>> even though they may be very fast, they still live on a PCI (or
>>> slower) bus and use DMA. Frontswap is not intended for use with
>>> I/O devices.
>>>
>>> Today's memory technologies are either RAM that can be addressed
>>> by the kernel, or I/O devices that sit on an I/O bus. The
>>> exotic memories that I am referring to may be a hybrid:
>>> memory that is fast enough to live on a QPI/hypertransport,
>>> but slow enough that you wouldn't want to randomly mix and
>>> hand out to userland apps some pages from "exotic RAM" and some
>>> pages from "normal RAM". Such memory makes no sense today
>>> because OS's wouldn't know what to do with it. But it MAY
>>> make sense with frontswap (and cleancache).
>>>
>>> Nevertheless, frontswap works great today with a bare-metal
>>> hypervisor. I think it stands on its own merits, regardless
>>> of one's vision of future SSD/memory technologies.
>>>
>> Even when frontswapping to RAM on a bare metal hypervisor it makes sense
>> to use an async API, in case you have a DMA engine on board.
>>
> When pages are 2MB, this may be true. When pages are 4KB and
> copied individually, it may take longer to program a DMA engine
> than to just copy 4KB.
>
Of course, you have to use a batching API, like virtio or Xen's rings,
to avoid the overhead.
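
To illustrate the batching point, here is a toy sketch of a descriptor ring
that queues 4KB page copies and notifies the engine once per batch instead of
once per page. It is only a model under stated assumptions: CopyRing, submit,
kick and copy_pages are hypothetical names invented for this example, not the
virtio or Xen ring API, and the "engine" is just a memory copy in Python.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # bytes; the 4KB page size discussed in this thread


@dataclass
class CopyRing:
    """Toy descriptor ring (hypothetical, not a real kernel interface)."""
    capacity: int                      # descriptors per batch, must be >= 1
    pending: list = field(default_factory=list)
    kicks: int = 0                     # times the simulated engine was notified
    pages_copied: int = 0

    def submit(self, src: bytes, dst: bytearray) -> None:
        """Queue one page copy; kick automatically once the ring is full."""
        if len(src) != PAGE_SIZE or len(dst) != PAGE_SIZE:
            raise ValueError("src and dst must each be exactly one page")
        self.pending.append((src, dst))
        if len(self.pending) >= self.capacity:
            self.kick()

    def kick(self) -> None:
        """Notify the engine once and complete every queued descriptor."""
        if not self.pending:
            return
        self.kicks += 1                # one notification covers the whole batch
        for src, dst in self.pending:
            dst[:] = src               # stand-in for the actual DMA transfer
            self.pages_copied += 1
        self.pending.clear()


def copy_pages(num_pages: int, batch: int) -> CopyRing:
    """Copy num_pages pages through a ring that batches `batch` descriptors."""
    ring = CopyRing(capacity=batch)
    for i in range(num_pages):
        ring.submit(bytes([i % 256]) * PAGE_SIZE, bytearray(PAGE_SIZE))
    ring.kick()                        # flush any partial final batch
    return ring
```

With a batch size of 1 every page pays the notification cost; with a larger
batch the same number of pages needs proportionally fewer notifications, which
is the overhead a batching API is meant to amortize. The trade-off is that a
queued page is not copied until the next kick, so the caller has to cope with
asynchronous completion.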
> But in any case, frontswap works fine on all existing machines
> today. If/when most commodity CPUs have an asynchronous RAM DMA
> engine, an asynchronous API may be appropriate. Or the existing
> swap API might be appropriate. Or the synchronous frontswap API
> may work fine too. Speculating further about non-existent
> hardware that might exist in the (possibly far) future is irrelevant
> to the proposed patch, which works today on all existing x86 hardware
> and on shipping software.
>
DMA engines are present on commodity hardware now:
http://en.wikipedia.org/wiki/I/O_Acceleration_Technology
I don't know if consumer machines have them, but servers certainly do.
modprobe ioatdma.
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.