Message-ID: <560BFCF6.9000203@gmail.com>
Date: Wed, 30 Sep 2015 11:17:10 -0400
From: Austin S Hemmelgarn <ahferroin7@...il.com>
To: Petros Koutoupis <petros@...roskoutoupis.com>,
Christoph Hellwig <hch@...radead.org>
Cc: linux-kernel@...r.kernel.org,
"devel@...iddisk.org" <devel@...iddisk.org>
Subject: Re: [PATCH] Patch to integrate RapidDisk and RapidCache RAM Drive /
Caching modules into the kernel
On 2015-09-30 10:29, Petros Koutoupis wrote:
> Christoph and Austin,
>
> You both have provided me with some valuable feedback. I will do what I
> can to clean this patch up and in turn apply the same dynamic
> functionality to the already in-kernel module. Also please see my
> replies below.
>
> On 9/29/15 9:32 AM, Austin S Hemmelgarn wrote:
>> On 2015-09-28 12:45, Petros Koutoupis wrote:
>>> Christoph,
>>>
>>> See my replies below....
>>>
>>> On 9/28/15 11:29 AM, Christoph Hellwig wrote:
>>>> Hi Petros,
>>>>
>>>> On Mon, Sep 28, 2015 at 09:12:13AM -0500, Petros Koutoupis wrote:
>>>>> 1. Unlike the already mainline ramdisk driver, RapidDisk is designed
>>>>> to be
>>>>> managed dynamically. That is, instead of configuring a fixed number of
>>>>> volumes and volume sizes as compile/boot time variables, RapidDisk
>>>>> will
>>>>> allow you to add, remove, and resize your RAM drive(s) at runtime.
>>>>> Besides,
>>>>> the built in module is designed to work with smaller sizes in mind
>>>>> while
>>>>> RapidDisk focuses on larger sizes that can reach to the multiple
>>>>> Gigabytes
>>>>> or even Terabytes. Much like the built in module, it will allocate
>>>>> pages as
>>>>> they are needed which allows for over provisioning (not that it is
>>>>> advised)
>>>>> of volume sizes.
>>>> The ramdisk driver allows selecting sizes and count at module load
>>>> time. I agree that having runtime control would be even better, but
>>>> that's best done by adding a runtime interface to the existing driver
>>>> instead of duplicating it.
>>> I understand the concern and I will definitely scope out this approach,
>>> although at the moment, I am not sure how both approaches will play nice
>>> together. As mentioned above, the current implementation requires the
>>> predefined number of ram drives with the specified size to be configured
>>> at boot time (or compiled into the kernel). The only wiggle room I see
>>> for runtime control is resizing individual volumes.
>> Just because there is no code to do dynamic allocation/freeing of
>> ramdisks in the current driver doesn't mean that it isn't possible;
>> it just means that nobody has written code to do it
>> yet. This functionality would be extremely useful (I often use
>> ramdisks on a VM host as a small amount of very fast swap space for
>> the virtual machines). On top of that, the deduplication would be a
>> wonderful feature, although it may already be indirectly implemented
>> through KSM (that is, when KSM is on and configured to scan
>> everything, I'm not sure if it scans memory used by the ramdisks or not).
>>
> To my understanding KSM is only applied to KVM deployments. One way I
> have seen my caching module work is users/vendors have a block device,
> map it to a RapidDisk RAM drive as a RAM based Write-Through caching
> node and in turn export it via a traditional SAN. The idea behind adding
> deduplication to this module is to minimize the RAM drive footprint when
> used as a block level cache.
KSM is usually used in KVM or other userspace VM deployments, but that
is by no means the only use case. I actually use it regularly on most
of my systems, and it does help in some cases. For example, I run a lot
of distributed computing apps, often as multiple instances of the same
app, and those don't always share memory to the degree they should; KSM
helps with this.
The write-through caching may be worth looking into, although I think
(not certain about this) that you can force the page cache to do
write-through caching only, but only globally.
It would probably be better to improve upon the existing page cache
implementation anyway. Ideally, I would love to see:
1. The ability to tell the page cache to claim some minimum amount of
memory that only it can use.
2. The ability to easily tune cache parameters on a per-device (or even
better, per-filesystem) basis.
3. Conversion to a framework that would allow for easy development and
testing of different caching algorithms (although this is probably never
going to happen).
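As a rough illustration of the write-through behaviour discussed above,
here is a minimal userspace sketch. It is not RapidCache or page cache
code: the class and method names are hypothetical, and plain dicts stand
in for the backing block device and the RAM drive.

```python
class WriteThroughCache:
    """Toy write-through cache (hypothetical names, illustration only)."""

    def __init__(self, backing):
        self.backing = backing  # dict standing in for the slow block device
        self.cache = {}         # dict standing in for the RAM drive

    def write(self, block, data):
        # Write-through: the backing store is updated before write() returns,
        # so losing the (volatile) cache never loses data.
        self.backing[block] = data
        self.cache[block] = data

    def read(self, block):
        if block in self.cache:
            return self.cache[block]   # hit: served from RAM
        data = self.backing[block]     # miss: fetch from the backing store
        self.cache[block] = data       # populate the cache for later reads
        return data
```

Because every write reaches the backing store, the cache can be dropped
at any time without losing data, which is what makes volatile memory
acceptable for this role.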
>>>>> 2. The majority of RapidDisk code focuses on the use of Volatile
>>>>> memory.
>>>>> The support for Non-Volatile memory is a bit newer and there may be
>>>>> some
>>>>> overlap here with the recently integrated pmem code. The only
>>>>> advantage to
>>>>> having this code within RapidDisk is to provide the user with the
>>>>> ability
>>>>> to manage both technologies simultaneously, through a single
>>>>> interface.
>>>> Which really doesn't sound like a good enough reason to duplicate it.
>>> I do not disagree with your comment here. This component does not have
>>> to be patched into the mainline.
>>>
>>>>> 3. The RapidCache component is designed around the Non-Volatile
>>>>> functionality of RapidDisk (hence the block-level Write-Through
>>>>> caching).
>>>>> It is also coded and optimized around the RapidDisk sizes/variables,
>>>>> out-of-box. It is worth noting that I am in the process of expanding
>>>>> this
>>>>> module to add deduplication support. This will leverage RapidDisk's
>>>>> ability
>>>>> to allocate pages only when needed and reduce the cache's memory
>>>>> footprint;
>>>>> making more out of less.
>>>> Still needs some code comparison to our existing two caching solutions.
>>>>
>>>> I'd love to see you go ahead with the dynamic ramdisk configuration as
>>>> this is clearly a very useful feature. A caching solution that is
>>>> optimized for non-volatile memory does sound useful, but we'll still
>>>> need a patch better explaining how it actually is as useful as it might
>>>> sound.
>>> CORRECTION: I meant to say Volatile and NOT Non-Volatile. RapidCache is
>>> designed around Volatile memory. I guess I was a little too excited in my
>>> response and I do apologize for that. I will provide a code comparison
>>> in my next e-mail, after I go through the existing RAM drive code.
>> To a certain extent, I see that as potentially less useful than a
>> cache optimized for non-volatile memory. While the current incarnation
>> of the pagecache in Linux could stand to have some serious performance
>> improvements (just think how fast things would be if we used ARC
>> instead of plain LRU), it does still do its job well for most
>> workloads (although being able to tell the kernel to reserve some
>> portion of memory _just_ for the pagecache would be an interesting and
>> probably very useful feature).
>>
> My only concern with an ARC is CPU utilization. A lot more is required
> to manage two lists.
Actually, most of the CPU time spent in an ARC cache goes to the
auto-tuning (the 'adaptive' bit). I've done testing in userspace only,
and SLRU (ARC without the adaptive sizing of the lists) uses only a
little more CPU time than traditional LRU, somewhat less than ARC, and
does a much better job of handling COW-based workloads. COW is a tough
workload for LRU caching (which is why ZFS uses ARC and not traditional
LRU), as a read-modify-write cycle ends up with the read data never
being needed again, which in turn means that MRU caching can be better
in many cases for heavy read-write COW workloads.
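
To make the two-list idea concrete, here is a minimal userspace sketch
of a segmented LRU with fixed segment sizes. It is an illustration only,
not the code used in the testing mentioned above; the class name,
method names, and segment sizes are hypothetical.

```python
from collections import OrderedDict


class SLRUCache:
    """Segmented LRU: a probationary and a protected list, both fixed-size."""

    def __init__(self, probation_size, protected_size):
        if probation_size < 1 or protected_size < 1:
            raise ValueError("segment sizes must be positive")
        self.probation_size = probation_size
        self.protected_size = protected_size
        self.probation = OrderedDict()  # referenced once; oldest entry first
        self.protected = OrderedDict()  # referenced again; oldest entry first

    def _insert_probation(self, key, value):
        self.probation[key] = value
        self.probation.move_to_end(key)
        if len(self.probation) > self.probation_size:
            self.probation.popitem(last=False)  # evicted from the cache

    def get(self, key):
        if key in self.protected:
            self.protected.move_to_end(key)     # refresh recency
            return self.protected[key]
        if key in self.probation:
            # Second reference: promote to the protected segment.
            value = self.probation.pop(key)
            self.protected[key] = value
            if len(self.protected) > self.protected_size:
                # Demote the oldest protected entry instead of evicting it.
                old_key, old_value = self.protected.popitem(last=False)
                self._insert_probation(old_key, old_value)
            return value
        return None                             # miss

    def put(self, key, value):
        if key in self.protected:
            self.protected[key] = value
            self.protected.move_to_end(key)
        else:
            self._insert_probation(key, value)  # new data starts on probation

    def __contains__(self, key):
        return key in self.protected or key in self.probation
```

Data touched only once never leaves the probationary list, so a burst of
single-use reads cannot push out entries that have been referenced
repeatedly. The only extra work compared to plain LRU is the occasional
move between the two lists.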