linux-kernel - Re: [PATCH] Patch to integrate RapidDisk and RapidCache RAM Drive / Caching modules into the kernel

Message-ID: <560BFCF6.9000203@gmail.com>
Date:	Wed, 30 Sep 2015 11:17:10 -0400
From:	Austin S Hemmelgarn <ahferroin7@...il.com>
To:	Petros Koutoupis <petros@...roskoutoupis.com>,
	Christoph Hellwig <hch@...radead.org>
Cc:	linux-kernel@...r.kernel.org,
	"devel@...iddisk.org" <devel@...iddisk.org>
Subject: Re: [PATCH] Patch to integrate RapidDisk and RapidCache RAM Drive /
 Caching modules into the kernel

On 2015-09-30 10:29, Petros Koutoupis wrote:
> Christoph and Austin,
>
> You both have provided me with some valuable feedback. I will do what I
> can to clean this patch up and in turn apply the same dynamic
> functionality to the already in-kernel module. Also please see my
> replies below.
>
> On 9/29/15 9:32 AM, Austin S Hemmelgarn wrote:
>> On 2015-09-28 12:45, Petros Koutoupis wrote:
>>> Christoph,
>>>
>>> See my replies below....
>>>
>>> On 9/28/15 11:29 AM, Christoph Hellwig wrote:
>>>> Hi Petros,
>>>>
>>>> On Mon, Sep 28, 2015 at 09:12:13AM -0500, Petros Koutoupis wrote:
>>>>> 1.  Unlike the already mainline ramdisk driver, RapidDisk is designed
>>>>> to be managed dynamically. That is, instead of configuring a fixed
>>>>> number of volumes and volume sizes as compile/boot time variables,
>>>>> RapidDisk will allow you to add, remove, and resize your RAM drive(s)
>>>>> at runtime. Besides, the built in module is designed to work with
>>>>> smaller sizes in mind while RapidDisk focuses on larger sizes that can
>>>>> reach to the multiple Gigabytes or even Terabytes. Much like the built
>>>>> in module, it will allocate pages as they are needed which allows for
>>>>> over provisioning (not that it is advised) of volume sizes.
>>>> The ramdisk driver allows selecting sizes and count at module load
>>>> time.  I agree that having runtime control would be even better, but
>>>> that's best done by adding a runtime interface to the existing driver
>>>> instead of duplicating it.
>>> I understand the concern and I will definitely scope out this approach,
>>> although at the moment, I am not sure how both approaches will play nice
>>> together. As mentioned above, the current implementation requires the
>>> predefined number of ram drives with the specified size to be configured
>>> at boot time (or compiled into the kernel). The only wiggle room I see
>>> for runtime control is resizing individual volumes.
>> Just because there is no code currently to do dynamic
>> allocation/freeing of ramdisks in the current driver doesn't mean that
>> it isn't possible, it just means that nobody has written code to do it
>> yet.  This functionality would be extremely useful (I often use
>> ramdisks on a VM host as a small amount of very fast swap space for
>> the virtual machines).  On top of that, the deduplication would be a
>> wonderful feature, although it may already be indirectly implemented
>> through KSM (that is, when KSM is on and configured to scan
>> everything, I'm not sure if it scans memory used by the ramdisks or not).
>>
> To my understanding KSM is only applied to KVM deployments. One way I
> have seen my caching module work is users/vendors have a block device,
> map it to a RapidDisk RAM drive as a RAM based Write-Through caching
> node and in turn export it via a traditional SAN. The idea behind adding
> deduplication to this module is to minimize the RAM drive footprint when
> used as a block level cache.
KSM is usually used in KVM or other userspace VM deployments, but that
is by no means the only use-case.  I actually use it regularly on most
of my systems, and it does help in some cases.  For example, I run a lot
of distributed computing apps, often using multiple instances of the
same app, and those don't always share memory to the degree they should;
KSM helps with this.
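
For reference, KSM is opt-in per memory region rather than tied to KVM: any
process can register private anonymous memory with madvise(MADV_MERGEABLE).
A minimal sketch, assuming Linux and Python 3.8 or later; mark_mergeable is
a hypothetical helper name, and pages are only merged if ksmd is actually
running (/sys/kernel/mm/ksm/run set to 1):

```python
import mmap


def mark_mergeable(num_pages=16):
    """Map anonymous private memory and advise the kernel that KSM may merge it.

    Returns (mapping, advised). advised is False when the platform has no
    MADV_MERGEABLE or the kernel rejects the request (for example, a kernel
    built without KSM support).
    """
    if not hasattr(mmap, "MADV_MERGEABLE"):
        return None, False  # not Linux, or a Python build without the constant

    size = num_pages * mmap.PAGESIZE
    # KSM only considers private anonymous pages, so ask for MAP_PRIVATE
    # explicitly (Python's default on Unix is MAP_SHARED).
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    buf.write(b"\xAA" * size)  # every page has identical content

    try:
        buf.madvise(mmap.MADV_MERGEABLE)
    except OSError:
        return buf, False
    return buf, True


if __name__ == "__main__":
    mapping, advised = mark_mergeable()
    print("region registered with KSM:", advised)
```

Whether any pages were actually shared can then be checked through the
counters under /sys/kernel/mm/ksm/.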

The write-through caching may be worth looking into, although I think 
(not certain about this) that you can force the page cache to do 
write-through caching only, except that can only be done globally.

It would probably be better to improve upon the existing pagecache
implementation anyway.  Ideally, I would love to see:
1. The ability to tell the page cache to claim some minimum amount of 
memory that only it can use.
2. The ability to easily tune cache parameters on a per-device (or even 
better, per-filesystem) basis.
3. Conversion to a framework that would allow for easy development and 
testing of different caching algorithms (although this is probably never 
going to happen).
>>>>> 2. The majority of RapidDisk code focuses on the use of Volatile
>>>>> memory. The support for Non-Volatile memory is a bit newer and there
>>>>> may be some overlap here with the recently integrated pmem code. The
>>>>> only advantage to having this code within RapidDisk is to provide the
>>>>> user with the ability to manage both technologies simultaneously,
>>>>> through a single interface.
>>>> Which really doesn't sound like a good enough reason to duplicate it.
>>> I do not disagree with your comment here. This component does not have
>>> to be patched into the mainline.
>>>
>>>>> 3. The RapidCache component is designed around the Non-Volatile
>>>>> functionality of RapidDisk (hence the block-level Write-Through
>>>>> caching). It is also coded and optimized around the RapidDisk
>>>>> sizes/variables, out-of-box. It is worth noting that I am in the
>>>>> process of expanding this module to add deduplication support. This
>>>>> will leverage RapidDisk's ability to allocate pages only when needed
>>>>> and reduce the cache's memory footprint; making more out of less.
>>>> Still needs some code comparison to our existing two caching solutions.
>>>>
>>>> I'd love to see you go ahead with the dynamic ramdisk configuration as
>>>> this is clearly a very useful feature.  A caching solution that is
>>>> optimized for non-volatile memory does sound useful, but we'll still
>>>> need a patch better explaining how it actually is as useful as it might
>>>> sound.
>>> CORRECTION: I meant to say Volatile and NOT Non-Volatile. RapidCache is
>>> designed around Volatile memory. I guess I was a little too excited in my
>>> response and I do apologize for that. I will provide a code comparison
>>> in my next e-mail, after I go through the existing RAM drive code.
>> To a certain extent, I see that as potentially less useful than one
>> optimized for non-volatile memory.  While the current incarnation of
>> the pagecache in Linux could stand to have some serious performance
>> improvements (just think how fast things would be if we used ARC
>> instead of plain LRU), it does still do its job well for most
>> workloads (although being able to tell the kernel to reserve some
>> portion of memory _just_ for the pagecache would be an interesting and
>> probably very useful feature).
>>
> My only concern with an ARC is CPU utilization. A lot more is required
> to manage two lists.
Actually, most of the CPU time spent in an ARC cache goes to the
auto-tuning (the 'adaptive' bit).  I've done testing just in userspace,
and SLRU (ARC without the adaptive sizing of the lists) uses only a
little more CPU time than traditional LRU, somewhat less than ARC, and
does a much better job of handling COW based workloads.  COW is a tough
workload for LRU caching (which is why ZFS uses ARC and not traditional
LRU), as a read-modify-write cycle ends up with the read data not being
needed ever again, which in turn means that MRU caching can be better in
many cases for heavy read-write COW workloads.
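
To illustrate the two-list handling described above, here is a minimal
userspace sketch of plain LRU next to a segmented LRU with fixed segment
sizes. It is not the test code referred to above and says nothing about
CPU cost; the class names, segment sizes, and synthetic trace are all
assumptions chosen for illustration. The trace reuses a small hot set and
then streams through blocks that are touched once and never again:

```python
from collections import OrderedDict


class LRUCache:
    """Plain LRU: a single recency-ordered list."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()
        self.hits = 0

    def access(self, key):
        if key in self.items:
            self.items.move_to_end(key)      # refresh recency
            self.hits += 1
            return
        self.items[key] = None
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict least recently used


class SLRUCache:
    """Segmented LRU: new keys enter a probationary list and are promoted
    to a protected list only when they are referenced again."""

    def __init__(self, probation_size, protected_size):
        self.probation_size = probation_size
        self.protected_size = protected_size
        self.probation = OrderedDict()
        self.protected = OrderedDict()
        self.hits = 0

    def _insert_probation(self, key):
        self.probation[key] = None
        if len(self.probation) > self.probation_size:
            self.probation.popitem(last=False)

    def access(self, key):
        if key in self.protected:
            self.protected.move_to_end(key)
            self.hits += 1
        elif key in self.probation:
            del self.probation[key]          # second reference: promote
            self.protected[key] = None
            if len(self.protected) > self.protected_size:
                demoted, _ = self.protected.popitem(last=False)
                self._insert_probation(demoted)
            self.hits += 1
        else:
            self._insert_probation(key)      # first reference: probation only


def make_trace(rounds=10, hot=4, scan=8):
    """Each round touches the hot set twice, then streams one-shot blocks."""
    trace, cold = [], 0
    for _ in range(rounds):
        trace += [("hot", h) for h in range(hot)] * 2
        for _ in range(scan):
            trace.append(("cold", cold))
            cold += 1
    return trace


def count_hits(cache, trace):
    for key in trace:
        cache.access(key)
    return cache.hits


trace = make_trace()
lru_hits = count_hits(LRUCache(8), trace)        # 8 slots in one list
slru_hits = count_hits(SLRUCache(4, 4), trace)   # same 8 slots, split 4 + 4

if __name__ == "__main__":
    print("LRU hits:", lru_hits, "SLRU hits:", slru_hits, "of", len(trace))
```

With the same total capacity, the one-shot blocks push the hot set out of
the single LRU list on every round, while in the segmented cache they only
cycle through the probationary list.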

