Date:	Fri, 12 Dec 2014 13:51:46 +0100
From:	Marian Csontos <mcsontos@...hat.com>
To:	device-mapper development <dm-devel@...hat.com>,
	gregkh@...uxfoundation.org, snitzer@...hat.com, agk@...hat.com,
	linux-kernel@...r.kernel.org
Subject: Re: [dm-devel] [PATCH] staging: writeboost: Add dm-writeboost

On 12/10/2014 11:00 AM, Joe Thornber wrote:
> On Tue, Dec 09, 2014 at 03:12:53PM +0000, Joe Thornber wrote:
>> Writeboost is significantly slower than the spindle alone for this
>> very simple test.  I do not understand what is causing the issue.
>
> I started doing the code review and now understand what's going on,
> sadly.
>
> You are splitting all bios up into 4k blocks to simplify the metadata
> layout and mapping logic.  This murders performance.  File systems
> and the block layer try really hard to submit the largest bio possible
> for a reason.
>
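For illustration, the splitting pattern described above looks roughly like
this in a dm target's map function; a minimal sketch with illustrative
names and current kernel helpers, not the actual writeboost code:

/*
 * Accept at most 8 sectors (4 KiB) of every incoming bio and let the
 * dm core resubmit the remainder, so one large bio arrives at the
 * target as many small ones.
 */
#include <linux/device-mapper.h>
#include <linux/bio.h>

static int split_4k_map(struct dm_target *ti, struct bio *bio)
{
        struct dm_dev *dev = ti->private;       /* backing device, set up in .ctr */

        if (bio_sectors(bio) > 8)
                dm_accept_partial_bio(bio, 8);  /* take 4 KiB, defer the rest */

        bio_set_dev(bio, dev->bdev);
        return DM_MAPIO_REMAPPED;               /* dm core resubmits the deferred part */
}

With this pattern a single 1 MiB bio turns into 256 separate 4 KiB
mappings, which is roughly where the slowdown in the dd numbers below
comes from.
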
> A simple dd in large chunks across your cache reveals this:
>
> raw spindle:        8.9s
> writeboost type 0:  32.2s
> writeboost type 1:  71.1s
>
> dm-cache and dm-thin do also split io into blocks, but much larger,
> user configurable blocks.  It's still a performance issue for us,
> which is why I'm using range locking to move away from this bio
> splitting (eg, recent cache discard patches).
>
> One of the main advantages of a log based metadata layout is you can
> cope nicely with arbitrarily sized bios.  Unlike dm-cache for
> instance, which has to do a read from the origin if it wants to cache
> a write that partially covers a block (or maintain a 'valid' bit for
> each sector of every cached block).
>
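For comparison, the bookkeeping a fixed-block cache needs for partial
writes looks roughly like this; a minimal sketch with made-up names, not
dm-cache's actual structures:

#include <linux/bitmap.h>
#include <linux/types.h>

#define CACHE_BLOCK_SECTORS 128         /* e.g. a 64 KiB cache block */

struct cache_block {
        DECLARE_BITMAP(valid, CACHE_BLOCK_SECTORS);     /* sectors holding real data */
};

/*
 * Record the sectors a partial write covers.  If any sector of the
 * block is still invalid, the missing data would have to be read from
 * the origin before the block can be treated as fully populated.  A
 * log-structured layout avoids this by appending the write at whatever
 * size it arrived.
 */
static bool needs_origin_read(struct cache_block *cb,
                              unsigned offset, unsigned nr_sectors)
{
        bitmap_set(cb->valid, offset, nr_sectors);
        return !bitmap_full(cb->valid, CACHE_BLOCK_SECTORS);
}
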
> The writeboost target as it stands will only benefit v. small, random
> io.  It will seriously degrade performance of any other IO profile.
> I'm NACKing this for upstream, and will not be spending any more time
> on it at this point.

Isn't that what some databases are doing?

>
> You've put a lot of effort into this so far, so I suggest you redesign
> the log metadata, and drop the io splitting; you'll end up with
> something far better.

Perhaps large writes[1] could be passed directly to the HDD - IIUC, the 
sequential write speeds of consumer SSDs and HDDs are almost identical.

[1]: What counts as a large write? In my mental model this fits a "tunable".
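
Something along these lines is what I mean; a rough sketch with
illustrative names, assuming the current dm API, and an arbitrary default
for the threshold:

#include <linux/device-mapper.h>
#include <linux/bio.h>
#include <linux/module.h>

static unsigned large_write_sectors = 256;      /* 128 KiB; the "tunable" */
module_param(large_write_sectors, uint, 0644);

struct wb_sketch {
        struct dm_dev *ssd;     /* log / caching device */
        struct dm_dev *hdd;     /* backing spindle */
};

static int wb_map_sketch(struct dm_target *ti, struct bio *bio)
{
        struct wb_sketch *wb = ti->private;

        /*
         * Large writes go straight to the HDD, which handles sequential
         * IO about as well as a consumer SSD; only small/random writes
         * are staged on the SSD log.
         */
        if (op_is_write(bio_op(bio)) &&
            bio_sectors(bio) >= large_write_sectors) {
                bio_set_dev(bio, wb->hdd->bdev);
                return DM_MAPIO_REMAPPED;
        }

        bio_set_dev(bio, wb->ssd->bdev);
        return DM_MAPIO_REMAPPED;
}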

>
> Sorry,
>
> - Joe

