linux-kernel - Re: O_DIRECT to md raid 6 is slow


Message-ID: <CALCETrViO5umuDQX1a6v=eKOMK8aAr4o7QMiWybv2in=XwO8CQ@mail.gmail.com>
Date:	Wed, 15 Aug 2012 18:08:43 -0700
From:	Andy Lutomirski <luto@...capital.net>
To:	stan@...dwarefreak.com
Cc:	John Robinson <john.robinson@...nymous.org.uk>,
	linux-kernel@...r.kernel.org, linux-raid@...r.kernel.org
Subject: Re: O_DIRECT to md raid 6 is slow

On Wed, Aug 15, 2012 at 4:50 PM, Stan Hoeppner <stan@...dwarefreak.com> wrote:
> On 8/15/2012 5:10 PM, Andy Lutomirski wrote:
>> On Wed, Aug 15, 2012 at 3:00 PM, Stan Hoeppner <stan@...dwarefreak.com> wrote:
>>> On 8/15/2012 12:57 PM, Andy Lutomirski wrote:
>>>> On Wed, Aug 15, 2012 at 4:50 AM, John Robinson
>>>> <john.robinson@...nymous.org.uk> wrote:
>>>>> On 15/08/2012 01:49, Andy Lutomirski wrote:
>>>>>>
>>>>>> If I do:
>>>>>> # dd if=/dev/zero of=/dev/md0p1 bs=8M
>>>>>
>>>>> [...]
>>
>> Grr.  I thought the bad old days of filesystem and related defaults
>> sucking were over.
>
> The previous md chunk default of 64KB wasn't horribly bad, though still
> maybe a bit high for a lot of common workloads.  I didn't have eyes/ears
> on the discussion and/or testing process that led to the 'new' 512KB
> default.  Obviously something went horribly wrong here.  512KB isn't a
> show stopper as a default for 0/1/10, but is 8-16 times too large for
> parity RAID.
>
>> cryptsetup aligns sanely these days, xfs is
>> sensible, etc.
>
> XFS won't align with the 512KB chunk default of metadata 1.2.  The
> largest XFS journal stripe unit (su--chunk) is 256KB, and even that
> isn't recommended.  Thus mkfs.xfs throws an error due to the 512KB
> stripe.  See the md and xfs archives for more details, specifically Dave
> Chinner's colorful comments on the md 512KB default.

Heh -- that's why the math didn't make any sense :)
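
A minimal sketch of that math, assuming the usual RAID 6 layout in which
each stripe holds one chunk per data disk and two parity chunks.  The
6-disk array is a hypothetical example (the array size is not given in
this message), the function names are made up for illustration, and the
256KB limit is simply the figure Stan quotes above for the largest XFS
journal stripe unit.

```python
# Sketch only: stripe arithmetic for a parity RAID array.
# Assumption: RAID 6 stores (n_disks - 2) data chunks per stripe.

XFS_MAX_LOG_SU_KIB = 256  # figure quoted in the thread, not checked against mkfs.xfs


def full_stripe_kib(chunk_kib, n_disks, parity_disks=2):
    """Return the amount of data (KiB) in one full stripe."""
    data_disks = n_disks - parity_disks
    if data_disks < 1:
        raise ValueError("need at least one data disk")
    return chunk_kib * data_disks


def is_full_stripe_multiple(io_kib, chunk_kib, n_disks, parity_disks=2):
    """True if a write of io_kib covers a whole number of full stripes."""
    return io_kib % full_stripe_kib(chunk_kib, n_disks, parity_disks) == 0


def fits_xfs_log_su(chunk_kib):
    """True if the chunk size does not exceed the quoted XFS journal su limit."""
    return chunk_kib <= XFS_MAX_LOG_SU_KIB


if __name__ == "__main__":
    n_disks = 6  # hypothetical array size
    for chunk in (64, 512):
        print(chunk, full_stripe_kib(chunk, n_disks),
              is_full_stripe_multiple(8 * 1024, chunk, n_disks),
              fits_xfs_log_su(chunk))
```

Under these assumptions a 512KB chunk gives a full stripe eight times
larger than the old 64KB default on the same disks, and it exceeds the
256KB journal stripe unit limit on its own.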

>
>> wtf?  <rant>Why is there no sensible filesystem for
>> huge disks?  zfs can't cp --reflink and has all kinds of source
>> availability and licensing issues, xfs can't dedupe at all, and btrfs
>> isn't nearly stable enough.</rant>
>
> Deduplication isn't a responsibility of a filesystem.  TTBOMK there are
> two, and only two, COW filesystems in existence:  ZFS and BTRFS.  And
> these are the only two to offer a native dedupe capability.  They did it
> because they could, with COW, not necessarily because they *should*.
> There are dozens of other single node, cluster, and distributed
> filesystems in use today and none of them support COW, and thus none
> support dedup.  So to *expect* a 'sensible' filesystem to include dedupe
> is wishful thinking at best.

I should clarify my rant for the record.  I don't care about in-fs
dedupe.  I want COW so userspace can dedupe and generally replace
hardlinks with sensible cowlinks.  I'm also working on some fun tools
that *require* reflinks for anything resembling decent performance.
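
A rough sketch of what a userspace cowlink could look like, assuming a
Linux kernel that exposes the FICLONE ioctl and a filesystem that
implements it.  The helper name cowlink is hypothetical and is not one
of the tools mentioned above; when the clone is refused, the sketch
falls back to an ordinary copy.

```python
import fcntl
import shutil

# Assumption: Linux FICLONE ioctl number, _IOW(0x94, 9, int).
FICLONE = 0x40049409


def cowlink(src, dst):
    """Make dst share src's data blocks copy-on-write if the filesystem allows it.

    Returns True if the blocks were cloned, False if a plain copy was made.
    """
    with open(src, "rb") as s, open(dst, "wb") as d:
        try:
            # Ask the filesystem to clone src's extents into dst.
            fcntl.ioctl(d.fileno(), FICLONE, s.fileno())
            return True
        except OSError:
            # Filesystem (or kernel) does not support cloning here.
            pass
    # Fallback: a normal byte-for-byte copy, no block sharing.
    shutil.copyfile(src, dst)
    return False
```

Unlike a hardlink, the result is a separate inode, so a later write to
either file leaves the other one unchanged.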

--Andy