Message-Id: <38626BFC-A2BB-468D-8297-51F7A887859F@whamcloud.com>
Date:	Thu, 19 Apr 2012 16:55:11 -0600
From:	Andreas Dilger <adilger@...mcloud.com>
To:	Ted Ts'o <tytso@....edu>
Cc:	Eric Sandeen <sandeen@...hat.com>, linux-fsdevel@...r.kernel.org,
	Ext4 Developers List <linux-ext4@...r.kernel.org>
Subject: Re: [PATCH, RFC 3/3] ext4: use the O_HOT and O_COLD open flags to influence inode allocation

On 2012-04-19, at 1:59 PM, Ted Ts'o wrote:
> On Thu, Apr 19, 2012 at 02:45:28PM -0500, Eric Sandeen wrote:
>> 
>> I'm curious to know how this will work, for example, on a linear
>> device made up of rotational devices (possibly a concat of RAIDs, etc.).
>> 
>> At least for dm, it will still be marked as rotational, but the
>> relative speed of a region of the linear device can't be inferred
>> from its offset within the device.
> 
> Hmm, good point.  We need a way to determine whether this is some kind
> of glued-together dm thing versus a plain-old HDD.

I would posit that in the majority of cases, low-address blocks are
much more likely to be "fast" than high-address blocks.  This is
true for RAID-0/1/5/6 and for most LVs built atop those devices
(since LVs are allocated in low-to-high offset order).

It is true that some less common configurations (such as the dm-concat
above) may not follow this rule, but in that case the filesystem is no
worse off than if it had no such information at all.
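
As a rough sanity check of that assumption (a standalone userspace
sketch, nothing to do with the patch itself; the device path and
transfer sizes below are made up), comparing sequential read throughput
at the low and high ends of a raw device shows the outer-vs-inner-track
difference on a plain HDD:

/* hotcheck.c - compare read throughput near the start and end of a disk.
 * Sketch only: run as root against an idle device, e.g. ./hotcheck /dev/sdb
 */
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/time.h>
#include <linux/fs.h>           /* BLKGETSIZE64 */

#define CHUNK (1 << 20)         /* 1 MiB per read */
#define TOTAL (256ULL << 20)    /* 256 MiB per region */

static double mib_per_sec(int fd, uint64_t off)
{
	struct timeval t0, t1;
	uint64_t done = 0;
	void *buf;

	if (posix_memalign(&buf, 4096, CHUNK))
		return 0;
	gettimeofday(&t0, NULL);
	while (done < TOTAL && pread(fd, buf, CHUNK, off + done) == CHUNK)
		done += CHUNK;
	gettimeofday(&t1, NULL);
	free(buf);
	return (done >> 20) / ((t1.tv_sec - t0.tv_sec) +
			       (t1.tv_usec - t0.tv_usec) / 1e6);
}

int main(int argc, char **argv)
{
	uint64_t size;
	int fd = open(argc > 1 ? argv[1] : "/dev/sda", O_RDONLY | O_DIRECT);

	if (fd < 0 || ioctl(fd, BLKGETSIZE64, &size) < 0)
		return 1;
	printf("low  LBAs: %6.1f MiB/s\n", mib_per_sec(fd, 0));
	printf("high LBAs: %6.1f MiB/s\n", mib_per_sec(fd, size - TOTAL));
	close(fd);
	return 0;
}

On most single spindles the low end reads considerably faster than the
high end, which is exactly the property the low-address-is-fast
heuristic relies on.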

>> Do we really have enough information about the storage under us to
>> know what parts are "fast" and what parts are "slow?"
> 
> Well, plain and simple HDDs are still quite common; not everyone
> drops in an intermediate dm layer.  I view dm as being similar to
> enterprise storage arrays, where we will need to pass an explicit
> hint with block ranges down to the storage device.  However, it's
> going to be a long time before we get that part of the interface
> plumbed in.
> 
> In the meantime, it would be nice if we had something that worked in
> the common case of plain old stupid HDDs --- we just need a way of
> determining that that's what we are dealing with.
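
For what it's worth, the distinction is at least visible from userspace
through sysfs.  A rough sketch of the kind of check in question
(queue/rotational and the slaves/ directory are standard sysfs entries;
the device name is illustrative): a device that claims to be rotational
and has no component slaves is most likely a plain HDD.

/* plainhdd.c - guess whether a block device is a plain rotating disk.
 * Sketch only: checks /sys/block/<dev>/queue/rotational and whether the
 * device has component devices listed under /sys/block/<dev>/slaves/.
 */
#include <dirent.h>
#include <stdio.h>

static int is_rotational(const char *dev)
{
	char path[128];
	FILE *f;
	int c;

	snprintf(path, sizeof(path), "/sys/block/%s/queue/rotational", dev);
	f = fopen(path, "r");
	if (!f)
		return -1;
	c = fgetc(f);
	fclose(f);
	return c == '1';
}

static int slave_count(const char *dev)
{
	char path[128];
	struct dirent *de;
	DIR *d;
	int n = 0;

	snprintf(path, sizeof(path), "/sys/block/%s/slaves", dev);
	d = opendir(path);
	if (!d)
		return 0;
	while ((de = readdir(d)))
		if (de->d_name[0] != '.')
			n++;
	closedir(d);
	return n;
}

int main(void)
{
	const char *dev = "sda";	/* illustrative */

	if (is_rotational(dev) == 1 && slave_count(dev) == 0)
		printf("%s looks like a plain HDD\n", dev);
	else
		printf("%s is non-rotational or a composite device\n", dev);
	return 0;
}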

Also, if the admin knows (or can control) what these hints mean, then
they can configure the storage explicitly to match the usage.  I've
long been a proponent of configuring LVs with hybrid SSD+HDD storage,
so that ext4 can allocate inodes + directories on the SSD part of each
flex_bg, and files on the RAID-6 part of the flex_bg.  This kind of
API would allow files to be hinted similarly.
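
To make the application side concrete, usage would presumably look
something like the sketch below.  O_HOT/O_COLD only exist in this RFC,
so the flag value here is just a placeholder, not the one from the
patch:

/* Sketch of application-side usage of the proposed hint. */
#include <fcntl.h>
#include <unistd.h>

#ifndef O_HOT
#define O_HOT	040000000	/* placeholder; the real value would come
				 * from the patched kernel headers */
#endif

int main(void)
{
	/* Hint that this file is latency-sensitive and should be placed in
	 * the "fast" region (or on the SSD part of a hybrid LV, as above). */
	int fd = open("hot-index.db", O_CREAT | O_RDWR | O_HOT, 0644);

	if (fd < 0)
		return 1;
	/* ... normal I/O ... */
	close(fd);
	return 0;
}

Since open() has historically ignored flag bits it doesn't recognize, a
binary built this way would still run on an unpatched kernel, just
without the hint taking effect.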

While flexible kernel APIs that allow the upper layers to understand
the underlying layout would be great, I don't imagine they will arrive
any time soon.  It will also take userspace and application support to
leverage them, and we have to start somewhere.

Cheers, Andreas
--
Andreas Dilger                       Whamcloud, Inc.
Principal Lustre Engineer            http://www.whamcloud.com/



