Date:	Wed, 21 Apr 2010 14:59:21 -0400
From:	Greg Freemyer <greg.freemyer@...il.com>
To:	Eric Sandeen <sandeen@...hat.com>
Cc:	Mark Lord <kernel@...savvy.com>,
	Lukas Czerner <lczerner@...hat.com>,
	linux-ext4@...r.kernel.org, Jeff Moyer <jmoyer@...hat.com>,
	Edward Shishkin <eshishki@...hat.com>,
	Eric Sandeen <esandeen@...hat.com>,
	Ric Wheeler <rwheeler@...hat.com>
Subject: Re: [PATCH 2/2] Add batched discard support for ext4.

On Tue, Apr 20, 2010 at 10:45 PM, Eric Sandeen <sandeen@...hat.com> wrote:
> Mark Lord wrote:
>> On 20/04/10 05:21 PM, Greg Freemyer wrote:
>>> Mark,
>>>
>>> This is the patch implementing the new discard logic.
>> ..
>>> Signed-off-by: Lukas Czerner <lczerner@...hat.com>
>> ..
>>>> +void ext4_trim_extent(struct super_block *sb, int start, int count,
>>>> +               ext4_group_t group, struct ext4_buddy *e4b)
>>>> +{
>>>> +       ext4_fsblk_t discard_block;
>>>> +       struct ext4_super_block *es = EXT4_SB(sb)->s_es;
>>>> +       struct ext4_free_extent ex;
>>>> +
>>>> +       assert_spin_locked(ext4_group_lock_ptr(sb, group));
>>>> +
>>>> +       ex.fe_start = start;
>>>> +       ex.fe_group = group;
>>>> +       ex.fe_len = count;
>>>> +
>>>> +       mb_mark_used(e4b,&ex);
>>>> +       ext4_unlock_group(sb, group);
>>>> +
>>>> +       discard_block = (ext4_fsblk_t)group *
>>>> +                       EXT4_BLOCKS_PER_GROUP(sb)
>>>> +                       + start
>>>> +                       + le32_to_cpu(es->s_first_data_block);
>>>> +       trace_ext4_discard_blocks(sb,
>>>> +                       (unsigned long long)discard_block,
>>>> +                       count);
>>>> +       sb_issue_discard(sb, discard_block, count);
>>>> +
>>>> +       ext4_lock_group(sb, group);
>>>> +       mb_free_blocks(NULL, e4b, start, ex.fe_len);
>>>> +}
>>>
>>> Mark, unless I'm missing something, sb_issue_discard() above is going
>>> to trigger a trim command for just the one range.  I thought the
>>> benchmarks you did showed that a collection of ranges needed to be
>>> built, then a single trim command invoked that trimmed that group of
>>> ranges.
>> ..
>>
>> Mmm.. If that's what it is doing, then this patch set would be a
>> complete disaster.
>> It would take *hours* to do the initial TRIM.
>>
>> Lukas ?
>
> I'm confused; do we have an interface to send a trim command for multiple ranges?
>
> I didn't think so ...  Lukas' patch is finding free ranges (above a size threshold)
> to discard; it's not doing it a block at a time, if that's the concern.
>
> -Eric

Eric,

I don't know what kernel APIs have been created to support discard,
but the ATA8 draft spec. allows for specifying multiple ranges in one
trim command.

See section 7.10.3.1 and .2 of the latest draft spec.

Both talk about multiple trim ranges per trim command (think thousands
of ranges per command).
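
As a rough illustration of what "multiple ranges per command" means at
the payload level, here is a minimal user-space sketch.  It assumes the
layout I understand the ATA DATA SET MANAGEMENT (trim) payload to use:
8-byte little-endian entries with the LBA in bits 47:0 and the sector
count in bits 63:48.  struct trim_range, trim_pack_entry() and
trim_fill_payload() are names made up for this example; they are not
kernel or hdparm APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TRIM_ENTRY_BYTES 8        /* assumed size of one LBA range entry */
#define TRIM_MAX_COUNT   0xFFFFu  /* largest sector count one entry can hold */

/* Hypothetical in-memory form of one range to discard. */
struct trim_range {
	uint64_t lba;    /* first sector of the range */
	uint64_t count;  /* number of sectors in the range */
};

/* Store one entry little-endian: bits 47:0 = LBA, bits 63:48 = count. */
void trim_pack_entry(uint8_t out[TRIM_ENTRY_BYTES], uint64_t lba,
		     uint16_t count)
{
	uint64_t v = (lba & 0xFFFFFFFFFFFFULL) | ((uint64_t)count << 48);

	for (int i = 0; i < TRIM_ENTRY_BYTES; i++)
		out[i] = (uint8_t)(v >> (8 * i));
}

/*
 * Pack as many ranges as fit into buf and return the number of entries
 * written.  Ranges longer than TRIM_MAX_COUNT sectors are split across
 * several entries.  Packing stops early when buf is full; this sketch
 * does not report where it stopped.
 */
size_t trim_fill_payload(uint8_t *buf, size_t buf_len,
			 const struct trim_range *ranges, size_t nranges)
{
	size_t max_entries = buf_len / TRIM_ENTRY_BYTES;
	size_t used = 0;

	assert(buf != NULL && ranges != NULL);
	memset(buf, 0, buf_len);  /* unused entries stay zero (padding) */

	for (size_t i = 0; i < nranges; i++) {
		uint64_t lba = ranges[i].lba;
		uint64_t left = ranges[i].count;

		while (left > 0) {
			uint16_t chunk = (uint16_t)(left > TRIM_MAX_COUNT ?
						    TRIM_MAX_COUNT : left);

			if (used == max_entries)
				return used;  /* full: rest needs another command */
			trim_pack_entry(buf + TRIM_ENTRY_BYTES * used, lba, chunk);
			used++;
			lba += chunk;
			left -= chunk;
		}
	}
	return used;
}
```

With a 512-byte buffer this holds 64 entries, so a caller would pass a
buffer sized to whatever payload the device accepts and issue one
command per filled buffer.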

Recent hdparm versions accept a trim command argument that causes
multiple ranges to be trimmed per command.

 --trim-sector-ranges        Tell SSD firmware to discard unneeded data sectors: lba:count ..
 --trim-sector-ranges-stdin  Same as above, but reads lba:count pairs from stdin

As I understand it, this is critical from a performance perspective
for the SSDs Mark tested with, i.e., he found that a single trim
command with 1000 ranges takes much less time than 1000 discrete trim
commands.

Per Mark's comments in wiper.sh, a trim command can have a minimum of
128KB of associated range information, so it is thousands of ranges
that can be discarded in a single command.

In other words, hdparm can accept extremely large lists of ranges on
stdin, but it splits the list into discrete trim commands with
thousands of ranges per command.
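
The splitting described above can be sketched roughly as follows.  This
is not hdparm's actual code, just a minimal illustration of chunking a
long range list into per-command batches.  The 8-byte entry size is my
assumption about the payload format, and ranges_per_command(),
issue_in_batches() and the dry-run callback are hypothetical names.
Under that assumption a 128KB payload would carry 16384 ranges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RANGE_ENTRY_BYTES 8  /* assumed size of one lba:count entry */

struct lba_range {
	uint64_t lba;
	uint64_t count;
};

/* Callback that sends one command covering batch[0..n-1]; 0 on success. */
typedef int (*issue_fn)(const struct lba_range *batch, size_t n, void *ctx);

/* How many range entries fit in a payload of the given size. */
size_t ranges_per_command(size_t payload_bytes)
{
	return payload_bytes / RANGE_ENTRY_BYTES;
}

/*
 * Walk the list in chunks of at most per_cmd ranges and hand each chunk
 * to issue().  Returns the number of commands issued, or -1 as soon as
 * one of them fails.
 */
long issue_in_batches(const struct lba_range *ranges, size_t nranges,
		      size_t per_cmd, issue_fn issue, void *ctx)
{
	long cmds = 0;

	assert(per_cmd > 0 && issue != NULL);
	for (size_t off = 0; off < nranges; off += per_cmd) {
		size_t n = nranges - off;

		if (n > per_cmd)
			n = per_cmd;
		if (issue(ranges + off, n, ctx) != 0)
			return -1;
		cmds++;
	}
	return cmds;
}

/* Dry-run issuer: only counts what would have been sent. */
struct dry_run_stats {
	size_t commands;
	size_t ranges;
	size_t largest;  /* largest batch seen */
};

int dry_run_issue(const struct lba_range *batch, size_t n, void *ctx)
{
	struct dry_run_stats *s = ctx;

	(void)batch;  /* contents are not inspected in a dry run */
	s->commands++;
	s->ranges += n;
	if (n > s->largest)
		s->largest = n;
	return 0;
}
```

Feeding 2500 ranges through issue_in_batches() with per_cmd set to 1000
and the dry-run issuer yields three commands (1000, 1000 and 500
ranges) instead of 2500 single-range ones.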

A kernel implementation that does after-the-fact discards, as this
patch does, also needs to somehow craft trim commands with a large
payload of ranges if it is going to be efficient.
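
To make the idea concrete, here is a minimal user-space sketch of the
accumulate-then-flush pattern I have in mind: free extents are queued
as they are found and one multi-range command goes out only when the
queue is full.  This is not what the patch does and not an existing
block layer interface.  struct discard_batch, batch_add() and
batch_flush() are hypothetical, the capacity is tiny on purpose, and
the locking the real patch has to deal with is ignored entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH_CAP 4  /* tiny for illustration; a real payload holds far more */

struct extent {
	uint64_t start;
	uint64_t len;
};

struct discard_batch {
	struct extent ext[BATCH_CAP];
	size_t n;               /* extents currently queued */
	unsigned long flushes;  /* multi-range commands that would have been sent */
	uint64_t blocks;        /* total blocks queued so far */
};

/* Send everything queued as one command, then empty the queue. */
void batch_flush(struct discard_batch *b)
{
	assert(b != NULL);
	if (b->n == 0)
		return;
	/* Placeholder: a real implementation would submit ext[0..n-1] here. */
	b->flushes++;
	b->n = 0;
}

/* Queue one free extent, flushing first if the queue is already full. */
void batch_add(struct discard_batch *b, uint64_t start, uint64_t len)
{
	assert(b != NULL);
	if (len == 0)
		return;
	if (b->n == BATCH_CAP)
		batch_flush(b);
	b->ext[b->n].start = start;
	b->ext[b->n].len = len;
	b->n++;
	b->blocks += len;
}
```

The caller would run batch_add() for each free extent found during the
scan and finish with one batch_flush() to send whatever is left over.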

If the block layer cannot do this yet, then in my opinion this type of
batched discarding needs to stay in user space as done with Mark's
wiper.sh script and enhanced hdparm until the block layer grows that
ability.

Greg
