Date:	Mon, 26 Apr 2010 13:52:42 -0400
From:	Ric Wheeler <rwheeler@...hat.com>
To:	Lukas Czerner <lczerner@...hat.com>
CC:	Jan Kara <jack@...e.cz>, Greg Freemyer <greg.freemyer@...il.com>,
	Jeff Moyer <jmoyer@...hat.com>,
	Eric Sandeen <sandeen@...hat.com>,
	Mark Lord <kernel@...savvy.com>, linux-ext4@...r.kernel.org,
	Edward Shishkin <eshishki@...hat.com>,
	Eric Sandeen <esandeen@...hat.com>,
	Christoph Hellwig <hch@...radead.org>
Subject: Re: [PATCH 2/2] Add batched discard support for ext4.

On 04/26/2010 01:46 PM, Lukas Czerner wrote:
> On Mon, 26 Apr 2010, Jan Kara wrote:
>
>>> On Wed, 21 Apr 2010, Greg Freemyer wrote:
>>> And also, currently I am rewriting the patch to use rbtree instead of the
>>> bitmap, because there were some concerns about memory consumption. It is a
>>> question whether or not the rbtree will be more memory friendly.
>>> Generally I think that in most "normal" cases it will, but there are some
>>> extreme scenarios, where the rbtree will be much worse. Any comments on
>>> this?
>>    I see two possible improvements here:
>> a) At a cost of some code complexity, you can bound the worst case by combining
>> RB-trees with bitmaps. The basic idea is that when space to TRIM gets too
>> fragmented (memory to keep to-TRIM blocks in RB-tree for a given group exceeds
>> the memory needed to keep it in a bitmap), you convert RB-tree for a
>> problematic group to a bitmap and attach it to an appropriate RB-node. If you
>> also track the number of to-TRIM extents in the bitmap, you can decide
>> whether it's beneficial to switch back to an RB-tree.
>
> This sounds like a good idea, but I wonder if it is worth it:
>   1. The tree will have a very short life, because with the next ioctl all
>   stored deleted extents will be trimmed and removed from the tree.
>   2. Also note that the longer it lives, the less fragmented it is likely
>   to become.
>   3. I do not expect that deleted ranges can be too fragmented, and
>   even if they are, they will probably be merged into one big extent very
>   soon.
>
>>
>> b) Another idea might be: When to-TRIM space is fragmented (again, let's say
>> in some block group), there's not much point in sending tiny trim commands
>> anyway (at least that's what I've understood from this discussion). So you
>> might as well stop maintaining information about which blocks we need to trim
>> for that group. When the situation gets better, you can always walk the block
>> bitmap and issue trim commands. You might even trigger this rescan from the
>> kernel - if you maintained the number of free block extents for each block group
>> (which is rather easy), you could trigger the bitmap rescan and trim as soon
>> as the ratio (number of free blocks / number of extents) gets above a reasonable
>> threshold.
>>
>> 								Honza
>>
>
> In what I am preparing now, I simply ignore small extents, which would
> be created when chunks of used blocks split the deleted extent into
> smaller pieces. This, in my opinion, will prevent the fragmentation
> which otherwise may occur in the longer term (between ioctl calls).
>
> Thanks for suggestions.
> -Lukas

I am not convinced that ignoring small extents is a good idea. Remember that
SSDs specifically remap *everything* internally, so our "fragmented" set of
small free ranges could still be useful to them.

That does not mean that we should not try to send larger requests down to the
target device, which I think is always a good idea :-)
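
As a rough sketch of what "larger requests without dropping the small ones"
could look like: merge free extents that touch or overlap before issuing
discards, and keep every extent that has no neighbour. The struct and function
names below are made up for illustration; they are not taken from the patch or
from ext4, and the input is assumed to be sorted by start block.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extent type for illustration; not an ext4 structure. */
struct free_extent {
    uint64_t start;  /* first block of the free range */
    uint64_t len;    /* number of blocks in the range */
};

/*
 * Merge touching or overlapping extents in place.
 * The array must already be sorted by start block.
 * Returns the number of extents left after merging.
 * Small extents are never dropped; they are only folded into a
 * neighbour when one is adjacent.
 */
size_t coalesce_extents(struct free_extent *ext, size_t n)
{
    if (n == 0)
        return 0;

    size_t out = 0;  /* index of the extent currently being grown */
    for (size_t i = 1; i < n; i++) {
        uint64_t cur_end = ext[out].start + ext[out].len;

        if (ext[i].start <= cur_end) {
            /* Touches or overlaps: extend the current extent if needed. */
            uint64_t end = ext[i].start + ext[i].len;
            if (end > cur_end)
                ext[out].len = end - ext[out].start;
        } else {
            /* Gap of used blocks: start a new output extent. */
            ext[++out] = ext[i];
        }
    }
    return out + 1;
}
```

With the extents (0,4), (4,4), (10,1), (11,2), (20,8) this leaves three ranges:
(0,8), (10,3) and (20,8). Each would then go down as one discard request.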

ric
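
For reference, the two per-group checks Jan describes in (a) and (b) above
could be sketched roughly like this. All names, the node size, and the
thresholds are assumptions chosen for illustration, not values from the patch:
(a) compares the memory held by RB-tree nodes against a one-bit-per-block
bitmap, and (b) treats "ratio above a threshold" as the average free extent
length reaching a minimum.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-group counters; names are illustrative only. */
struct group_trim_stats {
    uint64_t free_blocks;   /* free blocks in the block group */
    uint64_t free_extents;  /* number of contiguous free ranges */
};

/*
 * (a) True when the RB-tree nodes for a group use more memory than a
 * bitmap with one bit per block would, i.e. when converting the group
 * to a bitmap bounds the worst case.
 */
bool bitmap_is_smaller(uint64_t tree_nodes, size_t node_bytes,
                       uint64_t blocks_per_group)
{
    uint64_t bitmap_bytes = (blocks_per_group + 7) / 8;
    return tree_nodes * node_bytes > bitmap_bytes;
}

/*
 * (b) True when free_blocks / free_extents >= min_avg_len, i.e. the
 * free space in the group is contiguous enough to be worth a bitmap
 * rescan and trim. Written as a multiplication to avoid dividing.
 */
bool should_rescan_and_trim(const struct group_trim_stats *s,
                            uint64_t min_avg_len)
{
    if (s->free_extents == 0)
        return false;  /* nothing free, nothing to trim */
    return s->free_blocks >= min_avg_len * s->free_extents;
}
```

For example, assuming 32768 blocks per group (a 4096-byte bitmap) and 32-byte
tree nodes, the bitmap becomes the smaller representation once a group holds
more than 128 extents in the tree.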
