Message-Id: <EAD1A1FC-DCF5-4B2B-BDE9-38E593691E23@dilger.ca>
Date: Thu, 26 Mar 2020 15:07:23 -0600
From: Andreas Dilger <adilger@...ger.ca>
To: harshad shirwadkar <harshadshirwadkar@...il.com>
Cc: linux-ext4 <linux-ext4@...r.kernel.org>
Subject: Re: [PATCH 2/2] ext4: shrink directories on dentry delete
On Mar 26, 2020, at 1:49 PM, harshad shirwadkar <harshadshirwadkar@...il.com> wrote:
>
> On Wed, Mar 25, 2020 at 3:06 AM Andreas Dilger <adilger@...ger.ca> wrote:
>>
>> On Mar 25, 2020, at 3:37 AM, Harshad Shirwadkar <harshadshirwadkar@...il.com> wrote:
>>> But note that most of the shrinking happens during the last 1-2% of
>>> deletions in the average case. Therefore, the next step here is to merge
>>> dx nodes when possible. That can be achieved by storing the fullness
>>> index in htree nodes. But that's an on-disk format change. We can instead
>>> build on tooling added by this patch to perform a reverse lookup on a dx
>>> node and then read adjacent nodes to check their fullness.
>>
>> Thank you for updating these patches again. I haven't had a chance to look
>> at them yet, but I hope to review the patches in the near future.
>>
>> As for storing the fullness on disk changing the on-disk format... That is
>> true, but the original htree implementation anticipated this and reserved
>> space in the htree index to store the fullness, so it would not break the
>> ability of older kernels to access directories with the fullness information.
>>
> Yeah, you are right. Good to know that we already have bits reserved
> and that using them in the future wouldn't break older kernels.
>
>> I think if you used just a few bits (maybe just 2) to store:
>> 0 = unset (every directory today)
>> 1 = under 20% full
>> 2 = under 40% full
>> 3 = under 60% full
>>
>> or similar. It doesn't matter if they are more full, since they won't be
>> candidates for merging. If the htree index fullness is then lazily updated
>> as entries are removed, this will simplify the shrinking process and will
>> avoid the need to repeatedly scan the leaf blocks to see if they are empty
>> enough for merging. It wouldn't be any worse *not* to store these values
>> on disk after the first time a "0 = unset" entry was found and not merged,
>> or setting the fullness on the merged block if it is merged, and running
>> "e2fsck -D" can easily update the fullness values.
>>
>> The benefit of using 20%, 40%, and 60% as the fullness markers is that it
>> is possible to either merge adjacent 60% and 40% blocks or alternately a
>> 60% and two adjacent 20% blocks. Also, since these values are very coarse
>> they would not need to be updated frequently. If the values are slightly
>> outdated, then it is again no worse than the "always scan" model (one scan
>> and the fullness would be updated), but more efficient than repeated scanning.
>>
>> Using only two bits for fullness also leaves two bits free for future use.
>
> Thanks Andreas, that makes sense. This kind of merging will require
> a lot of the tooling provided in this patch - for example, swapping out
> a freed block with the last block so as not to leave any holes. So, my
> hope is that we get this patch in first and thereby get a step closer to
> a coalescing solution.

I definitely *do not* want to block the landing of these initial patches
until "full featured" directory shrinking is complete. These patches
provide some basic functionality, and will at least shrink a large
directory if it becomes totally empty, so I'm in favour of that.
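
To make the 2-bit fullness idea quoted above concrete, here is a rough
sketch of how the encoding and the merge check might look. It is only an
illustration under stated assumptions: the names (dx_fullness,
dx_fullness_encode, dx_can_merge) are hypothetical and do not exist in
ext4, blocks that are 60% full or more are assumed to be stored as
"0 = unset" since they are not merge candidates, and where the bits
would live in the htree index is not covered here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical 2-bit fullness hint; not part of the current ext4 code. */
enum dx_fullness {
	DX_FULL_UNSET = 0,	/* unknown, or (assumed) 60% full or more */
	DX_FULL_LT20  = 1,	/* under 20% full */
	DX_FULL_LT40  = 2,	/* under 40% full */
	DX_FULL_LT60  = 3,	/* under 60% full */
};

/* Map the bytes in use within a leaf block to the coarse 2-bit hint. */
static enum dx_fullness dx_fullness_encode(unsigned int used,
					   unsigned int blocksize)
{
	if (used * 100 < 20 * blocksize)
		return DX_FULL_LT20;
	if (used * 100 < 40 * blocksize)
		return DX_FULL_LT40;
	if (used * 100 < 60 * blocksize)
		return DX_FULL_LT60;
	return DX_FULL_UNSET;	/* too full to be a merge candidate */
}

/* Upper bound (in percent) implied by a hint; 0 means "no bound known". */
static unsigned int dx_fullness_upper_pct(enum dx_fullness f)
{
	switch (f) {
	case DX_FULL_LT20: return 20;
	case DX_FULL_LT40: return 40;
	case DX_FULL_LT60: return 60;
	default:           return 0;
	}
}

/*
 * Decide from the hints alone whether n adjacent leaf blocks could be
 * merged into one. Each hint is a strict upper bound, so if the bounds
 * sum to at most 100% the contents are under one block's worth of bytes.
 * Any unset hint means the caller has to scan the leaf block instead.
 */
static bool dx_can_merge(const enum dx_fullness *hints, size_t n)
{
	unsigned int total = 0;
	size_t i;

	if (n < 2)
		return false;
	for (i = 0; i < n; i++) {
		unsigned int pct = dx_fullness_upper_pct(hints[i]);

		if (pct == 0)
			return false;
		total += pct;
	}
	return total <= 100;
}
```

With these thresholds, a 60% block plus a 40% block passes the check, as
does a 60% block plus two 20% blocks, while two 60% blocks do not. The
check only looks at byte counts, so a real implementation would still
need to account for directory entry alignment and per-block overhead.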

Cheers, Andreas