Message-Id: <9500D51F-7E89-41F2-9A77-0E1A79136240@dilger.ca>
Date: Fri, 10 Jun 2011 14:37:11 -0600
From: Andreas Dilger <adilger@...ger.ca>
To: Phillip Susi <psusi@....rr.com>
Cc: linux-ext4@...r.kernel.org
Subject: Re: 64bit filesystem questions
On 2011-06-10, at 11:45 AM, Phillip Susi wrote:
> On 6/10/2011 1:29 PM, Andreas Dilger wrote:
>> On 2011-06-10, at 11:14 AM, Phillip Susi wrote:
>>> On 6/10/2011 12:19 PM, Andreas Dilger wrote:
>>>> I think in the presence of flex_bg this issue is moot.
>>>
>>> What is the issue without flex_bg?
>>
>> No "issue" really, just that the block/inode bitmaps are spread all over
>> the filesystem. The original discussion was about whether there could be
>> "larger bitmaps that addressed more than 32768 blocks", which is essentially
>> what the flex_bg feature provides. With flex_bg the bitmaps for different
>> groups will be allocated adjacent to each other on disk, and allow addressing
>> more than 32768 blocks without any seeking.
>>
>> On large filesystems without flex_bg, the distribution of the bitmaps without
>> flex_bg means that a seek is needed to read each one, and given that spinning
>> disks have stayed at about 100 seeks/sec for decades it means 10+ minutes just
>> to read all of the bitmaps.
>>
>> On my 2TB 5400 RPM SATA drive, e2fsck time went from ~20 minutes to ~3 minutes
>> by copying the data to a new ext4 filesystem with flex_bg + extents. For a
>> fair comparison, I then reformatted the original (identical) disk without
>> flex_bg or extents and copied the data back, so that there wasn't any unfair
>> comparison between the newly-formatted filesystem and the old fragmented one.
>
> I know what flex_bg is; what I don't understand is what it has to do with the limit on the size of a block group. Whether the block bitmaps are stored in their native block group, or clustered up with flex_bg does not seem to have anything to do with whether or not the size of the bitmap can exceed 32k blocks.
I hope it is obvious that a single bitmap block can only address the number
of bits (==blocks) that fit within that block. To address more blocks the
block bitmap needs to be larger than a single block in size. One possible
way to do this (discussed early on for ext4) would be to have N block
bitmap blocks per group. That raises issues of how to address those blocks
for each "block group", and what the meaning of a "block group" really is.

The other (very similar, but not identical) approach is to essentially merge
N adjacent "block groups" into a single "large block group" that has N block
bitmaps, and addresses N * blocksize * 8 blocks per "large block group".
In this case "N" is the flex_bg factor (constrained to 2^n), and the "large
block group" is called a "flex group". It achieves exactly the same thing
as having N block bitmaps per group, with the only difference that there are
N group descriptors that point to the bitmaps, and they no longer have to be
located within the groups themselves.
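
The arithmetic above can be sketched in a few lines of Python. This is only
an illustration: the 4096-byte block size is an assumption (it is the size
that yields the 32768 blocks per group mentioned in this thread), and the
function names are hypothetical, not taken from the ext4 or e2fsprogs code.

```python
# Illustrative sketch only; names are hypothetical, not ext4 identifiers.
BLOCK_SIZE = 4096  # bytes (assumed); 4096 * 8 = 32768 blocks per group


def blocks_per_bitmap_block(block_size=BLOCK_SIZE):
    """One bitmap block holds block_size * 8 bits, one bit per block."""
    return block_size * 8


def blocks_per_flex_group(n, block_size=BLOCK_SIZE):
    """A flex group of N block groups addresses N * blocksize * 8 blocks."""
    # N is constrained to a power of two (2^n).
    if n < 1 or n & (n - 1):
        raise ValueError("flex_bg factor must be a power of two")
    return n * blocks_per_bitmap_block(block_size)


print(blocks_per_bitmap_block())  # 32768
print(blocks_per_flex_group(4))   # 131072
print(blocks_per_flex_group(16))  # 524288
```

With a factor of 4 the flex group covers 131072 blocks, which is the same
range a hypothetical four-block bitmap in a single group would cover.
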

There is virtually no difference between "larger bitmap" and "flex_bg":

"b"=block bitmap, "i"=inode bitmap, "."=data block

Non-flex_bg configuration for 4 groups * 32768 blocks:

bi...{32760}...bi...{32760}...bi...{32760}...bi...{32760}...

Each block bitmap addresses 32768 blocks in total (including itself).

flex_bg configuration for the same 4 groups * 32768 blocks:

bbbbiiii.....................{131020}.......................

If you treat the four "bbbb" blocks as a single block bitmap, and "iiii"
as a single inode bitmap, and the contiguous range of free blocks as a
single group, it is exactly what you are asking for - a larger bitmap.

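
The two layouts in the diagrams can be modelled roughly as follows. This is
a simplified sketch under the same assumptions as the diagrams: only the
block and inode bitmaps are placed, and superblocks, group descriptors and
inode tables are ignored. The helper names are hypothetical. Counting runs
of adjacent bitmap blocks gives a rough idea of how many seeks are needed to
read all of them.

```python
# Simplified model of the diagrams; not the real ext4 on-disk layout.
GROUP_BLOCKS = 32768  # blocks per group, as in the diagrams


def bitmap_blocks(num_groups, flex_factor=1):
    """Return the block numbers holding block and inode bitmaps."""
    positions = []
    for first_group in range(0, num_groups, flex_factor):
        members = min(flex_factor, num_groups - first_group)
        start = first_group * GROUP_BLOCKS
        # "b" * members followed by "i" * members, packed at the start
        # of the (flex) group.
        positions.extend(range(start, start + 2 * members))
    return positions


def contiguous_runs(positions):
    """Count runs of adjacent blocks; each run costs roughly one seek."""
    runs = 0
    prev = None
    for p in positions:
        if prev is None or p != prev + 1:
            runs += 1
        prev = p
    return runs


plain = bitmap_blocks(4)                 # bi... bi... bi... bi...
flex = bitmap_blocks(4, flex_factor=4)   # bbbbiiii........
print(plain)  # [0, 1, 32768, 32769, 65536, 65537, 98304, 98305]
print(flex)   # [0, 1, 2, 3, 4, 5, 6, 7]
print(contiguous_runs(plain), contiguous_runs(flex))  # 4 1
```

In both cases 8 of the 131072 blocks hold bitmaps, leaving 131064 data
blocks; only their placement differs.
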
Cheers, Andreas