Message-Id: <1359527713.648.140661184334613.06CF38D4@webmail.messagingengine.com>
Date: Wed, 30 Jan 2013 17:35:13 +1100
From: Bron Gondwana <brong@...tmail.fm>
To: Eric Sandeen <sandeen@...hat.com>
Cc: linux-ext4@...r.kernel.org, Rob Mueller <robm@...tmail.fm>
Subject: Re: fallocate creating fragmented files

On Wed, Jan 30, 2013, at 05:05 PM, Eric Sandeen wrote:
> On 1/29/13 11:46 PM, Bron Gondwana wrote:
> > Hi All,
> >
> > I'm trying to understand why my ext4 filesystem is creating highly fragmented files even though it's only just over 50% full.
>
> It's at least possible that freespace is very fragmented; you could try the "e2freefrag" command to see.
[brong@...p14 ~]$ e2freefrag /dev/md0
Device: /dev/md0
Blocksize: 1024 bytes
Total blocks: 62522624
Free blocks: 26483551 (42.4%)
Min. free extent: 1 KB
Max. free extent: 757 KB
Avg. free extent: 14 KB
Num. free extent: 1940838
HISTOGRAM OF FREE EXTENT SIZES:
Extent Size Range :  Free extents   Free Blocks  Percent
    1K...    2K-  :        538480        538480    2.03%
    2K...    4K-  :        362189        870860    3.29%
    4K...    8K-  :        321158       1681591    6.35%
    8K...   16K-  :        268848       2934959   11.08%
   16K...   32K-  :        210746       4697440   17.74%
   32K...   64K-  :        151755       6738418   25.44%
   64K...  128K-  :         63761       5512870   20.82%
  128K...  256K-  :         20563       3552580   13.41%
  256K...  512K-  :          3308       1047995    3.96%
  512K... 1024K-  :            30         17615    0.07%
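
As a rough cross-check of the /dev/md0 numbers above, the sketch below recomputes the average free extent and the share of histogram free space in extents of 64 KB or more. The figures are copied from the e2freefrag output; the names (HISTOGRAM, average_extent_kb, share_at_or_above) are made up for illustration and are not part of e2fsprogs.

```python
# Figures copied from the e2freefrag output for /dev/md0.
# Block size is 1 KB, so a block count is also a size in KB.
FREE_BLOCKS = 26483551
NUM_FREE_EXTENTS = 1940838

# (lower bound of the size range in KB, free extents, free blocks)
HISTOGRAM = [
    (1, 538480, 538480),
    (2, 362189, 870860),
    (4, 321158, 1681591),
    (8, 268848, 2934959),
    (16, 210746, 4697440),
    (32, 151755, 6738418),
    (64, 63761, 5512870),
    (128, 20563, 3552580),
    (256, 3308, 1047995),
    (512, 30, 17615),
]


def average_extent_kb(free_blocks, num_extents):
    """Mean free extent size in KB (valid here because blocks are 1 KB)."""
    return free_blocks / num_extents


def share_at_or_above(histogram, min_kb):
    """Fraction of the histogram's free blocks in ranges starting at min_kb or above."""
    total = sum(blocks for _, _, blocks in histogram)
    large = sum(blocks for low, _, blocks in histogram if low >= min_kb)
    return large / total


if __name__ == "__main__":
    print(f"average free extent: {average_extent_kb(FREE_BLOCKS, NUM_FREE_EXTENTS):.1f} KB")
    print(f"free space in extents >= 64 KB: {share_at_or_above(HISTOGRAM, 64):.1%}")
```

The average agrees with the "Avg. free extent: 14 KB" line once rounded, and the per-range extent counts add up to the reported number of free extents.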
> > Now looking at the verbose output, we can see that there are many extents of just 3 or 4 blocks:
> >
> > [brong@...p14 conf]$ filefrag -v testfile | awk '{print $5}' | sort -n | uniq -c | head
> > 2
> > 1 is
> > 1 length
> > 1 unwritten
> > 6 3
> > 10 4
> > 6 5
> > 5 6
> > 3 7
> > 1 8
>
> But longer extents too, right:
>
> $ filefrag -v testfile | awk '{print $5}' | sort -n | uniq -c | tail
> 1 162
> 1 164
> 1 179
> 1 188
> 1 215
> 1 231
> 1 233
> 1 255
> 1 322
> 1 357
>
> > Yet looking at the next file,
> >
> > [brong@...p14 conf]$ filefrag -v testfile2 | awk '{print $5}' | sort -n | uniq -c | tail
> > 1 173
> > 1 175
> > 1 178
> > 1 184
> > 1 187
> > 1 189
> > 1 194
> > 1 289
> > 1 321
> > 1 330
> >
>
> and presumably shorter extents at the beginning?
Well, that's sorted. Yes, there were shorter extents too.
> So it sounds like both files are a mix of long & short extents.
Definitely.
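
To put rough numbers on that mix for testfile, here is a small sketch that takes the quoted head and tail output at face value. It only covers the lines shown, not the whole file, and it inherits the quirks of the awk '{print $5}' pipeline (header words land in the same column), so treat it as an illustration rather than a full accounting. The variable names are made up.

```python
# (extent length in blocks, number of extents) pairs from the quoted `head`
# output for testfile; the blank entry and the header words are left out.
SHORT_EXTENTS = [(3, 6), (4, 10), (5, 6), (6, 5), (7, 3), (8, 1)]

# The ten longest extent lengths from the quoted `tail` output, one extent each.
LONG_EXTENTS = [162, 164, 179, 188, 215, 231, 233, 255, 322, 357]

short_count = sum(count for _, count in SHORT_EXTENTS)
short_blocks = sum(length * count for length, count in SHORT_EXTENTS)
long_count = len(LONG_EXTENTS)
long_blocks = sum(LONG_EXTENTS)

if __name__ == "__main__":
    print(f"{short_count} short extents (3-8 blocks) cover {short_blocks} blocks")
    print(f"{long_count} longest extents cover {long_blocks} blocks")
```

On those figures, the 31 shortest extents listed cover 147 blocks in total, while the ten longest cover 2306.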
> > There are multiple extents of hundreds of blocks in length. Why weren't they used in allocating the first file?
>
> I'm not sure, offhand. But just to be clear, while contiguous allocations are usually a nice side-effect of fallocate, nothing at all guarantees it. It only guarantees that you'll have that space available for future writes.
Sure. I was hoping it would help though!
> Still, it'd be interesting to figure out why the allocator is behaving this way.
> It'd be interesting to see the freefrag info, the allocator might really be in scavenger mode.
What do you make of the output above? Does it look reasonable? I'll check a more recently set-up machine.
[brong@...p30 ~]$ e2freefrag /dev/sdf1
Device: /dev/sdf1
Blocksize: 1024 bytes
Total blocks: 97124320
Free blocks: 68429391 (70.5%)
Min. free extent: 1 KB
Max. free extent: 1009 KB
Avg. free extent: 25 KB
Num. free extent: 2781696
HISTOGRAM OF FREE EXTENT SIZES:
Extent Size Range :  Free extents   Free Blocks  Percent
    1K...    2K-  :        705257        705257    1.03%
    2K...    4K-  :        553577       1348712    1.97%
    4K...    8K-  :        349406       1789755    2.62%
    8K...   16K-  :        289102       3185026    4.65%
   16K...   32K-  :        279061       6307452    9.22%
   32K...   64K-  :        271631      12321046   18.01%
   64K...  128K-  :        205191      18340308   26.80%
  128K...  256K-  :        110082      19121199   27.94%
  256K...  512K-  :         16962       5584384    8.16%
  512K... 1024K-  :          1427        882388    1.29%
This one is on 100 GB SSDs from some other vendor (can't remember which) on hardware RAID1. It's never been more than about 30% full, yet the histogram of free extent sizes looks similar. Again it has a 1 KB block size (piles of small files on these filesystems).
[brong@...p30 ~]$ dumpe2fs -h /dev/sdf1
dumpe2fs 1.42.4 (12-Jun-2012)
Filesystem volume name: ssd30
Last mounted on: /mnt/ssd30
Filesystem UUID: c2623b6a-b3f4-4a5a-99e3-495f29112ba6
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 12140544
Block count: 97124320
Reserved block count: 4856216
Free blocks: 68429391
Free inodes: 7157347
First block: 1
Block size: 1024
Fragment size: 1024
Reserved GDT blocks: 256
Blocks per group: 8192
Fragments per group: 8192
Inodes per group: 1024
Inode blocks per group: 256
Flex block group size: 16
Filesystem created: Tue Aug 2 07:39:40 2011
Last mount time: Thu Jan 24 23:15:41 2013
Last write time: Thu Jan 24 23:15:41 2013
Mount count: 10
Maximum mount count: 39
Last checked: Tue Aug 2 07:39:40 2011
Check interval: 15552000 (6 months)
Next check after: Sun Jan 29 06:39:40 2012
Lifetime writes: 13 TB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 0ecbfe75-57e3-4d4e-b4a8-bf0114dc0997
Journal backup: inode blocks
Journal features: journal_incompat_revoke
Journal size: 32M
Journal length: 32768
Journal sequence: 0x32367a0d
Journal start: 1537
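
For reference, the geometry implied by the dumpe2fs values above can be worked out with a little arithmetic. This is a sketch under one assumption: that the group count is the data blocks (block count minus first block) divided by blocks per group, rounded up. The inode count gives an independent check on that. All constants are copied from the output above; the variable names are made up.

```python
# Values copied from the dumpe2fs -h output for /dev/sdf1.
BLOCK_SIZE = 1024          # bytes
BLOCK_COUNT = 97124320
FIRST_BLOCK = 1
BLOCKS_PER_GROUP = 8192
INODES_PER_GROUP = 1024
INODE_SIZE = 256           # bytes
INODE_COUNT = 12140544
FLEX_BG_SIZE = 16          # block groups per flex group

# Assumption: groups = ceil((block count - first block) / blocks per group).
data_blocks = BLOCK_COUNT - FIRST_BLOCK
group_count = -(-data_blocks // BLOCKS_PER_GROUP)  # integer ceiling division

group_size_kb = BLOCKS_PER_GROUP * BLOCK_SIZE // 1024
flex_group_size_mb = FLEX_BG_SIZE * group_size_kb // 1024
inode_table_blocks = INODES_PER_GROUP * INODE_SIZE // BLOCK_SIZE

if __name__ == "__main__":
    print(f"block groups: {group_count}")
    print(f"group size: {group_size_kb} KB")
    print(f"flex group size: {flex_group_size_mb} MB")
    print(f"inode table blocks per group: {inode_table_blocks}")
    print(f"inode count check: {group_count * INODES_PER_GROUP == INODE_COUNT}")
```

With a 1 KB block size each block group spans only 8192 KB, and the largest free extent reported for this device (1009 KB) is well below that.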
Regards,
Bron.
--
Bron Gondwana
brong@...tmail.fm