Message-Id: <20200419164812.909D45204F@d06av21.portsmouth.uk.ibm.com>
Date:   Sun, 19 Apr 2020 22:18:11 +0530
From:   Ritesh Harjani <riteshh@...ux.ibm.com>
To:     Eric Sandeen <sandeen@...deen.net>,
        "linux-ext4@...r.kernel.org" <linux-ext4@...r.kernel.org>
Cc:     Jan Kara <jack@...e.cz>, "Theodore Ts'o" <tytso@....edu>,
        Andreas Dilger <adilger.kernel@...ger.ca>
Subject: Re: strange allocator behavior on a 2k block fs, skipping free blocks

Hello All,

On 4/17/20 12:46 AM, Eric Sandeen wrote:
> This got picked up by xfstests generic/018 on a 2k block filesystem when it
> failed to defragment a file into 1 extent as expected.
> 
> For some reason, the allocator is skipping over free blocks when it allocates
> the donor file.  The attached image shows this behavior - if you do:
> 
> # bunzip2 ext4.img.qcow.bz2
> # qemu-img convert -O raw ext4.img.qcow ext4.img
> # mkdir -p mnt
> # mount -o loop ext4.img mnt/
> # fallocate -l 20480 mnt/newfile
> # filefrag -v mnt/newfile
> Filesystem type is: ef53
> File size of mnt/newfile is 20480 (10 blocks of 2048 bytes)
>   ext:     logical_offset:        physical_offset: length:   expected: flags:
>     0:        0..       1:      16962..     16963:      2:             unwritten
>     1:        2..       9:      16968..     16975:      8:      16964: unwritten,eof
> mnt/newfile: 2 extents found
> 
> it allocates 2 extents, even though the blocks in between the extents are free:
> 
> # dumpe2fs test.img | grep -w 16964
> dumpe2fs 1.42.9 (28-Dec-2013)
>    Free blocks: 16964-16967, 16976-17407, 17410-17919, 17922-18431, 18434-18943, 18946-19455, 19457-19967, 19969-32767
> 

My initial investigation, verified with logs, suggests that the following
is happening.
1. The first fallocate request is for a length of 10 blocks. (Please note
that in the fallocate path we don't set EXT4_MB_HINT_TRY_GOAL.)
	-> Since a length of 10 is not a power of 2, we choose allocation
criteria 1 and go to ext4_mb_scan_aligned() with a stripe size of 2.
That function only looks for 2 blocks (since the stripe size is 2
blocks), and those 2 blocks are returned as the allocated blocks from
ext4_map_blocks.
This is where we get blocks (16962, 16963).

2. The fallocate path then requests the remaining length, which is 8.
Since 8 is a power of 2 (2^3), this time we go with criteria 0 and try
the allocation via ext4_mb_simple_scan_group().

In this 2nd iteration, the buddy structures are scanned to find a free
range that fits the request. That's why we see two extents in the above
results.

I guess that if we set the stripe size to 0, we would not see this
problem.
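
To make the two steps above easier to follow, here is a toy model of the
decision as described in this mail. It is not the kernel code. The helper
names pick_scan() and simulate_fallocate() are made up for illustration,
the "length % stripe == 0" check and the fallback to
ext4_mb_complex_scan_group() are my assumptions about the criteria 1 path,
and the model assumes that every scan other than the aligned one satisfies
the whole request in one go.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def pick_scan(length, stripe):
    """Return (criteria, scan function name) for a request of `length` blocks.

    Simplified from the description in this mail; the real allocator has
    more conditions than this.
    """
    if is_power_of_two(length):
        return 0, "ext4_mb_simple_scan_group"
    # Assumption: the aligned scan is used only when a stripe is set and
    # the request length is a multiple of it.
    if stripe and length % stripe == 0:
        return 1, "ext4_mb_scan_aligned"
    # Assumption: otherwise criteria 1 falls back to the complex scan.
    return 1, "ext4_mb_complex_scan_group"


def simulate_fallocate(total, stripe):
    """List the allocation rounds needed to cover `total` blocks.

    Each entry is (requested, criteria, scan function, blocks returned).
    Toy assumption: the aligned scan returns only `stripe` blocks, while
    the other scans return the full request.
    """
    remaining = total
    steps = []
    while remaining > 0:
        criteria, fn = pick_scan(remaining, stripe)
        got = stripe if fn == "ext4_mb_scan_aligned" else remaining
        steps.append((remaining, criteria, fn, got))
        remaining -= got
    return steps


if __name__ == "__main__":
    for step in simulate_fallocate(10, 2):
        print(step)
```

With total=10 and stripe=2 the model yields two rounds (2 blocks, then 8),
matching the two extents in the filefrag output. With stripe=0 it yields a
single round, but only because of the toy assumption stated above.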

> I suppose this isn't critical, as defrag is best-effort and the allocator doesn't ever guarantee contiguous allocations, but it still seems a little odd so just thought I'd highlight it.

But others can tell whether this is really a problem that needs fixing
in the long run.

-ritesh
