[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210328192503.7a5iloagbtbymvx6@riteshh-domain>
Date: Mon, 29 Mar 2021 00:55:03 +0530
From: Ritesh Harjani <ritesh.list@...il.com>
To: Harshad Shirwadkar <harshadshirwadkar@...il.com>
Cc: linux-ext4@...r.kernel.org
Subject: Re: Block allocator improvements
On 21/03/24 04:19PM, Harshad Shirwadkar wrote:
> This patch series improves cr 0 and cr 1 passes of the allocator
> signficantly. Currently, at cr 0 and 1, we perform linear lookups to
> find the matching groups. That's very inefficient for large file
> systems where there are millions of block groups. At cr 0, we only
> care about the groups that have the largest free order >= the
> request's order and at cr 1 we only care about groups where average
> fragment size > the request size. so, this patchset introduces new
> data structures that allow us to perform cr 0 lookup in constant time
> and cr 1 lookup in log (number of groups) time instead of linear.
>
> For cr 0, we add a list for each order and all the groups are enqueued
> to the appropriate list based on the largest free order in its buddy
> bitmap. This allows us to lookup a match at cr 0 in constant time.
>
> For cr 1, we add a new rb tree of groups sorted by largest fragment
> size. This allows us to lookup a match for cr 1 in log (num groups)
> time.
>
> These optimizations can be enabled by passing "mb_optimize_scan" mount
> option.
>
> These changes may result in allocations to be spread across the block
> device. While that would not matter some block devices (such as flash)
> it may be a cause of concern for other block devices that benefit from
> storing related content togetther such as disk. However, it can be
> argued that in high fragmentation scenrio, especially for large disks,
> it's still worth optimizing the scanning since in such cases, we get
> cpu bound on group scanning instead of getting IO bound. Perhaps, in
> future, we could dynamically turn this new optimization on based on
> fragmentation levels for such devices.
>
> Verified that there are no regressions in smoke tests (-g quick -c 4k).
>
> Also, to demonstrate the effectiveness for the patch series, following
> experiment was performed:
>
> Created a highly fragmented disk of size 65TB. The disk had no
> contiguous 2M regions. Following command was run consecutively for 3
> times:
>
> time dd if=/dev/urandom of=file bs=2M count=10
>
> Here are the results with and without cr 0/1 optimizations:
>
> |---------+------------------------------+---------------------------|
> | | Without CR 0/1 Optimizations | With CR 0/1 Optimizations |
> |---------+------------------------------+---------------------------|
> | 1st run | 5m1.871s | 2m47.642s |
> | 2nd run | 2m28.390s | 0m0.611s |
> | 3rd run | 2m26.530s | 0m1.255s |
> |---------+------------------------------+---------------------------|
>
> The patch [3/6] "ext4: add mballoc stats proc file" is a modified
> version of the patch originally written by Artem Blagodarenko
> (artem.blagodarenko@...il.com). With that patch, I ran following
> command with and without optimizations.
>
> dd if=/dev/zero of=/mnt/file bs=2M count=2 conv=fsync
>
> Without optimizations:
>
> useless_c0_loops: 3
> useless_c1_loops: 39
> useless_c2_loops: 0
> useless_c3_loops: 0
>
> With optimizations:
>
> useless_c0_loops: 0
> useless_c1_loops: 0
> useless_c2_loops: 0
> useless_c3_loops: 0
>
> This shows that CR0 and CR1 optimizations get rid of useless CR0 and
> CR1 loops altogether thereby significantly reducing the number of
> groups that get considered.
>
> Changes from V4:
> ----------------
> - Only minor fixes, no significant changes
>
> Harshad Shirwadkar (6):
> ext4: drop s_mb_bal_lock and convert protected fields to atomic
> ext4: add ability to return parsed options from parse_options
> ext4: add mballoc stats proc file
> ext4: add MB_NUM_ORDERS macro
> ext4: improve cr 0 / cr 1 group scanning
> ext4: add proc files to monitor new structures
>
> fs/ext4/ext4.h | 30 ++-
> fs/ext4/mballoc.c | 572 +++++++++++++++++++++++++++++++++++++++++++---
> fs/ext4/mballoc.h | 22 +-
> fs/ext4/super.c | 79 +++++--
> fs/ext4/sysfs.c | 6 +
> 5 files changed, 652 insertions(+), 57 deletions(-)
>
Completed my review of this patch series.
Apart from the issue I mentioned in patch-5 of this v5 series. The rest
of the patches looks fine to me.
Please feel free to add:
Reviewed-by: Ritesh Harjani <ritesh.list@...il.com>
Powered by blists - more mailing lists