Message-ID: <98991e9d-cce0-48ad-b77c-b7d3eff71dca@suse.com>
Date: Mon, 16 Sep 2024 20:09:01 +0930
From: Qu Wenruo <wqu@...e.com>
To: Luca Stefani <luca.stefani.ge1@...il.com>
Cc: Chris Mason <clm@...com>, Josef Bacik <josef@...icpanda.com>,
David Sterba <dsterba@...e.com>, linux-btrfs@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 2/3] btrfs: Split remaining space to discard in chunks
On 2024/9/16 19:46, Luca Stefani wrote:
> Per Qu Wenruo: on a very large, mostly empty disk (e.g. an 8TiB device),
> we split the discard according to our super block locations, but the
> last super block ends at 256G, so we can still submit one huge discard
> for the range [256G, 8T), causing a very long delay.
>
> We now split the space left to discard based on BTRFS_MAX_DATA_CHUNK_SIZE
> in preparation for introducing cancellation signal handling.
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
> Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
> Signed-off-by: Luca Stefani <luca.stefani.ge1@...il.com>
> ---
> fs/btrfs/extent-tree.c | 24 +++++++++++++++++++-----
> 1 file changed, 19 insertions(+), 5 deletions(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index a5966324607d..cbe66d0acff8 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -1239,7 +1239,7 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
>  			       u64 *discarded_bytes)
>  {
>  	int j, ret = 0;
> -	u64 bytes_left, end;
> +	u64 bytes_left, bytes_to_discard, end;
>  	u64 aligned_start = ALIGN(start, 1 << SECTOR_SHIFT);
>
>  	/* Adjust the range to be aligned to 512B sectors if necessary. */
> @@ -1300,13 +1300,27 @@ static int btrfs_issue_discard(struct block_device *bdev, u64 start, u64 len,
>  		bytes_left = end - start;
>  	}
>
> -	if (bytes_left) {
> +	while (bytes_left) {
> +		if (bytes_left > BTRFS_MAX_DATA_CHUNK_SIZE)
> +			bytes_to_discard = BTRFS_MAX_DATA_CHUNK_SIZE;

That MAX_DATA_CHUNK_SIZE is only reachable for RAID0/RAID10/RAID5/RAID6,
where the device extents span multiple devices.

For each device, the maximum size is limited to 1G (check
init_alloc_chunk_ctl_policy_regular()).

So you can just limit it to 1G instead.
(If you want, you can also extract that into a macro as a cleanup.)

Furthermore, you can use min() instead of an if ().

So you only need:

	bytes_to_discard = min(SZ_1G, bytes_left);

Otherwise this looks good enough to me.

If the 1G size is not good enough, we can tune it to a smaller value later.
Personally, I think 1G would be enough.

Thanks,
Qu

> +		else
> +			bytes_to_discard = bytes_left;
> +
>  		ret = blkdev_issue_discard(bdev, start >> SECTOR_SHIFT,
> -					   bytes_left >> SECTOR_SHIFT,
> +					   bytes_to_discard >> SECTOR_SHIFT,
>  					   GFP_NOFS);
> -		if (!ret)
> -			*discarded_bytes += bytes_left;
> +
> +		if (ret) {
> +			if (ret != -EOPNOTSUPP)
> +				break;
> +			continue;
> +		}
> +
> +		start += bytes_to_discard;
> +		bytes_left -= bytes_to_discard;
> +		*discarded_bytes += bytes_to_discard;
>  	}
> +
>  	return ret;
>  }
>
>
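
For illustration, the min()-based split suggested in the review can be
tried in userspace. This is only a sketch under stated assumptions, not
the btrfs code: fake_issue_discard() is a hypothetical stand-in for
blkdev_issue_discard() that merely counts calls, min_u64() replaces the
kernel's min() macro, SZ_1G and SECTOR_SHIFT are redefined locally, and
the loop stops at the first error instead of treating -EOPNOTSUPP
specially as the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local redefinitions for userspace; the kernel provides its own. */
#define SZ_1G		(1ULL << 30)
#define SECTOR_SHIFT	9

/* Number of discard requests issued so far (illustration only). */
static unsigned long discard_calls;

/*
 * Hypothetical stand-in for blkdev_issue_discard(): it only counts
 * calls and always succeeds.
 */
static int fake_issue_discard(uint64_t sector, uint64_t nr_sects)
{
	(void)sector;
	(void)nr_sects;
	discard_calls++;
	return 0;
}

/* Userspace replacement for the kernel's min() macro. */
static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Discard [start, start + bytes_left) in pieces of at most 1G.
 * Simplification: any error ends the loop.
 */
static int discard_in_chunks(uint64_t start, uint64_t bytes_left,
			     uint64_t *discarded_bytes)
{
	int ret = 0;

	assert(discarded_bytes != NULL);

	while (bytes_left) {
		uint64_t bytes_to_discard = min_u64(SZ_1G, bytes_left);

		ret = fake_issue_discard(start >> SECTOR_SHIFT,
					 bytes_to_discard >> SECTOR_SHIFT);
		if (ret)
			break;

		start += bytes_to_discard;
		bytes_left -= bytes_to_discard;
		*discarded_bytes += bytes_to_discard;
	}

	return ret;
}
```

With the [256G, 8T) range from the commit message, calling
discard_in_chunks(256 * SZ_1G, (8192 - 256) * SZ_1G, &discarded) issues
7936 requests of 1G each rather than one request covering the whole
range, which is what leaves room for a cancellation check between
requests.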