Message-ID: <ZRWQcRtoneJD06UP@li-bb2b2a4c-3307-11b2-a85c-8fa5c3a69313.ibm.com>
Date: Thu, 28 Sep 2023 20:10:49 +0530
From: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
To: Bobi Jam <bobijam@...mail.com>
Cc: linux-ext4@...r.kernel.org, Andreas Dilger <adilger@...ger.ca>
Subject: Re: [PATCH v3] ext4: optimize metadata allocation for hybrid LUNs
On Tue, Sep 12, 2023 at 02:59:24PM +0800, Bobi Jam wrote:
> With LVM it is possible to create an LV with SSD storage at the
> beginning of the LV and HDD storage at the end of the LV, and use that
> to separate ext4 metadata allocations (that need small random IOs)
> from data allocations (that are better suited for large sequential
> IOs) depending on the type of underlying storage. Between 0.5-1.0% of
> the filesystem capacity would need to be high-IOPS storage in order to
> hold all of the internal metadata.
>
> This would improve performance for inode and other metadata access,
> such as ls, find, e2fsck, and in general improve file access latency,
> modification, truncate, unlink, transaction commit, etc.
>
> This patch splits the largest free order group lists and the average
> fragment size lists into two additional lists for IOPS/fast storage
> groups, and performs cr 0 / cr 1 group scanning for metadata block
> allocation in the following order:
>
> if (allocate metadata blocks)
>     if (cr == 0)
>         try to find group in largest free order IOPS group list
>     if (cr == 1)
>         try to find group in fragment size IOPS group list
>     if (above two find failed)
>         fall through normal group lists as before
> if (allocate data blocks)
>     try to find group in normal group lists as before
>     if (failed to find group in normal group && mb_enable_iops_data)
>         try to find group in IOPS groups
>
> Non-metadata block allocation does not allocate from the IOPS groups
> if non-IOPS groups are not used up.
>
> Add an option to mke2fs to mark which blocks are in the IOPS region
> of storage at format time:
>
> -E iops=0-1024G,4096-8192G
>
> so the ext4 mballoc code can then use the EXT4_BG_IOPS flag in the
> group descriptors to decide in which groups to allocate dynamic
> filesystem metadata.
>
> Signed-off-by: Bobi Jam <bobijam@...mail.com>
>
> --
> v2->v3: add sysfs mb_enable_iops_data to enable data block allocation
> from IOPS groups.
> v1->v2: for metadata block allocation, search in IOPS list then normal
> list.
> ---
Hi Bobi, Andreas,

I took a look at this patch, and the idea is definitely interesting!
I'll send my inline review comments in a separate mail; here are just
some high-level observations:
1. Since most of the time a metadata allocation requests only 1 block,
we will actually end up skipping CR_POWER2_ALIGNED (aka CR0), since it
only works for len >= 2. But I think that's okay, because some metadata
allocations like xattrs might still benefit from it.

2. We always try the goal group first in ext4_mb_find_by_goal() before
going through the mballoc criteria, and I don't think there is any logic
to stop that in case the goal group is non-IOPS and metadata is being
allocated. So I think we are relying on the goal-finding logic to give
us IOPS blocks as the goal for metadata, but does it do that currently?
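
To make the two points above concrete, here is a rough, standalone C
sketch. It is not the kernel code: every name in it (sketch_cr,
sketch_group, sketch_start_cr(), sketch_goal_ok()) is made up for
illustration, the len >= 2 condition is a simplification of how CR0 is
selected, and the guard in sketch_goal_ok() is only one possible shape
of the check that point 2 asks about, not something the patch is known
to contain.

```c
#include <stdbool.h>

/* Illustrative model only; none of these names exist in ext4. */
enum sketch_cr {
	SKETCH_CR0 = 0,	/* stands in for CR_POWER2_ALIGNED */
	SKETCH_CR1 = 1,	/* the next criterion after CR0 */
};

struct sketch_group {
	bool is_iops;	/* stands in for the EXT4_BG_IOPS group flag */
};

/*
 * Point 1: CR0 is only attempted for len >= 2, so a typical
 * single-block metadata request starts from the next criterion.
 * (Assumption: only the len >= 2 condition from the mail is modeled.)
 */
static enum sketch_cr sketch_start_cr(unsigned int len)
{
	return len >= 2 ? SKETCH_CR0 : SKETCH_CR1;
}

/*
 * Point 2: a hypothetical guard that would skip the goal group when
 * metadata is being allocated and the goal group is not an IOPS group.
 * Data allocations are left untouched in this sketch.
 */
static bool sketch_goal_ok(const struct sketch_group *goal, bool is_metadata)
{
	if (is_metadata && !goal->is_iops)
		return false;
	return true;
}
```

Under these assumptions, a 1-block metadata request would start at the
second criterion, and a non-IOPS goal group would be rejected for
metadata but still accepted for data.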
Thanks!
ojaswin