Date:   Thu, 28 Sep 2023 20:10:49 +0530
From:   Ojaswin Mujoo <ojaswin@...ux.ibm.com>
To:     Bobi Jam <bobijam@...mail.com>
Cc:     linux-ext4@...r.kernel.org, Andreas Dilger <adilger@...ger.ca>
Subject: Re: [PATCH v3] ext4: optimize metadata allocation for hybrid LUNs

On Tue, Sep 12, 2023 at 02:59:24PM +0800, Bobi Jam wrote:
> With LVM it is possible to create an LV with SSD storage at the
> beginning of the LV and HDD storage at the end of the LV, and use that
> to separate ext4 metadata allocations (that need small random IOs)
> from data allocations (that are better suited for large sequential
> IOs) depending on the type of underlying storage.  Between 0.5-1.0% of
> the filesystem capacity would need to be high-IOPS storage in order to
> hold all of the internal metadata.
> 
> This would improve performance for inode and other metadata access,
> such as ls, find, e2fsck, and in general improve file access latency,
> modification, truncate, unlink, transaction commit, etc.
> 
> This patch splits the largest free order group lists and average
> fragment size lists into two additional lists for IOPS/fast storage
> groups, and does cr 0 / cr 1 group scanning for metadata block
> allocation in the following order:
> 
> if (allocate metadata blocks)
>       if (cr == 0)
>               try to find group in largest free order IOPS group list
>       if (cr == 1)
>               try to find group in fragment size IOPS group list
>       if (above two find failed)
>               fall through normal group lists as before
> if (allocate data blocks)
>       try to find group in normal group lists as before
>       if (failed to find group in normal group && mb_enable_iops_data)
>               try to find group in IOPS groups
> 
> Non-metadata block allocation does not allocate from the IOPS groups
> if non-IOPS groups are not used up.
> 
> Add for mke2fs an option to mark which blocks are in the IOPS region
> of storage at format time:
> 
>   -E iops=0-1024G,4096-8192G
> 
> so the ext4 mballoc code can then use the EXT4_BG_IOPS flag in the
> group descriptors to decide which groups to allocate dynamic
> filesystem metadata from.
> 
> Signed-off-by: Bobi Jam <bobijam@...mail.com>
> 
> --
> v2->v3: add sysfs mb_enable_iops_data to enable data block allocation
>         from IOPS groups.
> v1->v2: for metadata block allocation, search in IOPS list then normal
>         list.
> ---

Hi Bobi, Andreas,

So I took a look at this patch and the idea is definitely interesting!
I'll add my review comments inline in a separate mail, but just adding
some high level observations in this mail:

1. Since most of the time our metadata allocation requests only
   1 block, we will actually end up skipping CR_POWER2_ALIGNED (aka CR0),
   since it only works for len >= 2. But I think that's okay because some
   metadata allocations like xattrs might still benefit from it.

2. We always try the goal group first in ext4_mb_find_by_goal() before
   going through the mballoc criteria, and I don't think there is any logic
   to stop that in case the goal group is non-IOPS and metadata is being
   allocated. So I think we are relying on the goal-finding logic to give
   us IOPS blocks as the goal for metadata, but does it do that currently?
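
To make the two observations above concrete, here is a rough userspace
sketch. It is not the actual mballoc code: every toy_* name, the
skip_non_iops_goal knob, and the simplified "enough free blocks" check are
hypothetical and exist only for illustration. Only EXT4_BG_IOPS,
mb_enable_iops_data, CR_POWER2_ALIGNED and ext4_mb_find_by_goal() come from
the patch and the existing code, and the len >= 2 rule is modelled just as
described in point 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; none of these are the kernel's real types or names. */
enum toy_cr {
	TOY_CR_POWER2_ALIGNED,	/* models CR0 */
	TOY_CR_NEXT		/* whichever criterion follows CR0 */
};

struct toy_group {
	bool iops;		/* models the EXT4_BG_IOPS flag from the patch */
	unsigned int free;	/* free blocks left in the group */
};

struct toy_request {
	bool metadata;		/* metadata vs. data allocation */
	unsigned int len;	/* blocks requested */
	size_t goal;		/* index of the goal group */
};

struct toy_opts {
	bool skip_non_iops_goal;	/* hypothetical knob, not in the patch */
	bool enable_iops_data;		/* models mb_enable_iops_data */
};

/* Observation 1: a 1-block request never starts at CR0. */
static enum toy_cr toy_start_cr(const struct toy_request *req)
{
	return req->len >= 2 ? TOY_CR_POWER2_ALIGNED : TOY_CR_NEXT;
}

/* First group of the wanted kind with enough free blocks, or -1. */
static int toy_scan(const struct toy_group *groups, size_t n,
		    const struct toy_request *req, bool want_iops)
{
	for (size_t i = 0; i < n; i++)
		if (groups[i].iops == want_iops && groups[i].free >= req->len)
			return (int)i;
	return -1;
}

/* Observation 2: the goal group is tried before any list is scanned. */
static int toy_pick_group(const struct toy_group *groups, size_t n,
			  const struct toy_request *req,
			  const struct toy_opts *opts)
{
	const struct toy_group *goal;
	int g;

	assert(req->len > 0 && req->goal < n);
	goal = &groups[req->goal];

	/* Goal wins if it fits, unless the hypothetical knob rejects it. */
	if (goal->free >= req->len &&
	    !(req->metadata && opts->skip_non_iops_goal && !goal->iops))
		return (int)req->goal;

	if (req->metadata) {
		g = toy_scan(groups, n, req, true);	/* IOPS lists first */
		return g >= 0 ? g : toy_scan(groups, n, req, false);
	}

	g = toy_scan(groups, n, req, false);		/* normal lists first */
	if (g < 0 && opts->enable_iops_data)
		g = toy_scan(groups, n, req, true);
	return g;
}
```

In this model, with a non-IOPS group 0 and an IOPS group 1 that both have
free space, a 1-block metadata request whose goal is group 0 is satisfied
from group 0 and never reaches the IOPS scan. It only lands in group 1 if
something like the hypothetical skip_non_iops_goal check rejects the goal
first, which is the gap point 2 is asking about.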

Thanks!
ojaswin

