Message-ID: <cover.1742800203.git.ojaswin@linux.ibm.com>
Date: Mon, 24 Mar 2025 13:06:58 +0530
From: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
To: linux-ext4@...r.kernel.org, "Theodore Ts'o" <tytso@....edu>
Cc: John Garry <john.g.garry@...cle.com>, dchinner@...hat.com,
        "Darrick J . Wong" <djwong@...nel.org>,
        Ritesh Harjani <ritesh.list@...il.com>, linux-kernel@...r.kernel.org
Subject: [RFC v3 00/11] ext4: Add extsize and forcealign support (groundwork for multi block atomic writes)

These patches lay the groundwork for supporting multi-block
HW-accelerated atomic writes without the use of bigalloc. Multi-block
atomic write support with bigalloc is already posted as an RFC here [3].
Without bigalloc, we need a mechanism to get aligned blocks from the
allocator so that HW accelerated atomic writes can be performed. extsize
+ forcealign provide this mechanism in ext4.

[3] https://lore.kernel.org/linux-ext4/cover.1742699765.git.ritesh.list@gmail.com/

- extsize is a per-inode hint to physically and logically align blocks
  to a certain value.

- forcealign gives a **strict guarantee** that the allocator will
  physically as well as logically align blocks to the extsize value.
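
To make "physically as well as logically aligned" concrete, here is a
minimal userspace sketch of the property. It is not code from the
patches: struct extent and extent_is_aligned() are hypothetical names,
the unit is assumed to be filesystem blocks, and requiring the length to
be a whole number of extsize units is my assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one allocated extent, in filesystem blocks. */
struct extent {
	uint64_t lblk;	/* first logical block within the file */
	uint64_t pblk;	/* first physical block on disk */
	uint64_t len;	/* length in blocks */
};

/*
 * Return true if the extent starts on an extsize boundary both logically
 * and physically. The length check (whole extsize units) is an assumption
 * of this sketch, not something the cover letter spells out.
 */
static bool extent_is_aligned(const struct extent *e, uint64_t extsize_blks)
{
	if (extsize_blks == 0 || e->len == 0)
		return false;

	return e->lblk % extsize_blks == 0 &&
	       e->pblk % extsize_blks == 0 &&
	       e->len % extsize_blks == 0;
}
```

With extsize as a hint, an extent failing this check is acceptable; with
forcealign, the allocator would have to return an error instead.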

The extsize support is almost the same as v2, rebased onto the latest
ext4 dev branch. Patches 7 - 11 add the new forcealign feature, which can
be seen as a sort of per-file bigalloc. Some points about forcealign:

 * Allocation on a forcealign inode is guaranteed to get an extent
   aligned to extsize physically and logically, else an error is
   returned. This mimics bigalloc but on a per-file level.

 * Deallocations are also only allowed in extsize aligned units. This is
   pretty strict and can be relaxed in later revisions.

 * FS_XFLAG_FORCEALIGN can be set via the FS_IOC_FSGETXATTR /
   FS_IOC_FSSETXATTR ioctls to enable forcealign. As of now, we can only
   enable forcealign if extsize is set on the inode.

 * Reused the EXT4_EOFBLOCKS_FL flag for forcealign since it is no longer
   used. In case this is not feasible, we can explore other ways to set
   the flag (e.g. an xattr or overriding a field).
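
For reviewers who want to try this from userspace, a rough sketch of the
ioctl sequence follows. Assumptions: FS_XFLAG_FORCEALIGN comes from the
patched include/uapi/linux/fs.h, so the fallback value below is only a
placeholder to let the sketch compile against unpatched headers; the
extsize is taken to be in bytes, as with fsx_extsize elsewhere; and
setting the extsize hint and forcealign in a single call is a guess,
since the text above only says extsize must be set. The helper names are
hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

/*
 * ASSUMPTION: placeholder value so this compiles on unpatched headers.
 * Take the real definition from the patched include/uapi/linux/fs.h.
 */
#ifndef FS_XFLAG_FORCEALIGN
#define FS_XFLAG_FORCEALIGN 0x00020000
#endif

/* Return a copy of fsx that requests an extsize hint plus forcealign. */
static struct fsxattr fsx_with_forcealign(struct fsxattr fsx,
					  uint32_t extsize_bytes)
{
	fsx.fsx_extsize = extsize_bytes;
	fsx.fsx_xflags |= FS_XFLAG_EXTSIZE | FS_XFLAG_FORCEALIGN;
	return fsx;
}

/*
 * Read-modify-write the inode's fsxattr so that unrelated flags are
 * preserved. Returns 0 on success or a negative errno value.
 */
static int set_forcealign(int fd, uint32_t extsize_bytes)
{
	struct fsxattr fsx;

	if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0)
		return -errno;

	fsx = fsx_with_forcealign(fsx, extsize_bytes);

	if (ioctl(fd, FS_IOC_FSSETXATTR, &fsx) < 0)
		return -errno;

	return 0;
}
```

For example, set_forcealign(fd, 16384) on a freshly created file would
request 16 KiB alignment. If the kernel wants extsize set before
forcealign, split this into two FS_IOC_FSSETXATTR calls.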

Some of the TODOs and open questions regarding the design:

1. I want to design forcealign in such a way that FS formatting is not
  required. For that I'm exploring 2 options:

  - Add an RO_COMPAT feature flag. tune2fs can be used to enable it on
    existing filesystems without formatting. Simplest but this has a
    drawback that even for a single forcealign file, the FS would become
    RO on older kernels

  - To avoid that, we can instead expose an ioctl to fix a misaligned
    forcealign file. However this is an overhead for sys admins/end
    users. Maybe fsck can help with this?

2. For extsize, I'm not planning to support FS-wide tunable since we
   already have bigalloc for that.

3. Also, we are not supporting non-power-of-2 extsizes (at least for
   now) as there are no clear use cases to justify the added complexity.

4. Directory-wide extsize is not yet supported; however, it can be added
   in a future revision.
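
As an illustration of point 3 (my own sketch, not taken from the
patches): with power-of-2 extsizes, validity checks and rounding reduce
to bit masks, whereas arbitrary sizes would need division and remainder
handling throughout. The function names here are hypothetical, and the
requirement that extsize be a multiple of the block size is an
assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if x is a nonzero power of two. */
static bool is_pow2(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Hypothetical validity check for an extsize hint given in bytes:
 * a power of two that is at least one filesystem block. Because both
 * values are powers of two, this also makes it a block size multiple.
 */
static bool extsize_valid(uint64_t extsize_bytes, uint64_t blocksize)
{
	return is_pow2(blocksize) && is_pow2(extsize_bytes) &&
	       extsize_bytes >= blocksize;
}

/* Round x down to a multiple of align; align must be a power of two. */
static uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

/* Round x up to a multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}
```

For instance, with a 16 KiB extsize, a write covering bytes 5000..20000
would map to the aligned range round_down_pow2(5000, 16384) up to
round_up_pow2(20000, 16384).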

We are passing the xfstests quick group with these patches, along with a
lot of custom allocation scenarios that I'll eventually add to xfstests.
However, this series is still largely an RFC and might have bugs.

Posting this here for review and suggestions on the design as well as
implementation. 


** Changes since rfc v2 [2] **

 - Patches 1-6 are the same as v2, just rebased. Patches 7-11 are new in
   this series.
 - Patch 7 adds a wrapper on ext4_map_blocks to better handle some
   allocation scenarios.
 - Patches 8-11 add a new feature called forcealign, described above.

[2] https://lore.kernel.org/linux-ext4/cover.1733901374.git.ojaswin@linux.ibm.com/

** Changes since rfc v1 [1] **

1. Allocations beyond EOF also respect the extsize hint. However,
   unlike XFS, we don't trim the blocks allocated beyond EOF due
   to extsize hints. The reasoning behind this is explained in
   patch 6/6.

2. Minor fixes in extsize ioctl handling logic.

The rest of the design details can be found in the individual patches as
well as the original cover letter, which can be found here:

[1]
https://lore.kernel.org/linux-ext4/cover.1726034272.git.ojaswin@linux.ibm.com/

Comments and suggestions are welcome!

Regards,
ojaswin

Ojaswin Mujoo (11):
  ext4: add aligned allocation hint in mballoc
  ext4: allow inode preallocation for aligned alloc
  ext4: support for extsize hint using FS_IOC_FS(GET/SET)XATTR
  ext4: pass lblk and len explicitly to ext4_split_extent*()
  ext4: add extsize hint support
  ext4: make extsize work with EOF allocations
  ext4: add ext4_map_blocks_extsize() wrapper to handle overwrites
  ext4: add forcealign support of mballoc
  ext4: add forcealign support to ext4_map_blocks
  ext4: add support for adding focealign via SETXATTR ioctl
  ext4: disallow unaligned deallocations on forcealign inodes

 fs/ext4/ext4.h              |  20 +-
 fs/ext4/ext4_jbd2.h         |  23 ++
 fs/ext4/extents.c           | 294 ++++++++++++++++---
 fs/ext4/inode.c             | 543 +++++++++++++++++++++++++++++++++---
 fs/ext4/ioctl.c             | 191 +++++++++++++
 fs/ext4/mballoc.c           | 141 ++++++++--
 fs/ext4/super.c             |   1 +
 include/trace/events/ext4.h |   3 +
 include/uapi/linux/fs.h     |   6 +-
 9 files changed, 1111 insertions(+), 111 deletions(-)

-- 
2.48.1

