linux-kernel - Re: [PATCH v4 00/14] forcealign for xfs

Message-ID: <Zun+yci6CeiuNS2o@dread.disaster.area>
Date: Wed, 18 Sep 2024 08:12:25 +1000
From: Dave Chinner <david@...morbit.com>
To: John Garry <john.g.garry@...cle.com>
Cc: Ritesh Harjani <ritesh.list@...il.com>, chandan.babu@...cle.com,
	djwong@...nel.org, dchinner@...hat.com, hch@....de,
	viro@...iv.linux.org.uk, brauner@...nel.org, jack@...e.cz,
	linux-xfs@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-fsdevel@...r.kernel.org, catherine.hoang@...cle.com,
	martin.petersen@...cle.com
Subject: Re: [PATCH v4 00/14] forcealign for xfs

On Mon, Sep 16, 2024 at 11:24:56AM +0100, John Garry wrote:
> On 16/09/2024 08:03, Dave Chinner wrote:
> > OTOH, we can't do this with atomic writes. Atomic writes require
> > some mkfs help because they require explicit physical alignment of
> > the filesystem to the underlying storage.
> 
> If we are enabling atomic writes at mkfs time, then we can ensure agsize %
> extsize == 0. That provides the physical alignment guarantee. It also makes
> sense to ensure extsize is a power-of-2.

No, mkfs does not want to align anything to "extsize". It needs to
align the filesystem geometry to be compatible with the underlying
block device atomic write alignment parameters.

We just don't care if agsize is not an exact multiple of extsize.
As long as extsize is aligned to the atomic write boundaries and the
start of the AG is aligned to atomic write boundaries, we can
allocate hardware aligned extsize sized extents from the AG.

AGs are always going to contain lots of non-aligned, randomly sized
extents for other stuff like metadata and unaligned file data.
Aligned allocation is all about finding extsized aligned free space
within the AG and has nothing to do with the size of the AG itself.
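
A rough Python sketch of the point above, not code from the patch set. The
function names, the block-based units and the example geometry are all
assumptions made for illustration; `awu` stands in for the device's atomic
write boundary expressed in filesystem blocks.

```python
def aligned_extent_starts(agsize, extsize):
    """AG-relative start blocks of every extsize-aligned slot that fits in the AG."""
    return list(range(0, agsize - extsize + 1, extsize))


def is_hw_aligned(ag_start, agbno, awu):
    """True if the extent's absolute start block sits on an atomic write boundary."""
    return (ag_start + agbno) % awu == 0


# Hypothetical geometry: agsize is NOT a multiple of extsize, but both the AG
# start and extsize are multiples of the atomic write boundary.
awu, extsize, agsize = 4, 16, 1000
ag_start = 2 * agsize  # start of AG 2, assuming AGs are laid out back to back
starts = aligned_extent_starts(agsize, extsize)
```

With these made-up numbers every slot in `starts` passes `is_hw_aligned()`
even though agsize % extsize != 0; only the tail of the AG that is shorter
than extsize cannot hold an aligned extent.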

> However, extsize is re-configurable per inode. So, for an inode enabled for
> atomic writes, we must still ensure agsize % new extsize == 0 (and also new
> extsize is a power-of-2)

Ensuring that the extsize is aligned to the hardware atomic write
limits is a kernel runtime check when enabling atomic writes on an
inode.

In this case, we do not care what the AG size is - it is completely
irrelevant to these per-inode runtime checks because mkfs has
already guaranteed that the AG is correctly aligned to the
underlying hardware. That means if extsize is also aligned to the
underlying hardware, physical extent layout is guaranteed to be
compatible with the hardware constraints for atomic writes...
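
To make the shape of that runtime check concrete, here is a minimal sketch
in Python. It is not the kernel's implementation: the function name and the
single `hw_unit` parameter (the hardware atomic write unit, in the same units
as extsize) are assumptions, and a real check would likely involve more than
one hardware limit.

```python
def can_enable_atomic_writes(extsize, hw_unit):
    """Hypothetical per-inode check: extsize must be a non-zero multiple of hw_unit.

    Note there is deliberately no agsize argument; under the assumption that
    mkfs already aligned the AGs, the AG size plays no part here.
    """
    return extsize > 0 and extsize % hw_unit == 0
```

For example, with an assumed hw_unit of 4 blocks, an extsize of 16 blocks
would pass and an extsize of 6 blocks would be rejected.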

> > Hence we'll eventually end
> > up with atomic writes needing to be enabled at mkfs time, but force
> > align will be an upgradeable feature flag.
> 
> Could atomic writes also be an upgradeable feature? We just need to ensure
> that agsize % extsize == 0 for an inode enabled for atomic writes.

To turn the superblock feature bit on, we have to check the AGs are
correctly aligned to the *underlying hardware*. If they aren't
correctly aligned (and there is a good chance they will not be)
then we can't enable atomic writes at all. The only way to change
this is to physically move AGs around in the block device (i.e. via
the xfs_expand tool I proposed).

i.e. the mkfs dependency on having the AGs aligned to the underlying
atomic write capabilities of the block device never goes away, even
if we want to make the feature dynamically enabled.
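
A sketch of what such an upgrade-time check could look like, in Python and
purely for illustration. The function name, the `fs_start` offset and the
assumption that AG n starts at n * agsize blocks are not taken from the
patch set.

```python
def all_ags_aligned(agcount, agsize, hw_unit, fs_start=0):
    """True if every AG starts on a multiple of hw_unit (all values in blocks).

    Assumes AGs are laid out back to back starting at fs_start.
    """
    return all((fs_start + agno * agsize) % hw_unit == 0
               for agno in range(agcount))
```

With an assumed hw_unit of 16 blocks, a filesystem made with agsize = 1024
would pass, while one made with agsize = 1000 would fail at AG 1 and could
not have the feature bit set without moving AGs.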

IOWs, yes, an existing filesystem -could- be upgradeable, but there
is no guarantee that it will be.

Quite frankly, we aren't going to see block devices that filesystems
already exist on suddenly sprout support for atomic writes mid-life.
Hence if mkfs detects atomic write support in the underlying device,
it should *always* modify the geometry to be compatible with atomic
writes and enable atomic write support.

Yes, that means the "incompat with reflink" issue needs to be fixed
before we take atomic writes out of experimental (i.e. we consistently
apply the same "full support" criteria we applied to DAX).

Hence by the time atomic writes are a fully supported feature, we're
going to be able to enable them by default at mkfs time for any
hardware that supports them...

> Valid
> extsize values may be quite limited, though, depending on the value of
> agsize.

No. The only limit agsize puts on extsize is that a single aligned
extent can't be larger than half the AG size. Forced alignment and
atomic writes don't change that.
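
Expressed as a one-line check (an illustrative Python sketch with a
hypothetical function name, both values in filesystem blocks):

```python
def extsize_fits_ag(extsize, agsize):
    """True if a single aligned extent is no larger than half the AG."""
    return extsize > 0 and 2 * extsize <= agsize
```

So for an assumed agsize of 1000 blocks, any extsize up to 500 blocks
satisfies this limit, whether or not it divides agsize evenly.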

-Dave.
-- 
Dave Chinner
david@...morbit.com