linux-kernel - Re: [PATCH v4 00/14] forcealign for xfs

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1394ceeb-ce8c-4d0f-aec8-ba93bf1afb90@oracle.com>
Date: Wed, 18 Sep 2024 11:12:47 +0100
From: John Garry <john.g.garry@...cle.com>
To: Dave Chinner <david@...morbit.com>
Cc: Ritesh Harjani <ritesh.list@...il.com>, chandan.babu@...cle.com,
        djwong@...nel.org, dchinner@...hat.com, hch@....de,
        viro@...iv.linux.org.uk, brauner@...nel.org, jack@...e.cz,
        linux-xfs@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-fsdevel@...r.kernel.org, catherine.hoang@...cle.com,
        martin.petersen@...cle.com
Subject: Re: [PATCH v4 00/14] forcealign for xfs

On 17/09/2024 23:27, Dave Chinner wrote:
>> # xfs_bmap -vvp  mnt/file
>> mnt/file:
>> EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL FLAGS
>>    0: [0..15]:         384..399          0 (384..399)          16 010000
>>    1: [16..31]:        400..415          0 (400..415)          16 000000
>>    2: [32..127]:       416..511          0 (416..511)          96 010000
>>    3: [128..255]:      256..383          0 (256..383)         128 000000
>> FLAG Values:
>>     0010000 Unwritten preallocated extent
>>
>> Here we have unaligned extents wrt extsize.
>>
>> The sub-alloc unit zeroing would solve that - is that what you would still
>> advocate (to solve that issue)?
> Yes, I thought that was already implemented for force-align with the
> DIO code via the extsize zero-around changes in the iomap code. Why
> isn't that zero-around code ensuring the correct extent layout here?

I just have not included the extsize zero-around changes here. They were 
just grouped with the atomic writes support, as they were added 
specifically for the atomic writes support. Indeed - to me at least - it 
is strange that the DIO code changes are required for XFS forcealign 
implementation. And, even if we use extsize zero-around changes for DIO 
path, what about buffered IO?

BTW, I still have concern with this extsize zero-around change which I 
was making:

xfs_iomap_write_unwritten()
{
	unsigned int rounding;

	/* when converting anything unwritten, we must be spanning an 	alloc 
unit, so round up/down */
	if (rounding > 1) {
		offset_fsb = rounddown(rounding);
		count_fsb = roundup(rounding);
	}

	...
	do {
		xfs_bmapi_write();
		...
		xfs_trans_commit();
	} while ();
}

As mentioned elsewhere, it's a bit of a bodge (to do this rounding).

> 
>>> FWIW, I also understand things are different if we are doing 128kB
>>> atomic writes on 16kB force aligned files. However, in this
>>> situation we are treating the 128kB atomic IO as eight individual
>>> 16kB atomic IOs that are physically contiguous.
>> Yes, if 16kB force aligned, userspace can only issue 16KB atomic writes.
> Right, but the eventual goal (given the statx parameters) is to be
> able to do 8x16kB sequential atomic writes as a single 128kB IO, yes?

No, if atomic write unit max is 16KB, then userspace can only issue a 
single 16KB atomic write.

However, some things to consider:
a. the block layer may merge those 16KB atomic writes
b. userspace may also merge 16KB atomic writes and issue a larger atomic 
write (if atomic write unit max is > 16KB)

I had been wondering if there is any value in a lib for helping with b.

> 
>>>>> Again, this is different to the traditional RT file behaviour - it
>>>>> can use unwritten extents for sub-alloc-unit alignment unmaps
>>>>> because the RT device can align file offset to any physical offset,
>>>>> and issue unaligned sector sized IO without any restrictions. Forced
>>>>> alignment does not have this freedom, and when we extend forced
>>>>> alignment to RT files, it will not have the freedom to use
>>>>> unwritten extents for sub-alloc-unit unmapping, either.
>>>>>
>>>> So how do you think that we should actually implement
>>>> xfs_itruncate_extents_flags() properly for forcealign? Would it simply be
>>>> like:
>>>>
>>>> --- a/fs/xfs/xfs_inode.c
>>>> +++ b/fs/xfs/xfs_inode.c
>>>> @@ -1050,7 +1050,7 @@ xfs_itruncate_extents_flags(
>>>>                   WARN_ON_ONCE(first_unmap_block > XFS_MAX_FILEOFF);
>>>>                   return 0;
>>>>           }
>>>> +	if (xfs_inode_has_forcealign(ip))
>>>> +	       first_unmap_block = xfs_inode_roundup_alloc_unit(ip,
>>>> first_unmap_block);
>>>>           error = xfs_bunmapi_range(&tp, ip, flags, first_unmap_block,
>>> Yes, it would be something like that, except it would have to be
>>> done before first_unmap_block is verified.
>>>
>> ok, and are you still of the opinion that this does not apply to rtvol?
> The rtvol is*not* force-aligned. It -may- have some aligned
> allocation requirements that are similar (i.e. sb_rextsize > 1 fsb)
> but it does*not* force-align extents, written or unwritten.
> 
> The moment we add force-align support to RT files (as is the plan),
> then the force-aligned inodes on the rtvol will need to behave as
> force aligned inodes, not "rtvol" inodes.

ok, fine

Thanks,
John