Message-ID: <aKbX-dBzSC1pmPuh@li-dc0c254c-257c-11b2-a85c-98b6c1322444.ibm.com>
Date: Thu, 21 Aug 2025 13:55:29 +0530
From: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
To: John Garry <john.g.garry@...cle.com>
Cc: Zorro Lang <zlang@...hat.com>, fstests@...r.kernel.org,
        Ritesh Harjani <ritesh.list@...il.com>, djwong@...nel.org,
        tytso@....edu, linux-xfs@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-ext4@...r.kernel.org
Subject: Re: [PATCH v4 11/11] ext4: Atomic write test for extent split across
 leaf nodes

On Wed, Aug 13, 2025 at 02:54:04PM +0100, John Garry wrote:
> On 10/08/2025 14:42, Ojaswin Mujoo wrote:
> > In ext4, even if an allocated range is physically and logically
> > contiguous, it can still be split into 2 extents. This is because ext4
> > does not merge extents across leaf nodes. This is an issue for atomic
> > writes since even for a continuous extent the map block could (in rare
> > cases) return a shorter map, hence tearning the write. This test creates
> > such a file and ensures that the atomic write handles this case
> > correctly
> > 
> > Signed-off-by: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
> > ---
> >   tests/ext4/063     | 129 +++++++++++++++++++++++++++++++++++++++++++++
> >   tests/ext4/063.out |   2 +
> >   2 files changed, 131 insertions(+)
> >   create mode 100755 tests/ext4/063
> >   create mode 100644 tests/ext4/063.out
> > 
> > diff --git a/tests/ext4/063 b/tests/ext4/063
> > new file mode 100755
> > index 00000000..40867acb
> > --- /dev/null
> > +++ b/tests/ext4/063
> > @@ -0,0 +1,129 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > +#
> > +# In ext4, even if an allocated range is physically and logically contiguous,
> > +# it can still be split into 2 extents.
> 
> Nit: I assume that you mean "2 or more extents"
> 
> > +# This is because ext4 does not merge
> > +# extents across leaf nodes. This is an issue for atomic writes since even for
> > +# a continuous extent the map block could (in rare cases) return a shorter map,
> > +# hence tearning the write. This test creates such a file and ensures that the
> 
> tearing
> 
> > +# atomic write handles this case correctly
> > +#
> > +. ./common/preamble
> > +. ./common/atomicwrites
> > +_begin_fstest auto atomicwrites
> > +
> > +_require_scratch_write_atomic_multi_fsblock
> > +_require_atomic_write_test_commands
> > +_require_command "$DEBUGFS_PROG" debugfs
> > +
> > +prep() {
> > +	local bs=`_get_block_size $SCRATCH_MNT`
> > +	local ex_hdr_bytes=12
> > +	local ex_entry_bytes=12
> > +	local entries_per_blk=$(( (bs - ex_hdr_bytes) / ex_entry_bytes ))
> > +
> > +	# fill the extent tree leaf with bs len extents at alternate offsets.
> > +	# The tree should look as follows
> > +	#
> > +	#                    +---------+---------+
> > +	#                    | index 1 | index 2 |
> > +	#                    +-----+---+-----+---+
> > +	#                   +------+         +-----------+
> > +	#                   |                            |
> > +	#      +-------+-------+---+---------+     +-----+----+
> > +	#      | ex 1  | ex 2  |   |  ex n   |     |  ex n+1  |
> > +	#      | off:0 | off:2 |...| off:678 |     |  off:680 |
> > +	#      | len:1 | len:1 |   |  len:1  |     |   len:1  |
> > +	#      +-------+-------+---+---------+     +----------+
> > +	#
> > +	for i in $(seq 0 $entries_per_blk)
> > +	do
> > +		$XFS_IO_PROG -fc "pwrite -b $bs $((i * 2 * bs)) $bs" $testfile > /dev/null
> > +	done
> > +	sync $testfile
> > +
> > +	echo >> $seqres.full
> > +	echo "Create file with extents spanning 2 leaves. Extents:">> $seqres.full
> > +	echo "...">> $seqres.full
> > +	$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
> > +
> > +	# Now try to insert a new extent ex(new) between ex(n) and ex(n+1).
> > +	# Since this is a new FS the allocator would find continuous blocks
> > +	# such that ex(n) ex(new) ex(n+1) are physically(and logically)
> > +	# contiguous. However, since we dont merge extents across leaf we will
> 
> don't
> 
> > +	# end up with a tree as:
> > +	#
> > +	#                    +---------+---------+
> > +	#                    | index 1 | index 2 |
> > +	#                    +-----+---+-----+---+
> > +	#                   +------+         +------------+
> > +	#                   |                             |
> > +	#      +-------+-------+---+---------+     +------+-----------+
> > +	#      | ex 1  | ex 2  |   |  ex n   |     |  ex n+1 (merged) |
> > +	#      | off:0 | off:2 |...| off:678 |     |      off:679     |
> > +	#      | len:1 | len:1 |   |  len:1  |     |      len:2       |
> > +	#      +-------+-------+---+---------+     +------------------+
> > +	#
> > +	echo >> $seqres.full
> > +	torn_ex_offset=$((((entries_per_blk * 2) - 1) * bs))
> > +	$XFS_IO_PROG -c "pwrite $torn_ex_offset $bs" $testfile >> /dev/null
> > +	sync $testfile
> > +
> > +	echo >> $seqres.full
> > +	echo "Perform 1 block write at $torn_ex_offset to create torn extent. Extents:">> $seqres.full
> > +	echo "...">> $seqres.full
> > +	$DEBUGFS_PROG -R "ex `basename $testfile`" $SCRATCH_DEV |& tail >> $seqres.full
> > +
> > +	_scratch_cycle_mount
> > +}
> > +
> 
> Out of curiosity, for such a file with split extents, what would filefrag
> output look like? An example would be nice.

Hey John, thanks for the review. Sorry for the late reply; I had a mini
vacation, followed by lei suddenly not pulling emails :/

Anyway, I've added the $DEBUGFS_PROG command so we can observe the
extent structure, but the filefrag output would look something like this
(last few extents):

 ...
 337:      674..     674:      10130..     10130:      1:
 338:      676..     676:      10132..     10132:      1:
 339:      678..     678:      10134..     10134:      1:
 340:      679..     680:      10135..     10136:      2:             last,eof

Notice that the last 2 extents are logically and physically contiguous
but not merged.
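
For anyone who wants to spot this condition without reading the listing by
eye, here is a rough sketch of a filter over extent lines in the format
shown above. find_unmerged_extents is a hypothetical helper, not part of the
patch or of fstests, and it assumes each data row carries the extent index,
logical start..end and physical start..end in that order:

```shell
#!/bin/bash
# Hypothetical helper (not part of the patch): read filefrag-style extent
# rows on stdin and report neighbouring extents that are logically and
# physically contiguous but still recorded as two separate extents.
find_unmerged_extents() {
	awk '
	{
		line = $0
		# Turn ":" and ".." separators into spaces so fields split cleanly.
		gsub(/[:.]+/, " ", line)
		n = split(line, f, " ")
		# Data rows start with a numeric extent index; skip everything else.
		if (n < 6 || f[1] !~ /^[0-9]+$/)
			next
		lstart = f[2] + 0; lend = f[3] + 0
		pstart = f[4] + 0; pend = f[5] + 0
		if (have_prev && lstart == prev_lend + 1 && pstart == prev_pend + 1)
			printf "unmerged: ext %d (logical %d..%d) and ext %d (logical %d..%d)\n",
				prev_idx, prev_lstart, prev_lend, f[1], lstart, lend
		have_prev = 1
		prev_idx = f[1]; prev_lstart = lstart
		prev_lend = lend; prev_pend = pend
	}'
}

# Possible usage (assumes the listing comes from "filefrag -v"):
#   filefrag -v "$testfile" | find_unmerged_extents
```

Fed the four rows above, this should flag only extents 339 and 340. As a
sanity check on the numbers, and assuming a 4096-byte block size, the test's
arithmetic gives (4096 - 12) / 12 = 340 entries per leaf and a write at
block 340 * 2 - 1 = 679, which is where the last extent in the listing
starts.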

Regards,
ojaswin

> 
> Thanks,
> John
