Message-ID: <aMpsnQEYagLvPOw2@li-dc0c254c-257c-11b2-a85c-98b6c1322444.ibm.com>
Date: Wed, 17 Sep 2025 13:39:01 +0530
From: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
To: John Garry <john.g.garry@...cle.com>
Cc: Zorro Lang <zlang@...hat.com>, fstests@...r.kernel.org,
        Ritesh Harjani <ritesh.list@...il.com>, djwong@...nel.org,
        tytso@....edu, linux-xfs@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-ext4@...r.kernel.org
Subject: Re: [PATCH v6 09/12] generic: Add sudden shutdown tests for multi
 block atomic writes

On Mon, Sep 15, 2025 at 02:26:46PM +0100, John Garry wrote:
> On 11/09/2025 18:13, Ojaswin Mujoo wrote:
> > This test is intended to ensure that multi-block atomic writes
> > maintain atomic guarantees across sudden FS shutdowns.
> > 
> > The way we work is that we lay out a file with a random mix of written,
> > unwritten and hole extents. Then we start performing atomic writes
> > sequentially on the file while we shut down the FS in parallel. Then we
> > note the last offset where the atomic write happened just before the
> > shutdown and make sure blocks around it either have completely old
> > data or completely new data, i.e. the write was not torn during shutdown.
> > 
> > We repeat the same with completely written, completely unwritten and
> > completely empty files to ensure these cases are not torn either. Finally,
> > we have a similar test for append atomic writes.
> > 
> > Suggested-by: Ritesh Harjani (IBM) <ritesh.list@...il.com>
> > Reviewed-by: Darrick J. Wong <djwong@...nel.org>
> > Signed-off-by: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
> 
> I still have some nits, which are close to being the same as last time. I
> don't want this series to be held up any longer over my nitpicking, so:
> 
> Reviewed-by: John Garry <john.g.garry@...cle.com>
> 
> > ---
> >   tests/generic/1230     | 368 +++++++++++++++++++++++++++++++++++++++++
> >   tests/generic/1230.out |   2 +
> >   2 files changed, 370 insertions(+)
> >   create mode 100755 tests/generic/1230
> >   create mode 100644 tests/generic/1230.out
> > 
> > diff --git a/tests/generic/1230 b/tests/generic/1230
> > new file mode 100755
> > index 00000000..28c2c4f5
> > --- /dev/null
> > +++ b/tests/generic/1230
> > @@ -0,0 +1,368 @@
> > +#! /bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +# Copyright (c) 2025 IBM Corporation. All Rights Reserved.
> > +#
> > +# FS QA Test No. 1230
> > +#
> > +# Test multi block atomic writes with sudden FS shutdowns to ensure
> > +# the FS is not tearing the write operation
> > +. ./common/preamble
> > +. ./common/atomicwrites
> > +_begin_fstest auto atomicwrites
> > +
> > +_require_scratch_write_atomic_multi_fsblock
> > +_require_atomic_write_test_commands
> > +_require_scratch_shutdown
> > +_require_xfs_io_command "truncate"
> > +
> > +_scratch_mkfs >> $seqres.full 2>&1
> > +_scratch_mount >> $seqres.full
> > +
> > +testfile=$SCRATCH_MNT/testfile
> > +touch $testfile
> > +
> > +awu_max=$(_get_atomic_write_unit_max $testfile)
> > +blksz=$(_get_block_size $SCRATCH_MNT)
> > +echo "Awu max: $awu_max" >> $seqres.full
> > +
> > +num_blocks=$((awu_max / blksz))
> > +# keep initial value high for dry run. This will be
> > +# tweaked in dry_run() based on device write speed.
> > +filesize=$(( 10 * 1024 * 1024 * 1024 ))
> > +
> > +_cleanup() {
> > +	[ -n "$awloop_pid" ] && kill $awloop_pid &> /dev/null
> > +	wait
> > +}
> > +
> > +atomic_write_loop() {
> > +	local off=0
> > +	local size=$awu_max
> > +	for ((i=0; i<$((filesize / $size )); i++)); do
> > +		# Due to sudden shutdown this can produce errors so just
> > +		# redirect them to seqres.full
> > +		$XFS_IO_PROG -c "open -fsd $testfile" -c "pwrite -S 0x61 -DA -V1 -b $size $off $size" >> /dev/null 2>>$seqres.full
> > +		echo "Written to offset: $off" >> $tmp.aw
> > +		off=$((off + $size))
> > +	done
> > +}
> > +
> > +start_atomic_write_and_shutdown() {
> > +	atomic_write_loop &
> > +	awloop_pid=$!
> > +
> > +	local i=0
> > +	# Wait for atleast first write to be recorded or 10s
> 
> at least
> 
> > +	while [ ! -f "$tmp.aw" -a $i -le 50 ]; do i=$((i + 1)); sleep 0.2; done
> > +
> > +	if [[ $i -gt 50 ]]
> > +	then
> > +		_fail "atomic write process took too long to start"
> > +	fi
> > +
> > +	echo >> $seqres.full
> > +	echo "# Shutting down filesystem while write is running" >> $seqres.full
> > +	_scratch_shutdown
> > +
> > +	kill $awloop_pid 2>/dev/null  # the process might have finished already
> > +	wait $awloop_pid
> > +	unset awloop_pid
> > +}
> 
> ...
> 
> > +
> > +verify_data_blocks() {
> > +	local verify_start=$1
> > +	local verify_end=$2
> > +	local expected_data_old="$3"
> > +	local expected_data_new="$4"
> > +
> > +	echo >> $seqres.full
> > +	echo "# Checking data integrity from $verify_start to $verify_end" >> $seqres.full
> > +
> > +	# After an atomic write, for every chunk we ensure that the underlying
> > +	# data is either the old data or new data as writes shouldn't get torn.
> > +	local off=$verify_start
> > +	while [[ "$off" -lt "$verify_end" ]]
> > +	do
> > +		#actual_data=$(xxd -s $off -l $awu_max -p $testfile)
> > +		actual_data=$(od -An -t x1 -j $off -N $awu_max $testfile)
> > +		if [[ "$actual_data" != "$expected_data_new" ]] && [[ "$actual_data" != "$expected_data_old" ]]
> > +		then
> > +			echo "Checksum match failed at off: $off size: $awu_max"
> > +			echo "Expected contents: (Either of the 2 below):"
> > +			echo
> > +			echo "Expected old: "
> 
> nit: I think that I mentioned this the last time - I would not use the word
> "expected". We have old data, new data, and actual data. The only thing
> which we expect is that actual data will be either all old or all new.
 
Hey John, I mentioned here [1] that the wording "expected new",
"expected old" and "actual" looked clearer to me than "new", "old" and
"actual", and you replied with "sure", so I thought we were good there :)

But no worries, I can make this change. I'll use the wording
new, old and actual.
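
To make the intended wording concrete, here is a minimal standalone
sketch of the check, not the actual patch. The helper names
hexdump_chunk and classify_chunk are hypothetical and exist only for
this illustration; the od invocation mirrors the one in
verify_data_blocks().

```shell
#!/bin/bash
# Hypothetical helpers for illustration only; not part of tests/generic/1230.

# Dump $len bytes of $file starting at $off as hex, without addresses.
hexdump_chunk() {
	local file=$1 off=$2 len=$3
	od -An -t x1 -j "$off" -N "$len" "$file"
}

# Print "new", "old" or "torn" for one chunk. The only expectation is
# that the actual data is either all old or all new.
classify_chunk() {
	local file=$1 off=$2 len=$3 old_data=$4 new_data=$5
	local actual_data
	actual_data=$(hexdump_chunk "$file" "$off" "$len")

	if [[ "$actual_data" == "$new_data" ]]; then
		echo "new"
	elif [[ "$actual_data" == "$old_data" ]]; then
		echo "old"
	else
		echo "torn"
		# Details go to stderr so stdout stays a single word.
		{
			echo "Data at off: $off size: $len is neither all old nor all new"
			echo "Old data:"
			echo "$old_data"
			echo "New data:"
			echo "$new_data"
			echo "Actual data:"
			echo "$actual_data"
		} >&2
		return 1
	fi
}
```

A caller would build the old and new patterns once (for example from
reference files dumped with the same od options) and fail the test as
soon as classify_chunk reports "torn".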

> 
> > +			echo "$expected_data_old"
> > +			echo
> > +			echo "Expected new: "
> > +			echo "$expected_data_new"
> > +			echo
> > +			echo "Actual contents: "
> > +			echo "$actual_data"
> > +
> > +			_fail
> > +		fi
> > +		echo -n "Check at offset $off succeeded! " >> $seqres.full
> > +		if [[ "$actual_data" == "$expected_data_new" ]]
> > +		then
> > +			echo "matched new" >> $seqres.full
> > +		elif [[ "$actual_data" == "$expected_data_old" ]]
> > +		then
> > +			echo "matched old" >> $seqres.full
> > +		fi
> > +		off=$(( off + awu_max ))
> > +	done
> > +}
> > +
> > +# test data integrity for file by shutting down in between atomic writes
> > +test_data_integrity() {
> > +	echo >> $seqres.full
> > +	echo "# Writing atomically to file in background" >> $seqres.full
> > +
> > +	start_atomic_write_and_shutdown
> > +
> > +	last_offset=$(tail -n 1 $tmp.aw | cut -d" " -f4)
> > +	if [[ -z $last_offset ]]
> > +	then
> > +		last_offset=0
> > +	fi
> > +
> > +	echo >> $seqres.full
> > +	echo "# Last offset of atomic write: $last_offset" >> $seqres.full
> > +
> > +	rm $tmp.aw
> > +	sleep 0.5
> > +
> > +	_scratch_cycle_mount
> > +
> > +	# we want to verify all blocks around which the shutdown happened
> > +	verify_start=$(( last_offset - (awu_max * 5)))
> > +	if [[ "$verify_start" -lt 0 ]]
> > +	then
> > +		verify_start=0
> > +	fi
> > +
> > +	verify_end=$(( last_offset + (awu_max * 5)))
> > +	if [[ "$verify_end" -gt "$filesize" ]]
> > +	then
> > +		verify_end=$filesize
> > +	fi
> > +}
> > +
> > +# test data integrity for file with written and unwritten mappings
> > +test_data_integrity_mixed() {
> > +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> > +
> > +	echo >> $seqres.full
> > +	echo "# Creating testfile with mixed mappings" >> $seqres.full
> > +	create_mixed_mappings $testfile $filesize
> > +
> > +	test_data_integrity
> > +
> > +	verify_data_blocks $verify_start $verify_end "$expected_data_old_mixed" "$expected_data_new"
> > +}
> > +
> > +# test data integrity for file with completely written mappings
> > +test_data_integrity_written() {
> 
> nit: again, I am not so keen on using the word "integrity" at all.
> "integrity" in storage world relates to T10 PI support in Linux. I know that
> last time I mentioned it's ok to use "integrity" when close to words "atomic
> write", but I still fear some doubt on whether we are talking about T10 PI
> when we mention integrity.

Okay, got it. How about using phrases like "test for torn data for
file with completely written mapping" instead?

> 
> > +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> > +
> > +	echo >> $seqres.full
> > +	echo "# Creating testfile with fully written mapping" >> $seqres.full
> > +	$XFS_IO_PROG -c "pwrite -b $filesize 0 $filesize" $testfile >> $seqres.full
> > +	sync $testfile
> > +
> > +	test_data_integrity
> > +
> > +	verify_data_blocks $verify_start $verify_end "$expected_data_old_mapped" "$expected_data_new"
> > +}
> > +
> > +# test data integrity for file with completely unwritten mappings
> > +test_data_integrity_unwritten() {
> > +	$XFS_IO_PROG -fc "truncate 0" $testfile >> $seqres.full
> > +
> > +	echo >> $seqres.full
> > +	echo "# Creating testfile with fully unwritten mappings" >> $seqres.full
> > +	$XFS_IO_PROG -c "falloc 0 $filesize" $testfile >> $seqres.full
> > +	sync $testfile
> > +
> > +	test_data_integrity
> > +
> > +	verify_data_blocks $verify_start $verify_end "$expected_data_old_zeroes" "$expected_data_new"
> > +}
> > +
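
For reference, the window that test_data_integrity() verifies around the
last recorded write can be sketched in isolation as below. This is only
an illustration under my own assumptions: verify_window is a
hypothetical name, and the default of 5 units on each side is taken from
the quoted code. The sketch compares with -lt and -gt, since "<" inside
[[ ]] compares strings rather than integers.

```shell
#!/bin/bash
# Hypothetical helper for illustration only; not part of tests/generic/1230.
# Prints "<start> <end>": the byte range to verify around the last
# recorded atomic write, clamped to [0, filesize].
verify_window() {
	local last_offset=$1 awu_max=$2 filesize=$3 nr_units=${4:-5}
	local start=$(( last_offset - awu_max * nr_units ))
	local end=$(( last_offset + awu_max * nr_units ))

	# Integer comparisons: a negative start is clamped to 0 and an end
	# beyond EOF is clamped to the file size.
	if [[ "$start" -lt 0 ]]; then
		start=0
	fi
	if [[ "$end" -gt "$filesize" ]]; then
		end=$filesize
	fi

	echo "$start $end"
}

# Last write at 64k with 64k units in a 1M file: the lower bound would
# be negative, so it is clamped to 0.
verify_window 65536 65536 $((1024 * 1024))
```

Keeping the clamping in one place like this also makes it easy to
exercise the edge cases (write at offset 0, write near EOF) without
touching a filesystem.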
