linux-ext4 - Crash consistency bug in ext4 - interaction between delalloc and fzero

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-ID: <CA+EzBbCyrHW-v9JVZbjhooQkugx3EjXH247kLWC+h1FjBGEDSA@mail.gmail.com>
Date:   Mon, 12 Mar 2018 20:50:02 -0500
From:   Jayashree Mohan <jayashree2912@...il.com>
To:     linux-ext4@...r.kernel.org
Cc:     Vijaychidambaram Velayudhan Pillai <vijay@...utexas.edu>,
        Ashlie Martinez <ashmrtn@...xas.edu>
Subject: Crash consistency bug in ext4 - interaction between delalloc and fzero

Hi,

We've encountered what seems to be a crash consistency bug in
ext4(kernel 4.15) due to the interaction between delayed allocated
write and an unaligned fallocate(zero range). Say we create a disk
image with known data and quick format it.
1. Now write 65K of data to a new file
2. Zero out a part of the above file using falloc_zero_range (60K+128)
- (60K+128+4096) - an unaligned block
3. fsync the above file
4. Crash

If we crash after the fsync, and allow reordering of the block IOs
between two flush/fua commands using Crashmonkey[1], then we can end
up zeroing the file range from (64K+128) to 65K, which should be
untouched by the fallocate command. We expect this region to contain
the  user written data in step 1 above.

This workload was inspired from xfstest/generic_042, which tests for
stale data exposure using aligned fallocate commands. It's worth
noting that f2fs and btrfs passes our test clean - irrespective of the
order of bios, user data is intact in these filesystems.

To reproduce this bug using CrashMonkey, simply run :
./c_harness -f /dev/sda -d /dev/cow_ram0 -t ext4 -e 10240 -s 1000 -v
tests/generic_042/generic_042_fzero_unaligned.so

and take a look at the <timestamp>-generic_042_fzero_unaligned.log
created in the build directory. This file has the list of block IOs
issued during the workload and the permutation of bios that lead to
this bug. You can also verify using blktrace that CrashMonkey only
reorders bios between two barrier operations(thereby such a crash
state could be encountered due to reordering blocks at the storage
stack). Note that tools like dm-log-writes cannot capture this bug
because this arises due to reordering blocks between barrier
operations.

This seems to a bug, as it is zeroing out user data that is ideally
not supposed to be zeroed by the fallocate command.
Let me know if I am missing some detail here.

[1] https://github.com/utsaslab/crashmonkey.git

Thanks,
Jayashree Mohan