Message-ID: <CA+EzBbCyrHW-v9JVZbjhooQkugx3EjXH247kLWC+h1FjBGEDSA@mail.gmail.com>
Date:   Mon, 12 Mar 2018 20:50:02 -0500
From:   Jayashree Mohan <jayashree2912@...il.com>
To:     linux-ext4@...r.kernel.org
Cc:     Vijaychidambaram Velayudhan Pillai <vijay@...utexas.edu>,
        Ashlie Martinez <ashmrtn@...xas.edu>
Subject: Crash consistency bug in ext4 - interaction between delalloc and fzero

Hi,

We've encountered what seems to be a crash consistency bug in ext4
(kernel 4.15), caused by the interaction between a delayed-allocation
write and an unaligned fallocate (zero range). Say we create a disk
image with known data and quick format it.
1. Now write 65K of data to a new file
2. Zero out part of the above file with fallocate (FALLOC_FL_ZERO_RANGE)
over the range (60K+128) to (60K+128+4096), which is not block-aligned
3. fsync the above file
4. Crash
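
To make the byte ranges in the steps above concrete, here is a rough
sketch (not part of the CrashMonkey test). It only models the file
contents we expect to see after the fsync; it does not issue any real
fallocate call. The names, the 0xAB fill byte, K = 1024 and the 4096-byte
block size are all assumptions made for this illustration.

```python
K = 1024
BLOCK_SIZE = 4096            # assumed filesystem block size
FILE_SIZE = 65 * K           # step 1: 65K of data
ZERO_START = 60 * K + 128    # step 2: start of the zeroed range
ZERO_LEN = 4096              # step 2: length of the zeroed range
ZERO_END = ZERO_START + ZERO_LEN   # exclusive end, i.e. 64K+128


def expected_contents(fill=0xAB):
    """Model of the file after steps 1-3: fill bytes, with only the
    requested range zeroed (fill value is a hypothetical pattern)."""
    data = bytearray([fill]) * FILE_SIZE
    data[ZERO_START:ZERO_END] = bytes(ZERO_LEN)
    return bytes(data)


def blocks_touched(start, end, block_size=BLOCK_SIZE):
    """Block indices overlapped by the byte range [start, end)."""
    return list(range(start // block_size, (end - 1) // block_size + 1))


expected = expected_contents()
tail_len = FILE_SIZE - ZERO_END   # bytes after the zeroed range

print("zero range:", ZERO_START, "to", ZERO_END)
print("blocks partially covered:", blocks_touched(ZERO_START, ZERO_END))
print("bytes after the range that must keep user data:", tail_len)
```

Under these assumptions the zeroed range straddles two blocks without
covering either one fully, and the tail from (64K+128) to 65K must still
hold the data written in step 1.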

If we crash after the fsync, and allow reordering of the block IOs
between two flush/fua commands using CrashMonkey[1], then we can end
up zeroing the file range from (64K+128) to 65K, which should be
untouched by the fallocate command. We expect this region to contain
the user-written data from step 1 above.
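
The check this implies can be sketched roughly as follows. This is an
illustration only, not CrashMonkey's actual checker: the function name is
made up, and it assumes the data written in step 1 contains no zero bytes
(a 0xAB fill here), so any zero byte outside the requested range counts
as lost user data.

```python
K = 1024
FILE_SIZE = 65 * K
ZERO_START = 60 * K + 128
ZERO_END = ZERO_START + 4096   # exclusive end, i.e. 64K+128


def unexpected_zero_ranges(data, zero_start, zero_end):
    """Return [start, end) runs of zero bytes lying outside the range
    that fallocate was asked to zero."""
    ranges = []
    run_start = None
    for i, b in enumerate(data):
        outside = not (zero_start <= i < zero_end)
        if b == 0 and outside:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            ranges.append((run_start, i))
            run_start = None
    if run_start is not None:
        ranges.append((run_start, len(data)))
    return ranges


# Expected state: only the requested range is zeroed.
good = bytearray([0xAB]) * FILE_SIZE
good[ZERO_START:ZERO_END] = bytes(ZERO_END - ZERO_START)

# State described in this report: zeroes extend to the end of the file.
bad = bytearray([0xAB]) * FILE_SIZE
bad[ZERO_START:FILE_SIZE] = bytes(FILE_SIZE - ZERO_START)

print(unexpected_zero_ranges(good, ZERO_START, ZERO_END))
print(unexpected_zero_ranges(bad, ZERO_START, ZERO_END))
```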

This workload was inspired by xfstest/generic_042, which tests for
stale data exposure using aligned fallocate commands. It's worth
noting that f2fs and btrfs pass our test cleanly: irrespective of the
order of bios, user data is intact in these filesystems.

To reproduce this bug using CrashMonkey, simply run:
./c_harness -f /dev/sda -d /dev/cow_ram0 -t ext4 -e 10240 -s 1000 -v
tests/generic_042/generic_042_fzero_unaligned.so

and take a look at the <timestamp>-generic_042_fzero_unaligned.log
created in the build directory. This file has the list of block IOs
issued during the workload and the permutation of bios that leads to
this bug. You can also verify using blktrace that CrashMonkey only
reorders bios between two barrier operations (so such a crash state
could be encountered due to reordering of blocks in the storage
stack). Note that tools like dm-log-writes cannot capture this bug
because it arises from reordering blocks between barrier operations.
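
As a toy model of what "reordering only between barriers" means (this is
not CrashMonkey's implementation; the trace format, the "FLUSH" marker
and the function names are hypothetical), one can split a list of bios
into epochs at each barrier and permute writes only within an epoch:

```python
from itertools import permutations, product


def split_epochs(bios):
    """Split a bio trace into epochs, using "FLUSH" entries as barriers."""
    epochs, current = [], []
    for bio in bios:
        if bio == "FLUSH":
            epochs.append(current)
            current = []
        else:
            current.append(bio)
    if current:
        epochs.append(current)
    return epochs


def reorderings(bios):
    """Yield every ordering that permutes bios within an epoch but never
    moves a bio across a barrier."""
    per_epoch = [list(permutations(e)) for e in split_epochs(bios)]
    for combo in product(*per_epoch):
        yield [bio for epoch in combo for bio in epoch]


trace = ["A", "B", "FLUSH", "C", "D", "FLUSH"]
orders = list(reorderings(trace))
for order in orders:
    print(order)
```

In this model A and B may swap with each other, and so may C and D, but
neither C nor D can ever land before A or B.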

This seems to be a bug, as it zeroes out user data that the fallocate
command is not supposed to touch.
Let me know if I am missing some detail here.

[1] https://github.com/utsaslab/crashmonkey.git

Thanks,
Jayashree Mohan
