linux-ext4 - Re: ext4 fix for interaction between i_size, fallocate, and delalloc after a crash


Date:   Thu, 30 Nov 2017 08:51:45 -0600
From:   Ashlie Martinez <ashmrtn@...xas.edu>
To:     "Theodore Ts'o" <tytso@....edu>
Cc:     Amir Goldstein <amir73il@...il.com>,
        Vijay Chidambaram <vvijay03@...il.com>,
        Ext4 <linux-ext4@...r.kernel.org>
Subject: Re: ext4 fix for interaction between i_size, fallocate, and delalloc
 after a crash

On Wed, Nov 29, 2017 at 10:46 PM, Theodore Ts'o <tytso@....edu> wrote:
> On Wed, Nov 29, 2017 at 07:46:08PM -0600, Ashlie Martinez wrote:
>> > 5.  Since I'm too lazy to wait 120 seconds, just force everything to disk:
>> >
>> >         sync
>>
>> I believe you said in an earlier email that sync would erase any trace
>> of the bug Amir found as it resolves the delayed allocation.
>
> Right, but you're waiting 120 seconds, which is enough time that it
> would resolve the delayed allocation.  So that's why I was trying to
> replicate your Crashmonkey experiment.
>
> And since you stated that you waited 120 seconds as the last step,
> there should have been a barrier, and no I/O operations for
> Crashmonkey to rearrange.  This is why I believe what I listed should
> be exactly the same as your Crashmonkey test, if I understood it
> correctly.
>
> What you said was that you ran the following operations on the test
> file:
>
> 1.      write 0x137dd 0xdc69 0x0
> 2.      fallocate 0xb531 0xb5ad 0x21446
> 3.      collapse_range 0x1c000 0x4000 0x21446
>         <sleep 30>
> 4.      write 0x3e5ec 0x1a14 0x21446
> 5.      zero_range 0x20fac 0x6d9c 0x40000 keep_size
>         <sleep 120>
>
> So what was the block I/O trace?  What operations was crashmonkey
> actually reordering?  There **really** shouldn't have been any....
>
> Can you send the block I/O trace that was observed when you did the
> following, a complete output of dumpe2fs on the file system, and a
> debugfs stat output on the test file?

I'll work on getting you traces, or other information with which you
can recreate the crash, later today or tomorrow. Unfortunately I'm
a bit slammed with work right now (undergrad honors thesis, end of
semester projects, etc.), but I'll try to get it out soon :)

>
>                                                 - Ted
>
> P.S.  I did read the Crashmonkey paper, so I'm aware of what
> Crashmonkey does.  I'm just confused about your workload, since the
> 120 second sleep should have meant there was a barrier followed by
> nothing else, making for a *very* boring crashmonkey replay.
>

Even though CrashMonkey *records* all the disk operations, it doesn't
have to replay all of them when generating crash states. For example,
it could choose to fully replay (and preserve the ordering of) all
operations before the 3rd barrier operation in a trace with 5
different barrier operations in it (we dub each set of operations from
just after the previous barrier operation up to and including the next
barrier operation a "disk write epoch" or "disk epoch"). If
CrashMonkey decides to write out a partial disk epoch at the end, it
can rearrange and/or drop operations from it according to the rules
laid out in my earlier message. This is more or less equivalent to
running dm-log-writes, placing a mark at each barrier operation it
sees, and then replaying up to each mark (and potentially a little
past the mark) to generate a disk image to check.
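
The epoch idea above can be sketched roughly as follows. This is only an
illustration in Python, not CrashMonkey's actual code: the Op type, the
function names, and the toy trace are all hypothetical, and the policy for
the partial epoch (keep a random subset of its writes, in a random order)
is a stand-in assumption, since the real reordering rules are in the earlier
message and are not reproduced here.

```python
import random
from typing import List, NamedTuple


class Op(NamedTuple):
    """One recorded block I/O operation (hypothetical, simplified)."""
    sector: int
    data: str
    is_barrier: bool = False


def split_into_epochs(trace: List[Op]) -> List[List[Op]]:
    """Group ops into disk epochs: each epoch runs from just after the
    previous barrier up to and including the next barrier."""
    epochs: List[List[Op]] = []
    current: List[Op] = []
    for op in trace:
        current.append(op)
        if op.is_barrier:
            epochs.append(current)
            current = []
    if current:  # trailing ops that were never followed by a barrier
        epochs.append(current)
    return epochs


def crash_state(trace: List[Op], full_epochs: int,
                rng: random.Random) -> List[Op]:
    """Return the ops to replay for one simulated crash.

    The first `full_epochs` epochs are replayed completely and in order.
    ASSUMPTION: the following epoch, if any, is replayed partially by
    keeping a random subset of its non-barrier ops in a random order.
    """
    epochs = split_into_epochs(trace)
    replay = [op for epoch in epochs[:full_epochs] for op in epoch]
    if full_epochs < len(epochs):
        partial = [op for op in epochs[full_epochs] if not op.is_barrier]
        # rng.sample picks k distinct ops in random order: drop + reorder.
        replay.extend(rng.sample(partial, rng.randint(0, len(partial))))
    return replay


if __name__ == "__main__":
    # Toy trace: 5 epochs, each with two writes followed by a barrier.
    trace = []
    for e in range(5):
        trace += [Op(e * 10, f"w{e}a"), Op(e * 10 + 1, f"w{e}b"),
                  Op(e * 10 + 2, f"flush{e}", is_barrier=True)]
    # Replay everything before the 3rd barrier's epoch, then part of it.
    for op in crash_state(trace, full_epochs=2, rng=random.Random()):
        print(op)
```

Calling crash_state once per value of full_epochs gives one candidate disk
image per barrier, which is the sense in which this resembles replaying a
dm-log-writes log up to (or slightly past) each mark.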