linux-ext4 - Re: [PATCH] ext4: fix interaction between i_size, fallocate, and delalloc after a crash

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAOQ4uxiKdRq1KyiEW-d9rp+hWQanVzMXQa1vW2JHmhtEVM1-GA@mail.gmail.com>
Date:   Tue, 17 Oct 2017 00:11:40 +0300
From:   Amir Goldstein <amir73il@...il.com>
To:     Ashlie Martinez <ashmrtn@...xas.edu>
Cc:     Xiao Yang <yangx.jy@...fujitsu.com>, Theodore Tso <tytso@....edu>,
        Eryu Guan <eguan@...hat.com>, Josef Bacik <jbacik@...com>,
        fstests <fstests@...r.kernel.org>,
        Ext4 <linux-ext4@...r.kernel.org>,
        Vijay Chidambaram <vvijay03@...il.com>
Subject: Re: [PATCH] ext4: fix interaction between i_size, fallocate, and
 delalloc after a crash

On Mon, Oct 16, 2017 at 10:32 PM, Ashlie Martinez <ashmrtn@...xas.edu> wrote:
> Amir,
>
> I know this is a bit late, but I've spent some time working through
> the disk image that you provided (so that I could determine how/if I
> could modify CrashMonkey to catch errors like this) and I don't think
> I understand what state the disk image reflects.

The disk image SHOULD reflect a state on a disk after the power was
cut in the middle of mounted fs. Then power came back on, filesystem
was mounted, journal recovered, then filesystem was cleanly unmounted.
At this stage, I don't expect there should be anything interesting in the
journal.

> After digging around
> the journal of the disk image you provided, I found that the first 10
> journal blocks are used, with the journal superblock being placed in
> the very first block of the journal. The journal superblock says that
> the first journal transaction ID that should be in the journal is
> transaction ID 4. However, dumping the other journal blocks, I found
> that the next block is a descriptor block for transaction ID 2. The
> rest of the journal blocks are data blocks for that transaction plus a
> transaction commit block. This seems a little odd considering that the
> journal refers to a 4th transaction, which I have not been able to
> find (I quickly dumped the first 50 blocks in debugfs and found the
> rest to contain only zeros).
>

I did not spend time analyzing the image, so I'll take your word for it,
but I can't help you understand your findings.

> With this in mind, I looked back at the xfstests code for controlling
> the dm_flakey device. What I realized is the `nolockfs` flag is
> provided both when it switches from the real device to the flakey
> device that drops writes and when it switches from the flakey device
> back to the real device. I know there is a call to umount once the
> flakey device that drops writes is inserted, but do you think it is
> possible that the flakey device is swapped back to the real device
> before all the writes forced out by umount have made it to the flakey
> device?

I believe umount call should be blocked until all writes have been flushed
out to flakey device.

> Unfortunately I still don't have a local machine that is
> capable of reproducing your test results and I have not made any gce
> test appliance images to test this yet, so I'm not sure if this is a
> valid theory.
>

Ted explained that the bug related to very specific timing of flusher
thread vs. fallocate thread.
I was under the impression that CrashMonkey can only reorder writes
between recorded FLUSH requests, so I am not really sure how you intent to
modify CrashMonkey to catch this bug.

Cheers,
Amir.