Message-ID: <CABXGCsNzVxo4iq-tJSGm_kO1UggHXgq6CdcHDL=z5FL4njYXSQ@mail.gmail.com>
Date:   Mon, 26 Dec 2022 02:32:42 +0500
From:   Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
To:     wqu@...e.com, dsterba@...e.com,
        Btrfs BTRFS <linux-btrfs@...r.kernel.org>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>
Subject: [6.2][regression] after commit 947a629988f191807d2d22ba63ae18259bb645c5
 btrfs volume periodical forced switch to readonly after a lot of disk writes

Hi,
Curiously, it happens only on machines that have a BTRFS volume
built from two high-speed NVMe (PCIe 4) SSDs in RAID 0. On machines
with a BTRFS volume on a single HDD, the bug does not appear.

Bisecting down to the problematic commit took a lot of effort. At each
step, I downloaded the 150 GB game "Assassin's Creed Valhalla" 4 times
and deleted it. To make sure that the commit previous to
947a629988f191807d2d22ba63ae18259bb645c5 is definitely not affected by
the bug, I downloaded this game 10 times, which should have written
more than 1.5 TB of data to the btrfs volume.
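
For reference, the write volume implied by this procedure can be
worked out with a few lines of Python. This is only a sketch: the 150 GB
size and the download counts come from the description above, while the
function and variable names are made up for illustration, and decimal
units (1 TB = 1000 GB) are assumed.

```python
GAME_SIZE_GB = 150  # size of one download, as stated in the report


def written_gb(downloads: int, size_gb: int = GAME_SIZE_GB) -> int:
    """Total data written by repeating the download, in GB."""
    return downloads * size_gb


per_step_gb = written_gb(4)       # each bisection step
verification_gb = written_gb(10)  # extra run on the last good commit

print(f"per bisection step: {per_step_gb} GB")
print(f"verification run:   {verification_gb / 1000} TB")
```

So each bisection step wrote about 600 GB, and the verification run on
the parent commit wrote roughly 1.5 TB.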

Here is the result of my bisection:
947a629988f191807d2d22ba63ae18259bb645c5 is the first bad commit
commit 947a629988f191807d2d22ba63ae18259bb645c5
Author: Qu Wenruo <wqu@...e.com>
Date:   Wed Sep 14 13:32:51 2022 +0800

    btrfs: move tree block parentness check into validate_extent_buffer()

    [BACKGROUND]
    Although both btrfs metadata and data has their read time verification
    done at endio time (btrfs_validate_metadata_buffer() and
    btrfs_verify_data_csum()), metadata has extra verification, mostly
    parentness check including first key/transid/owner_root/level, done at
    read_tree_block() and btrfs_read_extent_buffer().

    On the other hand, all the data verification is done at endio context.

    [ENHANCEMENT]
    This patch will make a new union in btrfs_bio, taking the space of the
    old data checksums, thus it will not increase the memory usage.

    With that extra btrfs_tree_parent_check inside btrfs_bio, we can just
    pass the check parameter into read_extent_buffer_pages(), and before
    submitting the bio, we can copy the check structure into btrfs_bio.

    And finally at endio time, we can grab btrfs_bio::parent_check and pass
    it to validate_extent_buffer(), to move the remaining checks into it.

    This brings the following benefits:

    - Much simpler btrfs_read_extent_buffer()
      Now it only needs to iterate through all mirrors.

    - Simpler read-time transid check
      Previously we go verify_parent_transid() after reading out the extent
      buffer.
      Now the transid check is done inside the endio function, no other
      code can modify the content.
      Thus no need to use the extent lock anymore.

    Signed-off-by: Qu Wenruo <wqu@...e.com>
    Signed-off-by: David Sterba <dsterba@...e.com>

 fs/btrfs/disk-io.c   | 73 ++++++++++++++++++++++++++++++++++++++--------------
 fs/btrfs/extent_io.c | 18 ++++++++++---
 fs/btrfs/extent_io.h |  5 ++--
 fs/btrfs/volumes.h   | 25 +++++++++++++++---
 4 files changed, 93 insertions(+), 28 deletions(-)

Before the volume switches to read-only, the preceding line in the kernel log shows this message:
[ 1908.029663] BTRFS: error (device nvme0n1p3: state A) in
btrfs_run_delayed_refs:2147: errno=-5 IO failure
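
To find such lines in a large dmesg dump, a small Python sketch like the
one below may help. The regular expression is an assumption derived only
from the single message quoted above (with the wrapped line joined back
into one); other BTRFS messages may be formatted differently, and the
function name is hypothetical.

```python
import re

# Pattern modelled on the one message quoted in this report; not a
# general parser for all BTRFS kernel messages.
BTRFS_ERROR = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+BTRFS: error "
    r"\(device (?P<dev>[^:)]+)[^)]*\) in "
    r"(?P<func>\w+):(?P<line>\d+): "
    r"errno=(?P<errno>-?\d+) (?P<desc>.*)"
)


def parse_btrfs_error(text):
    """Return the fields of a BTRFS error line, or None if no match."""
    m = BTRFS_ERROR.search(text)
    if m is None:
        return None
    return {
        "timestamp": float(m.group("ts")),
        "device": m.group("dev"),
        "function": m.group("func"),
        "line": int(m.group("line")),
        "errno": int(m.group("errno")),
        "description": m.group("desc").strip(),
    }


# The message from this report, joined into a single line.
sample = (
    "[ 1908.029663] BTRFS: error (device nvme0n1p3: state A) in "
    "btrfs_run_delayed_refs:2147: errno=-5 IO failure"
)
result = parse_btrfs_error(sample)
print(result)
```

Here errno=-5 corresponds to EIO, which matches the "IO failure" text in
the message.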

I also attached a full kernel log.

-- 
Best Regards,
Mike Gavrilov.

View attachment "btrfs-issue-dmesg.txt" of type "text/plain" (353123 bytes)
