lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 26 Dec 2022 20:11:38 +0800
From:   Qu Wenruo <quwenruo.btrfs@....com>
To:     Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
Cc:     wqu@...e.com, dsterba@...e.com,
        Btrfs BTRFS <linux-btrfs@...r.kernel.org>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>
Subject: Re: [6.2][regression] after commit
 947a629988f191807d2d22ba63ae18259bb645c5 btrfs volume periodical forced
 switch to readonly after a lot of disk writes



On 2022/12/26 19:14, Mikhail Gavrilov wrote:
> On Mon, Dec 26, 2022 at 1:47 PM Qu Wenruo <quwenruo.btrfs@....com> wrote:
>>
>>
>> With the check output, it's indeed a runtime error.
>> (At least no corruption to your fs)
>>
>> And it can be some call paths not properly initializing the level to check.
>>
>> Here is the new debug patch.
>> It should be applied without any previous debug patch.
>>
> 
> yeah, gotcha trace
> 
> [ 2561.764141] BTRFS error (device nvme1n1p3): level verify failed on
> logical 45170688 mirror 1 wanted 0 found 1
> [ 2561.764220] BTRFS error (device nvme1n1p3): read time tree block
> corruption detected on logical 45170688 mirror 1
> [ 2561.764224] BTRFS error (device nvme1n1p3): check owner_root=0
> transid=0 first_key=(0 0 0) has_first_key=0 level=0

Damn it, some check structure is completely uninitialized.

> [ 2561.764242] BTRFS error (device nvme1n1p3): read failed, check
> owner_root=259 transid=650936 has_first_key=1 first_key=(2221294 1 0)
> level=1

While at the caller, the structure is properly passed in.

So there is something wrong between the endio function and the check.

I have created the v2 version patch to debug, please apply without any 
previous debug patch.

Meanwhile this really looks like a race, thus I'm not 100% sure if my 
debug patch would reduce the possibility to reproduce.

Your bisect should be the determining evidence, for the worst case we 
can revert the offending patch.

Thank you very much for all of the testing, it really helps a lot.
Qu

> [ 2561.764342] CPU: 8 PID: 727 Comm: systemd-udevd Tainted: G        W
>     L     6.2.0-rc1-1b929c02afd37871d5afb9d498426f83432e71c2-btrfs-debug2+
> #3
> [ 2561.764348] Hardware name: ASUSTeK COMPUTER INC. ROG Strix
> G513QY_G513QY/G513QY, BIOS G513QY.320 09/07/2022
> [ 2561.764352] Call Trace:
> [ 2561.764355]  <TASK>
> [ 2561.764360]  dump_stack_lvl+0x5b/0x77
> [ 2561.764369]  read_extent_buffer_pages.cold+0x5a/0x9d
> [ 2561.764382]  ? __pfx_end_bio_extent_readpage+0x10/0x10
> [ 2561.764390]  ? folio_unlock+0x1d/0x50
> [ 2561.764400]  btrfs_read_extent_buffer+0x98/0x150
> [ 2561.764410]  read_tree_block+0x43/0xa0
> [ 2561.764417]  read_block_for_search+0x218/0x320
> [ 2561.764432]  btrfs_search_slot+0x351/0xd30
> [ 2561.764450]  btrfs_lookup_xattr+0x7b/0xe0
> [ 2561.764460]  btrfs_getxattr+0x55/0x110
> [ 2561.764470]  __vfs_getxattr+0x7e/0xb0
> [ 2561.764480]  get_vfs_caps_from_disk+0x7a/0x220
> [ 2561.764492]  audit_copy_inode+0x95/0xd0
> [ 2561.764501]  filename_lookup+0x16b/0x1c0
> [ 2561.764525]  ? _raw_spin_unlock+0x29/0x50
> [ 2561.764534]  vfs_statx+0x8e/0x150
> [ 2561.764545]  vfs_fstatat+0x51/0x70
> [ 2561.764553]  __do_sys_newfstatat+0x2e/0x50
> [ 2561.764571]  do_syscall_64+0x5b/0x80
> [ 2561.764576]  ? do_syscall_64+0x67/0x80
> [ 2561.764585]  ? up_read+0x17/0x20
> [ 2561.764593]  ? lock_is_held_type+0xe8/0x140
> [ 2561.764601]  ? asm_exc_page_fault+0x22/0x30
> [ 2561.764607]  ? lockdep_hardirqs_on+0x7d/0x100
> [ 2561.764613]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
> [ 2561.764618] RIP: 0033:0x7f7b1b91e05e
> [ 2561.764639] Code: 48 89 f2 b9 00 01 00 00 48 89 fe bf 9c ff ff ff
> e9 07 00 00 00 0f 1f 80 00 00 00 00 f3 0f 1e fa 41 89 ca b8 06 01 00
> 00 0f 05 <3d> 00 f0 ff ff 77 0b 31 c0 c3 0f 1f 84 00 00 00 00 00 48 8b
> 15 71
> [ 2561.764643] RSP: 002b:00007fffdc4062c8 EFLAGS: 00000246 ORIG_RAX:
> 0000000000000106
> [ 2561.764649] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b1b91e05e
> [ 2561.764653] RDX: 00007fffdc406320 RSI: 000055efe7d04bc0 RDI: 00000000ffffff9c
> [ 2561.764656] RBP: 000055efe7e79668 R08: 00000000000000f0 R09: 000055efe7ea85b0
> [ 2561.764659] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffdc4064b0
> [ 2561.764663] R13: 00007fffdc406320 R14: 00007fffdc4064a8 R15: 00007fffdc406308
> [ 2561.764680]  </TASK>
> [ 2561.764684] BTRFS error (device nvme1n1p3): read failed, check
> owner_root=259 transid=650936 has_first_key=1 first_key=(2221294 1 0)
> level=1
> 
> Of course full logs are also attached.
> Thanks.
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ