Message-ID: <CAO9zADwMjaMp=TmgkBDHVFxdj5FVHtjTn_6qvFaTcAjpbaDSWg@mail.gmail.com>
Date: Mon, 10 Nov 2025 10:05:42 -0500
From: Justin Piszcz <jpiszcz@...idpixels.com>
To: LKML <linux-kernel@...r.kernel.org>, linux-nvme@...ts.infradead.org, 
	linux-btrfs@...r.kernel.org
Subject: BTRFS error (device nvme1n1p2): bdev /dev/nvme0n1p2 errs: wr 37868055...

Hello,

I am using an ASUS Pro WS W680-ACE motherboard with two Samsung SSD
990 PRO with Heatsink 4TB NVMe SSDs in a BTRFS RAID1.  When a BTRFS
scrub was kicked off this morning, BTRFS suddenly began logging errors
for one of the drives.  The system became unusable, so I had to power
cycle it; after re-running the scrub, everything is now OK.  My
question is: what would cause this?

Distribution: Debian Stable
Kernel: 6.12.48+deb13-amd64

Drive information (for both drives)
-------------------------------------------------
Drive1:
Model Number:                       Samsung SSD 990 PRO with Heatsink 4TB
Firmware Version:                   4B2QJXD7
Drive2:
Model Number:                       Samsung SSD 990 PRO with Heatsink 4TB
Firmware Version:                   4B2QJXD7

btrfsd scrub configuration:
-------------------------------------------------
stats_interval=1h
scrub_interval=1M
balance_interval=never

Errors:
-------------------------------------------------
Nov 10 02:00:29 machine1 kernel: BTRFS error (device nvme1n1p2): bdev
/dev/nvme0n1p2 errs: wr 37868055, rd 39712434, flush 583, corrupt 0,
gen 0
Nov 10 02:00:29 machine1 kernel: BTRFS error (device nvme1n1p2): bdev
/dev/nvme0n1p2 errs: wr 37868056, rd 39712434, flush 583, corrupt 0,
gen 0
Nov 10 02:00:29 machine1 kernel: BTRFS error (device nvme1n1p2): bdev
/dev/nvme0n1p2 errs: wr 37868057, rd 39712434, flush 583, corrupt 0,
gen 0
Nov 10 02:00:29 machine1 kernel: BTRFS error (device nvme1n1p2): bdev
/dev/nvme0n1p2 errs: wr 37868058, rd 39712434, flush 583, corrupt 0,
gen 0
Nov 10 02:00:29 machine1 kernel: BTRFS error (device nvme1n1p2): bdev
/dev/nvme0n1p2 errs: wr 37868059, rd 39712434, flush 583, corrupt 0,
gen 0
Nov 10 02:00:29 machine1 kernel: BTRFS error (device nvme1n1p2): bdev
/dev/nvme0n1p2 errs: wr 37868060, rd 39712434, flush 583, corrupt 0,
gen 0
Nov 10 02:00:29 machine1 kernel: BTRFS error (device nvme1n1p2): bdev
/dev/nvme0n1p2 errs: wr 37868061, rd 39712434, flush 583, corrupt 0,
gen 0
Nov 10 02:00:30 machine1 kernel: BTRFS error (device nvme1n1p2): bdev
/dev/nvme0n1p2 errs: wr 37868062, rd 39712434, flush 583, corrupt 0,
gen 0
Nov 10 02:00:30 machine1 kernel: BTRFS error (device nvme1n1p2): bdev
/dev/nvme0n1p2 errs: wr 37868063, rd 39712434, flush 583, corrupt 0,
gen 0

Prior to reboot:
-------------------------------------------------
[/dev/nvme0n1p2].write_io_errs    0
[/dev/nvme0n1p2].read_io_errs     0
[/dev/nvme0n1p2].flush_io_errs    0
[/dev/nvme0n1p2].corruption_errs  0
[/dev/nvme0n1p2].generation_errs  0
[/dev/nvme2n1p2].write_io_errs    130766017
[/dev/nvme2n1p2].read_io_errs     137924767
[/dev/nvme2n1p2].flush_io_errs    5054
[/dev/nvme2n1p2].corruption_errs  2216
[/dev/nvme2n1p2].generation_errs  0
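
As a side note, non-zero counters in `btrfs device stats` output like
the above can be isolated mechanically.  A minimal sketch, run over a
few of the sample values copied from this report (no live filesystem
needed):

```shell
# Sample counters copied from the "prior to reboot" output above.
stats='[/dev/nvme0n1p2].write_io_errs    0
[/dev/nvme2n1p2].write_io_errs    130766017
[/dev/nvme2n1p2].read_io_errs     137924767
[/dev/nvme2n1p2].corruption_errs  2216'

# Print only the counters that are non-zero.
printf '%s\n' "$stats" | awk '$2 != 0 { print "non-zero:", $1, $2 }'
```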

After reboot + scrub + clear counters & second scrub:
-------------------------------------------------
[/dev/nvme0n1p2].write_io_errs    0
[/dev/nvme0n1p2].read_io_errs     0
[/dev/nvme0n1p2].flush_io_errs    0
[/dev/nvme0n1p2].corruption_errs  0
[/dev/nvme0n1p2].generation_errs  0
[/dev/nvme2n1p2].write_io_errs    0
[/dev/nvme2n1p2].read_io_errs     0
[/dev/nvme2n1p2].flush_io_errs    0
[/dev/nvme2n1p2].corruption_errs  0
[/dev/nvme2n1p2].generation_errs  0
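
For reference, the "clear counters & second scrub" step above
corresponds to the following btrfs-progs invocations (a sketch;
/mnt/data is a placeholder for the actual mount point, and errors are
ignored so the commands are harmless on a non-btrfs path):

```shell
# Placeholder mount point; substitute the real filesystem root.
MNT=${MNT:-/mnt/data}

if command -v btrfs >/dev/null 2>&1; then
    # Reset the per-device error counters shown above.
    btrfs device stats -z "$MNT" || true
    # Foreground scrub: -B blocks until completion, -d prints
    # per-device statistics.
    btrfs scrub start -Bd "$MNT" || true
fi
```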


SMART self-tests (short and extended completed successfully on both drives)
-------------------------------------------------
Drive1:
Self-test Log (NVMe Log 0x06)
Self-test status: No self-test in progress
Num  Test_Description  Status                       Power_on_Hours
Failing_LBA  NSID Seg SCT Code
 0   Extended          Completed without error               16373
       -     -   -   -    -
 1   Short             Completed without error               16373
       -     -   -   -    -
Drive2:
Self-test Log (NVMe Log 0x06)
Self-test status: No self-test in progress
Num  Test_Description  Status                       Power_on_Hours
Failing_LBA  NSID Seg SCT Code
 0   Extended          Completed without error               16369
       -     -   -   -    -
 1   Short             Completed without error               16368
       -     -   -   -    -

Justin
