Message-ID: <20240104043813.GC108362@mit.edu>
Date: Wed, 3 Jan 2024 23:38:13 -0500
From: "Theodore Ts'o" <tytso@....edu>
To: "Brian J. Murrell" <brian@...erlinx.bc.ca>
Cc: linux-ext4@...r.kernel.org
Subject: Re: e2scrub finds corruption immediately after mounting

On Wed, Jan 03, 2024 at 04:14:36PM -0500, Brian J. Murrell wrote:
> I am trying to migrate from lvcheck
> (https://github.com/BryanKadzban/lvcheck) to using the officially
> supported e2scrub[_all] kit.

What distribution are you using, and what version of the kernel are
you using?  I note that you are using e2fsprogs 1.45.6, and Debian
Stable is shipping with e2fsprogs 1.47.0.

That being said, this is the first time I've seen any report of an
issue like what you've reported..

> # e2scrub /dev/rootvol_tmp/almalinux8_opt
>   Logical volume "almalinux8_opt.e2scrub" created.
> e2fsck 1.45.6 (20-Mar-2020)
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> /dev/rootvol_tmp/almalinux8_opt.e2scrub: 1698/178816 files (86.9% non-
> contiguous), 482404/716800 blocks
> /dev/rootvol_tmp/almalinux8_opt: Scrub FAILED due to corruption!

This error means that e2fsck exited with a non-zero exit status.
Which is strange because there is no report of any kind of problem
from e2fsck in its output.

From the e2scrub script:

check() {
    # First we recover the journal, then we see if e2fsck tries any
    # non-optimization repairs.  If either of these two returns a
    # non-zero status (errors fixed or remaining) then this fs is bad.
    E2FSCK_FIXES_ONLY=1
    export E2FSCK_FIXES_ONLY
    ${DBG} "@root_sbindir@...fsck" -E journal_only -p ${e2fsck_opts} "${snap_dev}" || return $?
    ${DBG} "@root_sbindir@...fsck" -f -y ${e2fsck_opts} "${snap_dev}"
}

...
check
case "$?" in
"0")
    # Clean check!
    echo "${arg}: Scrub succeeded."
    ...
"8")
    # Operational error, what now?
    echo "${arg}: e2fsck operational error."
    ...
*)
    # fsck failed.  Check if the snapshot is invalid; if so, make a
    # note of that at the end of the log.  This isn't necessarily a
    # failure because the mounted fs could have overflowed the
    # snapshot with regular disk writes /or/ our repair process
    # could have done it by repairing too much.
    #
    # If it's really corrupt we ought to fsck at next boot.
    is_invalid="$(lvs -o lv_snapshot_invalid --noheadings "${snap_dev}" | awk '{print $1}')"
    if [ -n "${is_invalid}" ]; then
        echo "${arg}: Scrub FAILED due to invalid snapshot."
        ret=8
    else
        echo "${arg}: Scrub FAILED due to corruption!  Unmount and run e2fsck -y."
        mark_corrupt
        ret=6
    fi
    ...

My best guess is that e2fsck from 1.45.6 is somehow returning a
non-zero exit status for some reason.  So the first thing I'd suggest
is upgrading to e2fsprogs 1.47.0 and see if that causes the problem to
resolve itself.

Cheers,

					- Ted
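
Since the e2scrub output above does not show which non-zero status e2fsck returned, one way to narrow it down is to decode the status by hand. What follows is a minimal sketch, not part of e2scrub or e2fsprogs: decode_e2fsck_status is a hypothetical helper name, and the bit meanings are assumed to match the exit-code table in the e2fsck(8) man page.

```shell
#!/bin/sh
# decode_e2fsck_status: hypothetical helper (not shipped with e2fsprogs).
# Translates an e2fsck exit status, which is a sum of flag bits, into text.
# The bit meanings below are assumed from the e2fsck(8) man page.
decode_e2fsck_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    msgs=""
    for entry in \
        "1:file system errors corrected" \
        "2:system should be rebooted" \
        "4:file system errors left uncorrected" \
        "8:operational error" \
        "16:usage or syntax error" \
        "32:canceled by user request" \
        "128:shared library error"
    do
        bit=${entry%%:*}    # number before the first colon
        text=${entry#*:}    # description after the first colon
        if [ $(( status & bit )) -ne 0 ]; then
            # Join multiple set bits with "; ".
            msgs="${msgs:+$msgs; }$text"
        fi
    done
    echo "$msgs"
}

# Example with a literal status; 12 = 4 + 8.
decode_e2fsck_status 12
```

To use it against a real run, one could execute the second command from check() by hand on a snapshot device and then call decode_e2fsck_status "$?" immediately afterwards, before any other command overwrites the status.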