linux-ext4 - Re: [PATCH 0/2] e2fsprogs: update mkfs defaults


Message-Id: <AC6734D9-917B-419C-9308-606C919D5F5D@dilger.ca>
Date:	Wed, 16 Feb 2011 15:12:50 -0700
From:	Andreas Dilger <adilger@...ger.ca>
To:	Eric Sandeen <sandeen@...hat.com>
Cc:	ext4 development <linux-ext4@...r.kernel.org>
Subject: Re: [PATCH 0/2] e2fsprogs: update mkfs defaults

On 2011-02-16, at 11:12, Eric Sandeen wrote:
> Anaconda (the Fedora/RHEL installer) had been "fixing up" extN filesystems it created by setting the max mount count and check interval to 0, as well as adding user_xattr to filesystem mount options.
> 
> As part of their efforts to stop special-casing around upstream defaults, they've removed these changes upstream.
> 
> However, I'd like to at least propose that these changes be made default.

I'd really prefer that the "lvcheck" script be included in the distro, instead of changing mke2fs.  That achieves the same end result (periodic scrubbing of the filesystem to look for hidden errors) without introducing boot-time delays.  Given the size of disks today and the undetected bit-error rate (somewhere around 1/10^15 bits or 12TB), I think it is important that there be automated scrubbing of the filesystem.
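As a rough check of the arithmetic behind that figure, the sketch below converts an error rate of one bit in 10^N into the amount of data read per expected error. It assumes decimal terabytes (1 TB = 10^12 bytes); the helper name is made up for illustration. By this conversion 10^15 bits is about 125 TB and 12.5 TB corresponds to 10^14 bits, so the two numbers quoted above differ by a factor of ten. Either way, the volume is within reach of the data a busy server reads over its lifetime.

```python
def bits_to_terabytes(bits: float) -> float:
    """Convert a bit count to decimal terabytes (1 TB = 10**12 bytes)."""
    return bits / 8 / 1e12


if __name__ == "__main__":
    # Compare the two candidate error rates: 1 in 10^14 and 1 in 10^15 bits.
    for exponent in (14, 15):
        tb = bits_to_terabytes(10 ** exponent)
        print(f"1 error per 10^{exponent} bits ~ one error per {tb:g} TB read")
```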

I think the best place to put that script would be in the lvm tools (since it is applicable to multiple filesystems), which I think Eric has the most leverage in getting accepted (I've been [...]), but I'd be OK including it with e2fsprogs if there is pushback on that.

> The forced fsck often comes at unexpected and inopportune moments, and even enterprise customers are often caught by surprise when this happens.  Because a filesystem with an error condition will be marked as requiring fsck anyway,

Any decent RAID array does background scrubbing for integrity verification; it doesn't just wait until there is an uncorrectable error detected in the block device.  If we can do something proactive to prevent this (i.e. lvcheck run by cron.weekly), it is worthwhile.

I think customers are equally surprised when their server fails (remount-ro/panic) due to the kernel detecting an error that might have been on disk for weeks or months.
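To make the idea of a proactive background check concrete, here is a minimal sketch of checking an LVM snapshot rather than the live filesystem. It is not the attached lvcheck script and does not reproduce its options. It assumes an ext3/4 filesystem on a logical volume, enough free space in the volume group for a snapshot, and root privileges; the function names, the snapshot name suffix, and the 1G snapshot size are illustrative assumptions.

```python
import subprocess


def snapshot_check_commands(vg, lv, snap_size="1G"):
    """Build the commands for a read-only fsck of a snapshot of /dev/<vg>/<lv>."""
    snap = f"{lv}-fsck-snap"          # hypothetical snapshot name
    origin = f"/dev/{vg}/{lv}"
    snap_dev = f"/dev/{vg}/{snap}"
    return [
        # Create a copy-on-write snapshot so the origin can stay mounted.
        ["lvcreate", "--snapshot", "--size", snap_size, "--name", snap, origin],
        # -f forces a full check, -n opens read-only and answers "no" to all fixes.
        ["e2fsck", "-f", "-n", snap_dev],
        # Drop the snapshot once the check has finished.
        ["lvremove", "--force", snap_dev],
    ]


def check_volume(vg, lv, snap_size="1G"):
    """Run the check; return True if e2fsck reported a clean filesystem."""
    create, fsck, remove = snapshot_check_commands(vg, lv, snap_size)
    subprocess.run(create, check=True)
    try:
        # e2fsck exits with 0 only when no errors were found.
        return subprocess.run(fsck).returncode == 0
    finally:
        subprocess.run(remove, check=True)
```

A weekly cron job could call something like `check_volume("vg0", "root")` (volume names assumed) and notify the administrator when it returns False, so the full fsck can be scheduled instead of arriving as a surprise at boot.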

> I submit that the time-based and mount-based checks are not particularly useful, and that administrators can schedule fscks on their own time, or tune2fs the enforced intervals if they so choose.

I think you are projecting your own self-enlightenment onto users ;-).  As we see on this list, there are many users that don't even back up their critical data, so IMHO taking out "safe by default" options is a step in the wrong direction.

Attached is my latest version of the lvcheck script, and a default /etc/lvcheck.conf file.  It's been enhanced to include a usage message, command-line option parsing to override default parameters, and the ability to check snapshots of ext3/4 filesystems with an external journal.

Cheers, Andreas

Download attachment "lvcheck" of type "application/octet-stream" (12575 bytes)

Download attachment "lvcheck.conf" of type "application/octet-stream" (1213 bytes)