Message-ID: <20110412005526.3015a238@natsu>
Date: Tue, 12 Apr 2011 00:55:26 +0600
From: Roman Mamedov <rm@...anrm.ru>
To: Ted Ts'o <tytso@....edu>
Cc: Andreas Dilger <adilger@...ger.ca>, linux-ext4@...r.kernel.org,
linux-raid@...r.kernel.org
Subject: Re: tune2fs can't be used on a mounted ext4, or...?
On Mon, 11 Apr 2011 09:10:08 -0400
Ted Ts'o <tytso@....edu> wrote:
> Your symptoms don't sound familiar to me, other than the standard
> concerns about hardware induced file system inconsistency problems.
The thing is, I do not observe any random in-file data corruption, which would
point to a problem at a lower (block-device) level, so I do not think it is a
RAID or HDD problem.
The breakage seemed to be at the filesystem logic level, perhaps something to
do with allocation of space for new files? And since immediately before that I
had performed two operations possibly affecting it (changing the stride size
with tune2fs, plus an online grow with resize2fs), I thought this might be an
ext4 problem.
While still in the same session, I re-copied the affected files, replacing
their "shortened" copies, and they were written out fine the second time.
After a reboot, no more file truncations have been observed so far.
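A minimal sketch of how "shortened" copies like these could be spotted, assuming
the original files are still available in a separate source tree. The function
name find_short_copies and both directory arguments are hypothetical and are not
part of the original report; it only compares sizes, not contents.

```shell
#!/bin/sh
# Hypothetical helper: list files whose copy under DST has a different size
# than the original under SRC (a missing copy is reported with size -1).
find_short_copies() {
    src=$1
    dst=$2
    ( cd "$src" && find . -type f ) | while IFS= read -r rel; do
        # Size in bytes of the original; tr strips padding some wc builds add.
        s=$(wc -c < "$src/$rel" | tr -d ' ')
        if [ -f "$dst/$rel" ]; then
            d=$(wc -c < "$dst/$rel" | tr -d ' ')
        else
            d=-1
        fi
        if [ "$s" -ne "$d" ]; then
            printf '%s: source=%s copy=%s\n' "$rel" "$s" "$d"
        fi
    done
}

# Usage (paths are placeholders):
#   find_short_copies /path/to/originals /path/to/copies
```

A size comparison only catches truncation; a checksum pass would be needed to
rule out corruption inside files of the correct length.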
> Have you checked your logs carefully to make sure there weren't any
> hardware errors reported?
No, there weren't any errors in dmesg, or on the same console where 'cp' would
output its errors.
> If this is a hardware RAID system, is it regularly doing disk scrubbing?
> Has the hardware RAID reported anything unusual? How long have you been
> running in a degraded RAID 6 state?
It is an mdadm RAID6, and it does not report any problem. It was running in a
degraded state for only a short time (less than a day). And AFAIK running
degraded with a single missing disk is not a dangerous or risky situation with
RAID6.
> And have you tried shutting down the system and running fsck to make
> sure there weren't any file system corruption problems? When's the
> last time you've run fsck on the system?
I have unmounted it and run fsck just now. Admittedly, it had been a long time
since the last fsck.
# e2fsck /dev/md0
e2fsck 1.41.12 (17-May-2010)
/dev/md0 has gone 306 days without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/md0: 367107/364412928 files (4.3% non-contiguous), 1219229259/1457626752
blocks
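For reference, the two used/total pairs in the summary line above can be turned
into percentages with a small filter. This is only a sketch: the function name
fsck_usage is hypothetical, and it assumes the summary arrives on one line in
the "used/total files ... used/total blocks" form shown here.

```shell
#!/bin/sh
# Hypothetical helper: read an e2fsck summary line on stdin and print how much
# of the inode table ("files") and of the block space ("blocks") is in use.
fsck_usage() {
    awk '{
        for (i = 2; i <= NF; i++) {
            if ($i == "files" || $i == "blocks") {
                # The preceding field has the form used/total.
                split($(i - 1), n, "/")
                if (n[2] > 0)
                    printf "%s: %.1f%% used\n", $i, 100 * n[1] / n[2]
            }
        }
    }'
}

# Usage:
#   e2fsck -fn /dev/md0 | fsck_usage
```

Fed the summary above as a single line, this reports roughly 0.1% of inodes and
83.6% of blocks in use.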
> If this is an LVM system, I'd strongly suggest that you set aside
> space you can take a snapshot, and then regularly take a snapshot, and
> then run fsck on the snapshot. If any problems are noted, you can
> then schedule downtime and fsck the entire system.
No, I don't use LVM there.
--
With respect,
Roman