Message-ID: <542B1220.8020208@bitsync.net>
Date: Tue, 30 Sep 2014 22:27:12 +0200
From: Zlatko Calusic <zcalusic@...sync.net>
To: Theodore Ts'o <tytso@....edu>
CC: "Darrick J. Wong" <darrick.wong@...cle.com>,
linux-ext4@...r.kernel.org
Subject: Re: e2fsck not fixing deleted inode referenced errors?
On 30.09.2014 21:54, Theodore Ts'o wrote:
> On Tue, Sep 30, 2014 at 08:43:04PM +0200, Zlatko Calusic wrote:
>> Full error message from the kernel log, together with data check I did in
>> the evening:
>>
>> Sep 29 05:07:51 atlas kernel: ata2.00: exception Emask 0x10 SAct 0x0 SErr
>> 0x4010000 action 0xe frozen
>> Sep 29 05:07:51 atlas kernel: ata2.00: irq_stat 0x00400040, connection
>> status changed
>> Sep 29 05:07:51 atlas kernel: ata2: SError: { PHYRdyChg DevExch }
>> Sep 29 05:07:51 atlas kernel: ata2.00: failed command: FLUSH CACHE EXT
>> Sep 29 05:07:51 atlas kernel: ata2.00: cmd
>> ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0\x0a res
>> 40/00:f4:e2:7f:14/00:00:3a:00:00/40 Emask 0x10 (ATA bus error)
>> Sep 29 05:07:51 atlas kernel: ata2.00: status: { DRDY }
>> Sep 29 05:07:51 atlas kernel: ata2: hard resetting link
>> Sep 29 05:07:57 atlas kernel: ata2: link is slow to respond, please be
>> patient (ready=0)
>> Sep 29 05:08:00 atlas kernel: ata2: SATA link up 3.0 Gbps (SStatus 123
>> SControl 300)
>> Sep 29 05:08:00 atlas kernel: ata2.00: configured for UDMA/133
>> Sep 29 05:08:00 atlas kernel: ata2.00: retrying FLUSH 0xea Emask 0x10
>> Sep 29 05:08:00 atlas kernel: ata2: EH complete
>
> That looks really bad; it sounds like you have a hardware error on at
> least one of your disks. Have you tried running badblocks on
> both disks to make sure the disk isn't flagging more bad blocks, and
> then resynchronizing the RAID 1 array? Then try running e2fsck again.
>
Yep, both disks are pretty old, somewhere around the end of their
warranty. Yet the interesting thing is that exactly this error (FLUSH
CACHE EXT) has happened from time to time, say once a year, but never
before has it got me into such trouble that e2fsck couldn't save the
day after one quick run.
I now remember Darrick also asked for smartctl data. Here it is:
/dev/sda
========
Power_On_Hours 40984
and only 2 SMART READ/WRITE LOG errors in the log, from a long time ago...
ATA Error Count: 2
Error 1 occurred at disk power-on lifetime: 14493 hours (603 days + 21
hours)
Error 2 occurred at disk power-on lifetime: 14493 hours (603 days + 21
hours)
Full: http://pastebin.com/GnQhACXf
/dev/sdb (I believe the disk responsible for the problem)
========
Power_On_Hours 40978
No Errors Logged
Full: http://pastebin.com/nUB2q0Tk
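
As a quick sanity check on the figures above, here is a minimal sketch
that converts the quoted power-on hours into days and approximate
years. The helper name hours_to_days and the 8766 hours-per-year
constant (365.25 days) are assumptions made for illustration only; they
are not part of smartctl.

```python
# Hypothetical helper: split a power-on hour count into days and hours.
def hours_to_days(hours):
    """Return (whole_days, leftover_hours) for a power-on hour count."""
    return divmod(hours, 24)

# Assumption: an average year of 365.25 days, i.e. 8766 hours.
HOURS_PER_YEAR = 8766

# Values copied from the smartctl excerpts quoted in this message.
readings = {
    "sda, logged ATA errors": 14493,
    "sda, Power_On_Hours": 40984,
    "sdb, Power_On_Hours": 40978,
}

for label, hours in readings.items():
    days, rest = hours_to_days(hours)
    years = hours / HOURS_PER_YEAR
    print(f"{label}: {hours} h = {days} days + {rest} hours (~{years:.1f} years)")
```

The first reading reproduces the "603 days + 21 hours" that smartctl
reports, which puts both logged errors about 26,500 power-on hours
before the current counter on /dev/sda.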
Unless you have other ideas, I will run badblocks. However, since the
ext4 fs is on /dev/md2, shouldn't I run it on /dev/md2 only? Do you
really mean to run it on the underlying devices, /dev/sda2 and
/dev/sdb2? I'm not sure how MD would cope with that.
But I'm pretty sure that it will come out clean. The md check I did
last night would surely have detected bad blocks if there were any. Or not?
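
One way to look at the outcome of such an md check is to read the
array's sysfs attributes afterwards. The sketch below is only an
illustration: the function name md_status and its sysfs_root parameter
are hypothetical, and it assumes the array is md2 and that the kernel
exposes the usual sync_action and mismatch_cnt files under
/sys/block/md2/md. Note that mismatch_cnt counts sectors that differed
between the mirrors; read errors on a member disk would appear in the
kernel log instead, so a zero here is not by itself proof that the
disks have no bad blocks.

```python
from pathlib import Path


def md_status(md_name, sysfs_root="/sys/block"):
    """Read the scrub state of an md array from sysfs.

    md_name and sysfs_root are illustrative parameters; sysfs_root is
    only configurable so the function can be pointed at a test tree.
    """
    md_dir = Path(sysfs_root) / md_name / "md"
    return {
        # "idle" once a check has finished, "check" while it is running.
        "sync_action": (md_dir / "sync_action").read_text().strip(),
        # Sectors found to differ between mirrors during the last check.
        "mismatch_cnt": int((md_dir / "mismatch_cnt").read_text().strip()),
    }


if __name__ == "__main__":
    try:
        print(md_status("md2"))  # assumption: the array discussed here
    except (OSError, ValueError) as exc:
        print(f"could not read md2 status: {exc}")
```

Reading these files does not start a check and does not modify the
array; it only reports what the last check left behind.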
Thanks for your help!
--
Zlatko