Date:	Mon, 15 Sep 2008 13:43:30 -0700
From:	Tejun Heo <tj@...nel.org>
To:	Bruno Prémont <bonbons@...ux-vserver.org>
CC:	Linux Kernel <linux-kernel@...r.kernel.org>,
	linux-ide@...r.kernel.org, Jeff Garzik <jgarzik@...ox.com>
Subject: Re: XFS shutting down due to IO timeout on SATA disk (pata_via for
 CX700)

Hello,

Bruno Prémont wrote:
> On Mon, 15 September 2008 Tejun Heo <tj@...nel.org> wrote:
>> (please try to wrap paragraphs for 80 column)
> I try not to break lines from dmesg, lspci and other commands'
> (formatted) output as those tend to get pretty hard to read when
> line-wrapped.  Sorry if I wrapped my text after 80 columns.

Yep, I was talking only about the text.  Not wrapping command output
and code snippets is definitely better.

>> Timeout on FLUSH_EXT.  That's a bad sign.  Patch to retry FLUSH is
>> pending but at any rate FLUSH failure is often accompanied by loss of
>> data and XFS is doing the right thing of giving up on it.
>>
>> Can you please post the result of "smartctl -a /dev/sda"?
> I checked it though there were no errors logged nor any other information
> that would catch attention. The disk/machine is pretty unused (a year old
> but low uptime, a few hours those days with uptime)
> 
> Anyhow, smartctl's output is below.
> 

>   5 Reallocated_Sector_Ct   0x0033   100   100   024    Pre-fail  Always       -       8589934592000

Whee... That's an unusually high realloc count, but I'm not sure whether
it indicates an actual problem or is just the drive's way of saying I'm
okay.  But this does look quite suspicious.

> 196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       441778176

Hmmm.. Do you happen to have drives of the same model?  If so, can you
please check what other drives are reporting?
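
One way to look at the two raw values quoted above is to split each
48-bit raw field into 16-bit words.  The sketch below does only that
arithmetic; split_raw48 is a made-up helper name, and the idea that this
drive packs several fields into the raw value is an assumption, not
something the SMART output itself confirms.

```python
def split_raw48(raw: int) -> tuple[int, int, int]:
    """Split a 48-bit SMART raw value into (high, mid, low) 16-bit words."""
    if not 0 <= raw < (1 << 48):
        raise ValueError("raw value does not fit in 48 bits")
    high = (raw >> 32) & 0xFFFF  # bits 47..32
    mid = (raw >> 16) & 0xFFFF   # bits 31..16
    low = raw & 0xFFFF           # bits 15..0
    return high, mid, low


# Raw values copied from the smartctl lines quoted in this mail.
for name, raw in (
    ("Reallocated_Sector_Ct", 8589934592000),
    ("Reallocated_Event_Count", 441778176),
):
    print(name, split_raw48(raw))
# Reallocated_Sector_Ct (2000, 0, 0)
# Reallocated_Event_Count (0, 6741, 0)
```

In both cases the low 16-bit word is zero, so the large decimal numbers
may not be plain counters.  That is only a hint; comparing against
another drive of the same model is still the better check.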

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
