Message-ID: <alpine.LNX.2.00.0809301946020.16014@filesrv1.baby-dragons.com>
Date: Tue, 30 Sep 2008 19:50:09 -0800 (AKDT)
From: "Mr. James W. Laferriere" <babydr@...y-dragons.com>
To: Justin Piszcz <jpiszcz@...idpixels.com>
cc: Tom Mortensen <tmmlkml@...il.com>, Tejun Heo <tj@...nel.org>,
Bill Davidsen <davidsen@....com>,
Gwendal Grignou <gwendal@...gle.com>,
Brian Rademacher <rad@...files.net>, linux-ide@...r.kernel.org,
linux-raid maillist <linux-raid@...r.kernel.org>,
Linux Kernel Maillist <linux-kernel@...r.kernel.org>,
Bruce Allen <ballen@...vity.phys.uwm.edu>
Subject: Re: exception Emask 0x0 SAct 0x1 / SErr 0x0 action 0x2 frozen
Hello Justin ,
On Tue, 30 Sep 2008, Justin Piszcz wrote:
> On Tue, 30 Sep 2008, Tom Mortensen wrote:
>
>> Don't know if this is the original poster's problem, but if the drive
>> is spun down, then enabling SMART or trying to read SMART attributes
>> causes the drive to spin up and the command is delayed until this has
>> occurred.
>>
>> The fix is to increase the timeout given to scsi_execute() in
>> drivers/ata/libata-scsi.c.
>>
>> i.e., the current code (2.6.26.5) is:
>>
>> 	/* Good values for timeout and retries?  Values below
>> 	   from scsi_ioctl_send_command() for default case... */
>> 	cmd_result = scsi_execute(scsidev, scsi_cmd, data_dir, argbuf, argsize,
>> 	                          sensebuf, (10*HZ), 5, 0);
>>
>> Should be changed to:
>>
>> 	/* Good values for timeout and retries?  Values below
>> 	   from scsi_ioctl_send_command() for default case... */
>> 	cmd_result = scsi_execute(scsidev, scsi_cmd, data_dir, argbuf, argsize,
>> 	                          sensebuf, (30*HZ), 5, 0);
>>
>> Using a 1TB Hitachi hard drive, this command times out because it
>> takes this drive about 15 seconds to spin up. Virtually all hard
>> drives spin up in less than 30 sec, but perhaps make this higher in
>> case there are slower drives out there?
>>
>> Cheers,
>> Tom
>
> Velociraptor 10k drive here (2.6.26.5):
>
> Sep 30 15:55:06 p34 kernel: [420781.333179] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> Sep 30 15:55:06 p34 kernel: [420781.333189] ata6.00: cmd b0/d8:00:00:4f:c2/00:00:00:00:00/00 tag 0
> Sep 30 15:55:06 p34 kernel: [420781.333190]          res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> Sep 30 15:55:06 p34 kernel: [420781.333194] ata6.00: status: { DRDY }
> Sep 30 15:55:06 p34 kernel: [420781.333200] ata6: hard resetting link
> Sep 30 15:55:06 p34 kernel: [420781.638589] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> Sep 30 15:55:06 p34 kernel: [420781.662166] ata6.00: configured for UDMA/133
> Sep 30 15:55:06 p34 kernel: [420781.669416] sd 5:0:0:0: [sdf] Write Protect is off
> Sep 30 15:55:06 p34 kernel: [420781.669416] sd 5:0:0:0: [sdf] Mode Sense: 00 3a 00 00
> Sep 30 15:55:06 p34 kernel: [420781.669416] sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>
> Nothing wrong with the disk, it just happens... :( Linux/kernel bug?
> It happens on multiple controllers, Intel, SiI, Marvell, does not seem to
> matter.
>
> SMART Self-test log structure revision number 1
> Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
> # 1  Short offline       Completed without error       00%      2761         -
> # 2  Short offline       Completed without error       00%      2737         -
> # 3  Extended offline    Completed without error       00%      2714         -
> # 4  Short offline       Completed without error       00%      2689         -
> # 5  Extended offline    Completed without error       00%      2514         -
> # 6  Short offline       Completed without error       00%      2306         -
> # 7  Short offline       Completed without error       00%      2282         -
> # 8  Short offline       Completed without error       00%      2258         -
> # 9  Short offline       Completed without error       00%      2234         -
> #10  Extended offline    Completed without error       00%      2211         -
> #11  Short offline       Completed without error       00%      2186         -
> #12  Short offline       Completed without error       00%      2138         -
> #13  Short offline       Completed without error       00%      2114         -
> #14  Short offline       Completed without error       00%      2090         -
> #15  Short offline       Completed without error       00%      2066         -
> #16  Extended offline    Completed without error       00%      2043         -
> #17  Short offline       Completed without error       00%      2018         -
> #18  Short offline       Completed without error       00%      1970         -
> #19  Short offline       Completed without error       00%      1947         -
> #20  Short offline       Completed without error       00%      1923         -
> #21  Short offline       Completed without error       00%      1899         -
>
>
> Justin.
I take it you've tried different drive manufacturers?
Or even a different drive from the same manufacturer?
Seeing as you've moved this same drive(?) across several chipsets and
possibly motherboards, that leads me to believe the difficulty is either with
the driver or the drive (if it is always the same drive or drive model).
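
One way to narrow it down is to decode the command that timed out in the log quoted above. What follows is only a minimal sketch, not kernel code: it assumes the first group of the libata "cmd" line is laid out as command/feature:nsect:lbal:lbam:lbah, and the names tf_fields, parse_tf and describe_tf are made up for illustration. Under that assumption, "b0/d8:00:00:4f:c2" decodes as ATA command 0xB0 (SMART) with feature 0xD8 (SMART ENABLE OPERATIONS), so the command that timed out was a SMART command, the kind Tom describes.

```c
#include <assert.h>
#include <stdio.h>

/* First six taskfile bytes of a libata "cmd"/"res" line.
 * Assumed layout: command/feature:nsect:lbal:lbam:lbah */
struct tf_fields {
	unsigned int command, feature, nsect, lbal, lbam, lbah;
};

/* Parse e.g. "b0/d8:00:00:4f:c2/00:00:00:00:00/00".
 * Only the first group is read; returns 0 on success, -1 if malformed. */
static int parse_tf(const char *s, struct tf_fields *tf)
{
	assert(s != NULL && tf != NULL);
	int n = sscanf(s, "%2x/%2x:%2x:%2x:%2x:%2x",
		       &tf->command, &tf->feature, &tf->nsect,
		       &tf->lbal, &tf->lbam, &tf->lbah);
	return n == 6 ? 0 : -1;
}

/* Name a few common SMART subcommands (ATA command 0xB0).
 * SMART commands carry the signature 0x4F/0xC2 in LBA mid/high. */
static const char *describe_tf(const struct tf_fields *tf)
{
	if (tf->command != 0xb0)
		return "not a SMART command";
	if (tf->lbam != 0x4f || tf->lbah != 0xc2)
		return "SMART command without the 4F/C2 signature";
	switch (tf->feature) {
	case 0xd0: return "SMART READ DATA";
	case 0xd8: return "SMART ENABLE OPERATIONS";
	case 0xd9: return "SMART DISABLE OPERATIONS";
	case 0xda: return "SMART RETURN STATUS";
	default:   return "other SMART subcommand";
	}
}

#ifdef DECODE_DEMO_MAIN
/* Optional demo: build with -DDECODE_DEMO_MAIN to decode the logged command. */
int main(void)
{
	struct tf_fields tf;

	if (parse_tf("b0/d8:00:00:4f:c2/00:00:00:00:00/00", &tf) == 0)
		printf("%s\n", describe_tf(&tf));
	return 0;
}
#endif
```

If the failing command always decodes to a SMART subcommand, it would be worth checking whether the timeouts line up with smartd polls or manual smartctl runs before swapping more hardware.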
Hth , JimL
--
+------------------------------------------------------------------+
| James W. Laferriere | System Techniques | Give me VMS |
| Network&System Engineer | 2133 McCullam Ave | Give me Linux |
| babydr@...y-dragons.com | Fairbanks, AK. 99701 | only on AXP |
+------------------------------------------------------------------+