linux-kernel - Re: exception Emask 0x0 SAct 0x1 / SErr 0x0 action 0x2 frozen

Message-ID: <alpine.LNX.2.00.0809301946020.16014@filesrv1.baby-dragons.com>
Date:	Tue, 30 Sep 2008 19:50:09 -0800 (AKDT)
From:	"Mr. James W. Laferriere" <babydr@...y-dragons.com>
To:	Justin Piszcz <jpiszcz@...idpixels.com>
cc:	Tom Mortensen <tmmlkml@...il.com>, Tejun Heo <tj@...nel.org>,
	Bill Davidsen <davidsen@....com>,
	Gwendal Grignou <gwendal@...gle.com>,
	Brian Rademacher <rad@...files.net>, linux-ide@...r.kernel.org,
	linux-raid maillist <linux-raid@...r.kernel.org>,
	Linux Kernel Maillist <linux-kernel@...r.kernel.org>,
	Bruce Allen <ballen@...vity.phys.uwm.edu>
Subject: Re: exception Emask 0x0 SAct 0x1 / SErr 0x0 action 0x2 frozen

 	Hello Justin ,

On Tue, 30 Sep 2008, Justin Piszcz wrote:
> On Tue, 30 Sep 2008, Tom Mortensen wrote:
>
>> Don't know if this is the original poster's problem, but if the drive
>> is spun down, then enabling SMART or trying to read SMART attributes
>> causes the drive to spin up and the command is delayed until this has
>> occurred.
>> 
>> The fix is to increase the timeout given to scsi_execute() in
>> drivers/ata/libata-scsi.c.
>> 
>> ie, current code (2.6.26.5) is:
>>
>>        /* Good values for timeout and retries?  Values below
>>           from scsi_ioctl_send_command() for default case... */
>>        cmd_result = scsi_execute(scsidev, scsi_cmd, data_dir, argbuf, argsize,
>>                                  sensebuf, (10*HZ), 5, 0);
>> 
>> Should be changed to:
>>
>>        /* Good values for timeout and retries?  Values below
>>           from scsi_ioctl_send_command() for default case... */
>>        cmd_result = scsi_execute(scsidev, scsi_cmd, data_dir, argbuf, argsize,
>>                                  sensebuf, (30*HZ), 5, 0);
>> 
>> Using a 1TB Hitachi hard drive, this command times out because it
>> takes this drive about 15 seconds to spin up.  Virtually all hard
>> drives spin up in less than 30 sec, but perhaps make this higher in
>> case there are slower drives out there?
>> 
>> Cheers,
>> Tom
>
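
To put numbers on Tom's point, here is a rough sketch in C, outside the kernel, of the comparison the timeout argument implies. The HZ value of 250 and the helper name would_time_out() are assumptions for illustration only; the 10*HZ and 30*HZ timeouts and the roughly 15 second spin-up are the figures quoted above.

```c
#include <stdio.h>

/* HZ is a kernel build-time constant; 250 is only an assumed example value. */
#define HZ 250

/*
 * Hypothetical helper, not kernel code: returns 1 if a command that needs
 * spinup_secs seconds to complete would exceed a timeout given in jiffies,
 * 0 otherwise.
 */
static int would_time_out(unsigned long timeout_jiffies, unsigned int spinup_secs)
{
	return (unsigned long)spinup_secs * HZ > timeout_jiffies;
}
```

With a 15 second spin-up, would_time_out(10 * HZ, 15) is true and would_time_out(30 * HZ, 15) is false, whatever value HZ is built with.
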
> Velociraptor 10k drive here (2.6.26.5):
>
> Sep 30 15:55:06 p34 kernel: [420781.333179] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> Sep 30 15:55:06 p34 kernel: [420781.333189] ata6.00: cmd b0/d8:00:00:4f:c2/00:00:00:00:00/00 tag 0
> Sep 30 15:55:06 p34 kernel: [420781.333190]          res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> Sep 30 15:55:06 p34 kernel: [420781.333194] ata6.00: status: { DRDY }
> Sep 30 15:55:06 p34 kernel: [420781.333200] ata6: hard resetting link
> Sep 30 15:55:06 p34 kernel: [420781.638589] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> Sep 30 15:55:06 p34 kernel: [420781.662166] ata6.00: configured for UDMA/133
> Sep 30 15:55:06 p34 kernel: [420781.669416] sd 5:0:0:0: [sdf] Write Protect is off
> Sep 30 15:55:06 p34 kernel: [420781.669416] sd 5:0:0:0: [sdf] Mode Sense: 00 3a 00 00
> Sep 30 15:55:06 p34 kernel: [420781.669416] sd 5:0:0:0: [sdf] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>
> Nothing wrong with the disk, it just happens... :(  Linux/kernel bug?
> It happens on multiple controllers, Intel, SiI, Marvell, does not seem to
> matter.
>
> SMART Self-test log structure revision number 1
> Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
> # 1  Short offline       Completed without error       00%      2761         -
> # 2  Short offline       Completed without error       00%      2737         -
> # 3  Extended offline    Completed without error       00%      2714         -
> # 4  Short offline       Completed without error       00%      2689         -
> # 5  Extended offline    Completed without error       00%      2514         -
> # 6  Short offline       Completed without error       00%      2306         -
> # 7  Short offline       Completed without error       00%      2282         -
> # 8  Short offline       Completed without error       00%      2258         -
> # 9  Short offline       Completed without error       00%      2234         -
> #10  Extended offline    Completed without error       00%      2211         -
> #11  Short offline       Completed without error       00%      2186         -
> #12  Short offline       Completed without error       00%      2138         -
> #13  Short offline       Completed without error       00%      2114         -
> #14  Short offline       Completed without error       00%      2090         -
> #15  Short offline       Completed without error       00%      2066         -
> #16  Extended offline    Completed without error       00%      2043         -
> #17  Short offline       Completed without error       00%      2018         -
> #18  Short offline       Completed without error       00%      1970         -
> #19  Short offline       Completed without error       00%      1947         -
> #20  Short offline       Completed without error       00%      1923         -
> #21  Short offline       Completed without error       00%      1899         -
>
>
> Justin.
 	I take it you've tried different drive manufacturers ?
 	Or even a different drive of the same manuf. ?
 	Seeing as you've moved this same drive(?) across several chipsets &
possibly motherboards , that leads me to believe the difficulty is either with
the driver or the drive (if it is always the same drive or drive model) .
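
 	For reference, the "cmd" line in the quoted log can be decoded by hand. libata prints the taskfile as command/feature:nsect:lbal:lbam:lbah, followed by the HOB bytes and the device register. A minimal userspace sketch in C follows; the struct and helper names are made up for illustration and are not libata's own.

```c
#include <stdio.h>

/* First six fields of the "cmd" line in a libata exception report,
 * in the order they are printed. */
struct tf_cmd {
	unsigned int command, feature, nsect, lbal, lbam, lbah;
};

/*
 * Hypothetical helper: parse a string such as "b0/d8:00:00:4f:c2/..."
 * into a tf_cmd. Returns 0 on success, -1 if the first six hex fields
 * are not all present.
 */
static int parse_tf_cmd(const char *s, struct tf_cmd *tf)
{
	int n = sscanf(s, "%x/%x:%x:%x:%x:%x", &tf->command, &tf->feature,
		       &tf->nsect, &tf->lbal, &tf->lbam, &tf->lbah);

	return n == 6 ? 0 : -1;
}

/*
 * ATA SMART is command 0xB0, keyed by LBA mid/high = 0x4F/0xC2;
 * feature 0xD8 selects SMART ENABLE OPERATIONS.
 */
static int is_smart_enable(const struct tf_cmd *tf)
{
	return tf->command == 0xb0 && tf->feature == 0xd8 &&
	       tf->lbam == 0x4f && tf->lbah == 0xc2;
}
```

 	Fed the string from Justin's log, b0/d8:00:00:4f:c2/00:00:00:00:00/00, parse_tf_cmd() fills in command 0xb0 and feature 0xd8, and is_smart_enable() reports a match. So the command that timed out is a SMART ENABLE OPERATIONS request, which fits Tom's description.
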

 		Hth ,  JimL
-- 
+------------------------------------------------------------------+
| James   W.   Laferriere | System    Techniques | Give me VMS     |
| Network&System Engineer | 2133    McCullam Ave |  Give me Linux  |
| babydr@...y-dragons.com | Fairbanks, AK. 99701 |   only  on  AXP |
+------------------------------------------------------------------+