lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20080509005111.7cf55350@werewolf.home>
Date:	Fri, 9 May 2008 00:51:11 +0200
From:	"J.A. Magallón" <jamagallon@....com>
To:	Linus-IDE <linux-ide@...r.kernel.org>,
	Linux-Kernel <linux-kernel@...r.kernel.org>
Subject: The never enfind DRDY saga

Hi all...

I'm getting again my friendly DRDY errors, this time on the new disk I got
to replace the old one (that fortunately was still under warranty...).
This is a brand new disk, it came perfectly packaged...
Now it is not a Seagate, but a WD:

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD3200AVJS-63WDA0
Serial Number:    WD-WCARW2347788
Firmware Version: 12.01B02
User Capacity:    320,072,933,376 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Fri May  9 00:33:06 2008 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

I managed to find a reliable way to reproduce the errors.
Copying about 7Gb of data from the SATA to a SCSI disk gave me 3 errors:

ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata3.00: cmd c8/00:00:b7:e9:0f/00:00:00:00:00/e2 tag 0 dma 131072 in
         res 40/00:00:09:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
ata3.00: status: { DRDY }
ata3: soft resetting link
ata3.00: configured for UDMA/133
ata3: EH complete
sd 4:0:0:0: [sdd] 625142448 512-byte hardware sectors (320073 MB)
sd 4:0:0:0: [sdd] Write Protect is off
sd 4:0:0:0: [sdd] Mode Sense: 00 3a 00 00
sd 4:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

I got the SMART status (smartctl -a /dev/sdd > s?) before and after the errors,
and it shows no problem:

werewolf:~# diff -u s1 s2
--- s1  2008-05-09 00:33:06.000000000 +0200
+++ s2  2008-05-09 00:36:02.000000000 +0200
@@ -9,7 +9,7 @@
 Device is:        Not in smartctl database [for details use: -P showall]
 ATA Version is:   8
 ATA Standard is:  Exact ATA specification draft version not indicated
-Local Time is:    Fri May  9 00:33:06 2008 CEST
+Local Time is:    Fri May  9 00:36:01 2008 CEST
 SMART support is: Available - device has SMART capability.
 SMART support is: Enabled

@@ -53,16 +53,16 @@
 Vendor Specific SMART Attributes with Thresholds:
 ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
   1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always       -       78
-  3 Spin_Up_Time            0x0003   207   159   021    Pre-fail  Always       -       2633
-  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       183
+  3 Spin_Up_Time            0x0003   218   159   021    Pre-fail  Always       -       2091
+  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       184
   5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
   7 Seek_Error_Rate         0x000e   200   200   051    Old_age   Always       -       0
   9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       787
  10 Spin_Retry_Count        0x0012   100   100   051    Old_age   Always       -       0
  11 Calibration_Retry_Count 0x0012   100   100   051    Old_age   Always       -       0
- 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       183
-192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       172
-193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       193
+ 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       184
+192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       173
+193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       194
 194 Temperature_Celsius     0x0022   102   098   000    Old_age   Always       -       45
 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
 197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       36

werewolf:~# diff -u s2 s3
--- s2  2008-05-09 00:36:02.000000000 +0200
+++ s3  2008-05-09 00:37:42.000000000 +0200
@@ -9,7 +9,7 @@
 Device is:        Not in smartctl database [for details use: -P showall]
 ATA Version is:   8
 ATA Standard is:  Exact ATA specification draft version not indicated
-Local Time is:    Fri May  9 00:36:01 2008 CEST
+Local Time is:    Fri May  9 00:37:42 2008 CEST
 SMART support is: Available - device has SMART capability.
 SMART support is: Enabled

@@ -54,15 +54,15 @@
 ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
   1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always       -       78
   3 Spin_Up_Time            0x0003   218   159   021    Pre-fail  Always       -       2091
-  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       184
+  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       185
   5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
   7 Seek_Error_Rate         0x000e   200   200   051    Old_age   Always       -       0
   9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       787
  10 Spin_Retry_Count        0x0012   100   100   051    Old_age   Always       -       0
  11 Calibration_Retry_Count 0x0012   100   100   051    Old_age   Always       -       0
- 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       184
-192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       173
-193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       194
+ 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       185
+192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       174
+193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       195
 194 Temperature_Celsius     0x0022   102   098   000    Old_age   Always       -       45
 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
 197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       36

werewolf:~# diff -u s3 s4
--- s3  2008-05-09 00:37:42.000000000 +0200
+++ s4  2008-05-09 00:39:12.000000000 +0200
@@ -9,7 +9,7 @@
 Device is:        Not in smartctl database [for details use: -P showall]
 ATA Version is:   8
 ATA Standard is:  Exact ATA specification draft version not indicated
-Local Time is:    Fri May  9 00:37:42 2008 CEST
+Local Time is:    Fri May  9 00:39:12 2008 CEST
 SMART support is: Available - device has SMART capability.
 SMART support is: Enabled

@@ -53,16 +53,16 @@
 Vendor Specific SMART Attributes with Thresholds:
 ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
   1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always       -       78
-  3 Spin_Up_Time            0x0003   218   159   021    Pre-fail  Always       -       2091
-  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       185
+  3 Spin_Up_Time            0x0003   223   159   021    Pre-fail  Always       -       1833
+  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       186
   5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
   7 Seek_Error_Rate         0x000e   200   200   051    Old_age   Always       -       0
   9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       787
  10 Spin_Retry_Count        0x0012   100   100   051    Old_age   Always       -       0
  11 Calibration_Retry_Count 0x0012   100   100   051    Old_age   Always       -       0
- 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       185
-192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       174
-193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       195
+ 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       186
+192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       175
+193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       196
 194 Temperature_Celsius     0x0022   102   098   000    Old_age   Always       -       45
 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
 197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       36

The disk is just spinning up and down...
(btw, why the spin count increases but the spin time decreases???).
I really don't uderstand. Could it be the controller going nuts ?
Any idea ?
I really doubt that I had got 3 faulty drives in a pack and from different
brands.

Some data:
- kernel: 2.6.26-rc1-git6 + blk-lockin-fix + latest-semaphore-fixes
- hadrware:
00:1f.1 IDE interface: Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE Controller (rev 02)
00:1f.2 IDE interface: Intel Corporation 82801EB (ICH5) SATA Controller (rev 02)
03:0a.0 RAID bus controller: Adaptec ASC-29320 U320 w/HostRAID (rev 03)
03:0a.1 RAID bus controller: Adaptec ASC-29320 U320 w/HostRAID (rev 03)

werewolf:~/soft/linux/kernel/patches/2.6.25-jam06# lsscsi -H
[0]    aic79xx
[1]    aic79xx
[2]    ata_piix
[3]    ata_piix
[4]    ata_piix
[5]    ata_piix

werewolf:~/soft/linux/kernel/patches/2.6.25-jam06# lsscsi
sdev_scandir_sort: left parse failed
sdev_scandir_sort: right parse failed
sdev_scandir_sort: left parse failed
sdev_scandir_sort: left parse failed
sdev_scandir_sort: right parse failed
sdev_scandir_sort: right parse failed
sdev_scandir_sort: right parse failed
sdev_scandir_sort: left parse failed
sdev_scandir_sort: right parse failed
sdev_scandir_sort: left parse failed
sdev_scandir_sort: right parse failed
sdev_scandir_sort: left parse failed
sdev_scandir_sort: left parse failed
sdev_scandir_sort: left parse failed
sdev_scandir_sort: right parse failed
sdev_scandir_sort: right parse failed
[target1:0:0]type?   vendor?  model?           rev?  -
[target1:0:1]type?   vendor?  model?           rev?  -
[target2:0:0]type?   vendor?  model?           rev?  -
[target2:0:1]type?   vendor?  model?           rev?  -
[target4:0:0]type?   vendor?  model?           rev?  -
[1:0:0:0]    disk    SEAGATE  ST336807LW       0C01  -
[1:0:1:0]    disk    SEAGATE  ST336807LW       0C01  -
[2:0:0:0]    disk    ATA      ST3120022A       3.06  -
[2:0:1:0]    cd/dvd  HL-DT-ST DVDRAM GSA-H10N  JL12  -
[4:0:0:0]    disk    ATA      WDC WD3200AVJS-6 12.0  -

(garbled output due to some bug in latest gits), but you can guess the sd? )

TIA

-- 
J.A. Magallon <jamagallon()ono!com>     \               Software is like sex:
                                         \         It's better when it's free
Mandriva Linux release 2008.1 (Cooker) for i586
Linux 2.6.23-jam05 (gcc 4.2.2 20071128 (4.2.2-2mdv2008.1)) SMP PREEMPT
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ