lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20081216210251.138d3e98.akpm@linux-foundation.org>
Date:	Tue, 16 Dec 2008 21:02:51 -0800
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	Justin Madru <bevicm@...extreme.com>
Cc:	lkml <linux-kernel@...r.kernel.org>, linux-ide@...r.kernel.org,
	"Rafael J. Wysocki" <rjw@...k.pl>
Subject: Re: [2.6.28-rc] Sata soft reset filling log


(cc linux-ide).

This is a post-2.6.27 regression.  Panic!

On Fri, 12 Dec 2008 18:07:49 -0800 Justin Madru <bevicm@...extreme.com> wrote:

> I've been testing .28 (currently -rc8) and I've noticed in the logs a 
> massive amount of the following.
> I can confirm that the messages doesn't appear when booting into a .27 
> kernel.
> 
> ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> ata2.00: ST_FIRST: !(DRQ|ERR|DF)
> ata2.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
>          cdb 1e 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
>          res 50/00:01:00:00:00/00:00:00:00:00/a0 Emask 0x2 (HSM violation)
> ata2.00: status: { DRDY }
> ata2: soft resetting link
> ata2.00: configured for UDMA/33
> ata2: EH complete
> 
> I get this block of messages about every 20 seconds for as long as I'm 
> booted into a .28 kernel.
> It seems that the sata link is being soft reseted about every _20_ secs.
> My drive is a sata drive that should be using UDMA/133 (right?),
> but the "error" message says configuring for UDMA/33.
> Does this mean that my drive is running at a slower speed?
> Should I be worried about the .28 kernel corrupting my hardware? Is this 
> a known issue?
> How can I stop it from filling my logs!
> 
> Some information about my computer:
> 
> $ lspci
> 00:00.0 Host bridge: Intel Corporation Mobile 945GM/PM/GMS, 943/940GML 
> and 945GT Express Memory Controller Hub (rev 03)
> 00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 
> 943/940GML Express Integrated Graphics Controller (rev 03)
> 00:02.1 Display controller: Intel Corporation Mobile 945GM/GMS/GME, 
> 943/940GML Express Integrated Graphics Controller (rev 03)
> 00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High 
> Definition Audio Controller (rev 01)
> 00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express 
> Port 1 (rev 01)
> 00:1c.3 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express 
> Port 4 (rev 01)
> 00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
> Controller #1 (rev 01)
> 00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
> Controller #2 (rev 01)
> 00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
> Controller #3 (rev 01)
> 00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI 
> Controller #4 (rev 01)
> 00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI 
> Controller (rev 01)
> 00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e1)
> 00:1f.0 ISA bridge: Intel Corporation 82801GBM (ICH7-M) LPC Interface 
> Bridge (rev 01)
> 00:1f.2 IDE interface: Intel Corporation 82801GBM/GHM (ICH7 Family) SATA 
> IDE Controller (rev 01)
> 00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller 
> (rev 01)
> 03:00.0 Ethernet controller: Broadcom Corporation BCM4401-B0 100Base-TX 
> (rev 02)
> 03:01.0 FireWire (IEEE 1394): Ricoh Co Ltd R5C832 IEEE 1394 Controller
> 03:01.1 SD Host controller: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro 
> Host Adapter (rev 19)
> 03:01.2 System peripheral: Ricoh Co Ltd R5C843 MMC Host Controller (rev 0a)
> 03:01.3 System peripheral: Ricoh Co Ltd R5C592 Memory Stick Bus Host 
> Adapter (rev 05)
> 03:01.4 System peripheral: Ricoh Co Ltd xD-Picture Card Controller (rev ff)
> 0b:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG 
> [Golan] Network Connection (rev 02)
> 
> $ hdparm -I /dev/sda
> /dev/sda:
> ATA device, with non-removable media
>         Model Number:       ST980811AS
>         Serial Number:      5LY6J1M0
>         Firmware Revision:  3.CDD
> Standards:
>         Supported: 7 6 5 4
>         Likely used: 8
> Configuration:
>         Logical         max     current
>         cylinders       16383   16383
>         heads           16      16
>         sectors/track   63      63
>         --
>         CHS current addressable sectors:   16514064
>         LBA    user addressable sectors:  156301488
>         LBA48  user addressable sectors:  156301488
>         device size with M = 1024*1024:       76319 MBytes
>         device size with M = 1000*1000:       80026 MBytes (80 GB)
> Capabilities:
>         LBA, IORDY(can be disabled)
>         Queue depth: 32
>         Standby timer values: spec'd by Standard, no device specific minimum
>         R/W multiple sector transfer: Max = 16  Current = 8
>         Advanced power management level: 128
>         Recommended acoustic management value: 128, current value: 0
>         DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
>              Cycle time: min=120ns recommended=120ns
>         PIO: pio0 pio1 pio2 pio3 pio4
>              Cycle time: no flow control=240ns  IORDY flow control=120ns
> Commands/features:
>         Enabled Supported:
>            *    SMART feature set
>                 Security Mode feature set
>            *    Power Management feature set
>            *    Write cache
>            *    Look-ahead
>            *    Host Protected Area feature set
>            *    WRITE_BUFFER command
>            *    READ_BUFFER command
>            *    DOWNLOAD_MICROCODE
>            *    Advanced Power Management feature set
>                 SET_MAX security extension
>                 Automatic Acoustic Management feature set
>            *    48-bit Address feature set
>            *    Mandatory FLUSH_CACHE
>            *    FLUSH_CACHE_EXT
>            *    SMART error logging
>            *    SMART self-test
>            *    IDLE_IMMEDIATE with UNLOAD
>            *    SATA-I signaling speed (1.5Gb/s)
>            *    Native Command Queueing (NCQ)
>            *    Phy event counters
>                 Device-initiated interface power management
>            *    Software settings preservation
>            *    SMART Command Transport (SCT) feature set
> Security:
>         Master password revision code = 65534
>                 supported
>         not     enabled
>         not     locked
>                 frozen
>         not     expired: security count
>         not     supported: enhanced erase
> Checksum: correct
> 
> $ hdparm -fF /dev/sda; hdparm -tT /dev/sda
> /dev/sda:
>  Timing cached reads:   1650 MB in  2.00 seconds = 825.06 MB/sec
>  Timing buffered disk reads:  130 MB in  3.04 seconds =  42.79 MB/sec
> 
> $ smartctl -a /dev/sda
> smartctl version 5.38 [i686-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
> 
> === START OF INFORMATION SECTION ===
> Model Family:     Seagate Momentus 5400.3
> Device Model:     ST980811AS
> Serial Number:    5LY6J1M0
> Firmware Version: 3.CDD
> User Capacity:    80,026,361,856 bytes
> Device is:        In smartctl database [for details use: -P show]
> ATA Version is:   7
> ATA Standard is:  Exact ATA specification draft version not indicated
> Local Time is:    Fri Dec 12 17:05:06 2008 PST
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
> 
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
> See vendor-specific Attribute list for marginal Attributes.
> 
> General SMART Values:
> Offline data collection status:  (0x82) Offline data collection activity
>                                         was completed without error.
>                                         Auto Offline Data Collection: 
> Enabled.
> Self-test execution status:      (   0) The previous self-test routine 
> completed
>                                         without error or no self-test 
> has ever
>                                         been run.
> Total time to complete Offline
> data collection:                 ( 426) seconds.
> Offline data collection
> capabilities:                    (0x5b) SMART execute Offline immediate.
>                                         Auto Offline data collection 
> on/off support.
>                                         Suspend Offline collection upon new
>                                         command.
>                                         Offline surface scan supported.
>                                         Self-test supported.
>                                         No Conveyance Self-test supported.
>                                         Selective Self-test supported.
> SMART capabilities:            (0x0003) Saves SMART data before entering
>                                         power-saving mode.
>                                         Supports SMART auto save timer.
> Error logging capability:        (0x01) Error logging supported.
>                                         No General Purpose Logging support.
> Short self-test routine
> recommended polling time:        (   2) minutes.
> Extended self-test routine
> recommended polling time:        (  84) minutes.
> SCT capabilities:              (0x0001) SCT Status supported.
> 
> SMART Attributes Data Structure revision number: 10
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      
> UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000f   100   253   006    Pre-fail  
> Always       -       0
>   3 Spin_Up_Time            0x0003   099   099   085    Pre-fail  
> Always       -       0
>   4 Start_Stop_Count        0x0032   099   099   020    Old_age   
> Always       -       1142
>   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  
> Always       -       0
>   7 Seek_Error_Rate         0x000f   079   060   030    Pre-fail  
> Always       -       88807308
>   9 Power_On_Hours          0x0032   097   097   000    Old_age   
> Always       -       3420
>  10 Spin_Retry_Count        0x0013   100   100   034    Pre-fail  
> Always       -       0
>  12 Power_Cycle_Count       0x0032   099   099   020    Old_age   
> Always       -       1180
> 187 Reported_Uncorrect      0x0032   100   100   000    Old_age   
> Always       -       0
> 189 High_Fly_Writes         0x003a   100   100   000    Old_age   
> Always       -       0
> 190 Airflow_Temperature_Cel 0x0022   050   042   045    Old_age   
> Always   In_the_past 50 (0 102 52 25)
> 192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   
> Always       -       1076
> 193 Load_Cycle_Count        0x0032   001   001   000    Old_age   
> Always       -       241664
> 194 Temperature_Celsius     0x0022   050   058   000    Old_age   
> Always       -       50 (0 19 0 0)
> 195 Hardware_ECC_Recovered  0x001a   061   060   000    Old_age   
> Always       -       195105625
> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   
> Always       -       0
> 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   
> Offline      -       0
> 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   
> Always       -       0
> 200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   
> Offline      -       0
> 202 TA_Increase_Count       0x0032   100   253   000    Old_age   
> Always       -       0
> 
> SMART Error Log Version: 1
> No Errors Logged
> 
> SMART Self-test log structure revision number 1
> Num  Test_Description    Status                  Remaining  
> LifeTime(hours)  LBA_of_first_error
> # 1  Extended offline    Completed without error       00%      
> 3257         -
> # 2  Short offline       Completed without error       00%      
> 3256         -
> # 3  Short offline       Completed without error       00%      
> 2659         -
> # 4  Short offline       Completed without error       00%      
> 2659         -
> # 5  Short offline       Completed without error       00%      
> 1657         -
> # 6  Short offline       Completed without error       00%      
> 1621         -
> # 7  Short offline       Completed without error       00%         
> 2         -
> # 8  Short offline       Completed without error       00%         
> 2         -
> # 9  Short offline       Completed without error       00%         
> 1         -
> #10  Short offline       Completed without error       00%         
> 0         -
> 
> SMART Selective self-test log data structure revision number 1
>  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>     1        0        0  Not_testing
>     2        0        0  Not_testing
>     3        0        0  Not_testing
>     4        0        0  Not_testing
>     5        0        0  Not_testing
> Selective self-test flags (0x0):
>   After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ