Message-ID: <45cdf1c2-9056-4ac2-8e4d-4f07996a9267@kernel.org>
Date: Wed, 7 Aug 2024 11:26:46 -0700
From: Damien Le Moal <dlemoal@...nel.org>
To: Christian Heusel <christian@...sel.eu>, Igor Pylypiv
 <ipylypiv@...gle.com>, Niklas Cassel <cassel@...nel.org>,
 linux-ide@...r.kernel.org
Cc: Hannes Reinecke <hare@...e.de>, regressions@...ts.linux.dev,
 stable@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [REGRESSION][BISECTED][STABLE] hdparm errors since 28ab9769117c

On 2024/08/07 10:23, Christian Heusel wrote:
> Hello Igor, hello Niklas,
> 
> on my NAS I am encountering the following issue since v6.6.44 (LTS),
> when executing the hdparm command for my WD-WCC7K4NLX884 drives to get
> the active or standby state:
> 
>     $ hdparm -C /dev/sda
>     /dev/sda:
>     SG_IO: bad/missing sense data, sb[]:  f0 00 01 00 50 40 ff 0a 00 00 78 00 00 1d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>      drive state is:  unknown
> 
> 
> While the expected output is the following:
> 
>     $ hdparm -C /dev/sda
>     /dev/sda:
>      drive state is:  active/idle
> 
> I did a bisection within the stable series and found the following
> commit to be the first bad one:
> 
>     28ab9769117c ("ata: libata-scsi: Honor the D_SENSE bit for CK_COND=1 and no error")
> 
> According to kernel.dance the same commit was also backported to the
> v6.10.3 and v6.1.103 stable kernels and I could not find any commit or
> pending patch with a "Fixes:" tag for the offending commit.
> 
> So far I have not been able to test with the mainline kernel as this is
> a remote device which I couldn't rescue in case of a boot failure. Also
> just for transparency it does have the out of tree ZFS module loaded,
> but AFAIU this shouldn't be an issue here, as the commit seems clearly
> related to the error. If needed I can test with an untainted mainline
> kernel on Friday when I'm near the device.
> 
> I have attached the output of hdparm -I below and would be happy to
> provide further debug information or test patches.

I confirm this, using 6.11-rc2. The problem is actually in the hdparm code,
which assumes that the sense data is in descriptor format without ever looking
at the D_SENSE bit to verify that. Commit 28ab9769117c reveals this issue
because, as its title explains, it (correctly) honors D_SENSE instead of always
generating sense data in descriptor format.
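
To illustrate the difference between the two formats, here is a minimal sketch
of a parser that checks the response code in byte 0 before picking a layout,
instead of assuming descriptor format. This is not the hdparm code: the names
parse_ata_sense() and struct ata_result are hypothetical, the byte offsets
follow my reading of the SAT layouts for ATA PASS-THROUGH sense data, and only
the first descriptor is examined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the ATA registers carried in the sense data. */
struct ata_result {
	uint8_t error;
	uint8_t status;
	uint8_t device;
	uint8_t count;	/* COUNT (7:0) */
};

/*
 * Extract the ATA registers from a sense buffer in either format.
 * Returns false if the buffer is too short or the format is not recognized.
 */
static bool parse_ata_sense(const uint8_t *sb, size_t len,
			    struct ata_result *out)
{
	assert(out != NULL);

	if (sb == NULL || len < 8)
		return false;

	/* Bit 7 of byte 0 is the VALID bit in fixed format; mask it off. */
	switch (sb[0] & 0x7f) {
	case 0x70:
	case 0x71:
		/* Fixed format: registers are in the INFORMATION field. */
		out->error  = sb[3];
		out->status = sb[4];
		out->device = sb[5];
		out->count  = sb[6];
		return true;
	case 0x72:
	case 0x73: {
		/* Descriptor format: expect an ATA Status Return descriptor. */
		const uint8_t *desc = sb + 8;

		if (len < 8 + 14 || sb[7] < 14 ||
		    desc[0] != 0x09 || desc[1] < 0x0c)
			return false;
		out->error  = desc[3];
		out->count  = desc[5];
		out->device = desc[12];
		out->status = desc[13];
		return true;
	}
	default:
		return false;
	}
}
```

Fed the sb[] bytes from the report above (f0 00 01 00 50 40 ff ...), the fixed
format branch yields status 0x50 and count 0xff, and 0xff is the CHECK POWER
MODE value that hdparm reports as "active/idle".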

Hmm... This is annoying. The kernel is fixed to be spec compliant but that
breaks old/non-compliant applications... We definitely should fix hdparm code,
but I think we still need to revert 28ab9769117c...

Niklas, Igor, thoughts?

> 
> Cheers,
> Christian
> 
> ---
> 
> #regzbot introduced: 28ab9769117c
> #regzbot title: ata: libata-scsi: Sense data errors breaking hdparm with WD drives
> 
> ---
> 
> $ pacman -Q hdparm
> hdparm 9.65-2
> 
> $ hdparm -I /dev/sda
> 
> /dev/sda:
> 
> ATA device, with non-removable media
> 	Model Number:       WDC WD40EFRX-68N32N0
> 	Serial Number:      WD-WCC7K4NLX884
> 	Firmware Revision:  82.00A82
> 	Transport:          Serial, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
> Standards:
> 	Used: unknown (minor revision code 0x006d) 
> 	Supported: 10 9 8 7 6 5 
> 	Likely used: 10
> Configuration:
> 	Logical		max	current
> 	cylinders	16383	0
> 	heads		16	0
> 	sectors/track	63	0
> 	--
> 	LBA    user addressable sectors:   268435455
> 	LBA48  user addressable sectors:  7814037168
> 	Logical  Sector size:                   512 bytes
> 	Physical Sector size:                  4096 bytes
> 	Logical Sector-0 offset:                  0 bytes
> 	device size with M = 1024*1024:     3815447 MBytes
> 	device size with M = 1000*1000:     4000787 MBytes (4000 GB)
> 	cache/buffer size  = unknown
> 	Form Factor: 3.5 inch
> 	Nominal Media Rotation Rate: 5400
> Capabilities:
> 	LBA, IORDY(can be disabled)
> 	Queue depth: 32
> 	Standby timer values: spec'd by Standard, with device specific minimum
> 	R/W multiple sector transfer: Max = 16	Current = 16
> 	DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6 
> 	     Cycle time: min=120ns recommended=120ns
> 	PIO: pio0 pio1 pio2 pio3 pio4 
> 	     Cycle time: no flow control=120ns  IORDY flow control=120ns
> Commands/features:
> 	Enabled	Supported:
> 	   *	SMART feature set
> 	    	Security Mode feature set
> 	   *	Power Management feature set
> 	   *	Write cache
> 	   *	Look-ahead
> 	   *	Host Protected Area feature set
> 	   *	WRITE_BUFFER command
> 	   *	READ_BUFFER command
> 	   *	NOP cmd
> 	   *	DOWNLOAD_MICROCODE
> 	    	Power-Up In Standby feature set
> 	   *	SET_FEATURES required to spinup after power up
> 	    	SET_MAX security extension
> 	   *	48-bit Address feature set
> 	   *	Device Configuration Overlay feature set
> 	   *	Mandatory FLUSH_CACHE
> 	   *	FLUSH_CACHE_EXT
> 	   *	SMART error logging
> 	   *	SMART self-test
> 	   *	General Purpose Logging feature set
> 	   *	64-bit World wide name
> 	   *	IDLE_IMMEDIATE with UNLOAD
> 	   *	WRITE_UNCORRECTABLE_EXT command
> 	   *	{READ,WRITE}_DMA_EXT_GPL commands
> 	   *	Segmented DOWNLOAD_MICROCODE
> 	   *	Gen1 signaling speed (1.5Gb/s)
> 	   *	Gen2 signaling speed (3.0Gb/s)
> 	   *	Gen3 signaling speed (6.0Gb/s)
> 	   *	Native Command Queueing (NCQ)
> 	   *	Host-initiated interface power management
> 	   *	Phy event counters
> 	   *	Idle-Unload when NCQ is active
> 	   *	NCQ priority information
> 	   *	READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
> 	   *	DMA Setup Auto-Activate optimization
> 	   *	Device-initiated interface power management
> 	   *	Software settings preservation
> 	   *	SMART Command Transport (SCT) feature set
> 	   *	SCT Write Same (AC2)
> 	   *	SCT Error Recovery Control (AC3)
> 	   *	SCT Features Control (AC4)
> 	   *	SCT Data Tables (AC5)
> 	    	unknown 206[12] (vendor specific)
> 	    	unknown 206[13] (vendor specific)
> 	   *	DOWNLOAD MICROCODE DMA command
> 	   *	WRITE BUFFER DMA command
> 	   *	READ BUFFER DMA command
> Security: 
> 	Master password revision code = 65534
> 		supported
> 	not	enabled
> 	not	locked
> 		frozen
> 	not	expired: security count
> 		supported: enhanced erase
> 	504min for SECURITY ERASE UNIT. 504min for ENHANCED SECURITY ERASE UNIT.
> Logical Unit WWN Device Identifier: 50014ee2647735a1
> 	NAA		: 5
> 	IEEE OUI	: 0014ee
> 	Unique ID	: 2647735a1
> Checksum: correct

-- 
Damien Le Moal
Western Digital Research

