linux-kernel - Re: [PATCH] JMicron JM20337 USB-SATA data corruption bugfix

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Mon, 21 Jul 2008 23:31:17 -0600
From:	Robert Hancock <hancockr@...w.ca>
To:	Tomas Styblo <tripie@...n.org>
Cc:	linux-kernel@...r.kernel.org, linux-usb@...r.kernel.org,
	usb-storage@...ts.one-eyed-alien.net,
	Alan Stern <stern@...land.harvard.edu>
Subject: Re: [PATCH] JMicron JM20337 USB-SATA data corruption bugfix -	device
 152d:2338

Tomas Styblo wrote:
> * Robert Hancock <hancockr@...w.ca> [Mon, 21 Jul 2008]:
>> I'm not sure this is a good approach. More that this code right above in 
>> usb_stor_invoke_transport, which your code undoes the effect of for this 
>> device, doesn't seem right:
>>
>> 	/* If things are really okay, then let's show that.  Zero
>> 	 * out the sense buffer so the higher layers won't realize
>> 	 * we did an unsolicited auto-sense. */
>> 	if (result == USB_STOR_TRANSPORT_GOOD &&
>> 		/* Filemark 0, ignore EOM, ILI 0, no sense */
>> 			(srb->sense_buffer[2] & 0xaf) == 0 &&
>> 		/* No ASC or ASCQ */
>> 			srb->sense_buffer[12] == 0 &&
>> 			srb->sense_buffer[13] == 0) {
>> 		srb->result = SAM_STAT_GOOD;
>> 		srb->sense_buffer[0] = 0x0;
>> 	}
>>
> 
> The patch doesn't exactly undo the effect of the code above,
> because the value of _result_ is different. When this problem

Yes, you and Alan are right, that code only triggers on a good result.

> happens, the condition above is false, _result_ is
> USB_STOR_TRANSPORT_FAILED and scsi_get_resid(srb) > 0, but the
> chipset doesn't report any error (NO_SENSE,ASC==0,ASCQ==0). That's
> why I think there's something wrong with the chipset. 

I'm assuming what triggers the transport failure is it getting a 
US_BULK_STAT_FAIL result from the transfer inside 
usb_stor_Bulk_transport. It could be there's some condition in the chip 
that causes that error, yet doesn't provide any sense data.

In any case, given that your code apparently fixes the corruption it 
seems that srb->result is being set to SAM_STAT_CHECK_CONDITION, but the 
DID_ERROR and SUGGEST_RETRY flags are not being set. Presumably then the 
  SCSI layer looks at the sense data and says "hmm, nothing to worry 
about here" and carries on.

I think we do need something like your patch, though it should likely be 
moved inside the if (need_auto_sense) check, and I don't see a reason to 
limit to this device ID only.

> 
> There are Windows users on various message boards who report the
> same problem with this chipset - a kind of silent data corruption
> that occurs only when copying large amounts of data.
> 
> But as I said I know little about SCSI and USB. I tried to locate
> and fix the problem, but I can't tell whether the current error
> handling code is written according to the relevant standards.
> 
> A more generic approach would certainly be better than hardcoded
> device ids. Perhaps this check should be enabled for all devices?
> Why not?


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/