linux-kernel - Re: [PATCH 03/11] block: add rq->resid

Message-ID: <4A0A68F2.1040600@kernel.org>
Date:	Wed, 13 May 2009 15:30:10 +0900
From:	Tejun Heo <tj@...nel.org>
To:	James Bottomley <James.Bottomley@...senPartnership.com>
CC:	FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>,
	bharrosh@...asas.com, axboe@...nel.dk,
	linux-kernel@...r.kernel.org, jeff@...zik.org,
	linux-ide@...r.kernel.org, linux-scsi@...r.kernel.org,
	bzolnier@...il.com, petkovbb@...glemail.com,
	sshtylyov@...mvista.com, mike.miller@...com, Eric.Moore@....com,
	stern@...land.harvard.edu, zaitcev@...hat.com,
	Geert.Uytterhoeven@...ycom.com, sfr@...b.auug.org.au,
	grant.likely@...retlab.ca, paul.clements@...eleye.com,
	tim@...erelk.net, jeremy@...source.com, adrian@...en.demon.co.uk,
	oakad@...oo.com, dwmw2@...radead.org, schwidefsky@...ibm.com,
	ballabio_dario@....com, davem@...emloft.net, rusty@...tcorp.com.au,
	Markus.Lidel@...dowconnect.com, dgilbert@...erlog.com,
	djwong@...ibm.com
Subject: Re: [PATCH 03/11] block: add rq->resid_len

Hello, James.

James Bottomley wrote:
>> Shouldn't those be request successful w/ sense data?  Please note that
>> the term "error" in this context means failure of block layer request
>> not SCSI layer CHECK SENSE.
> 
> Heh, well, this is where we get into interpretations.  For SG_IO
> requests, we have three separate ways of returning error.  The error
> return to the block layer, the results return and the sense code.  The
> error to block is a somewhat later addition to the layer, so not all
> cases are handled right or agreed (for instance we just altered BLOCK_PC
> with recovered error from -EIO to no error).  So hopefully we've just
> decided that deferred and current but recovered all fall into the no
> error to block, but results and sense to the user.
> 
> Note that the error to block is basically discarded from SG_IO before we
> return to the user, so the user *only* has results and sense to go by,
> thus the concept of residual not valid on error to block is something
> the user can't check.  That's why a consistent definition in all cases
> (i.e. the amount of data the HBA transferred) is the correct one and
> allows userspace to make the determination of what it should do based on
> the returns it gets.

Okay, I was thinking SG_IO would return an error for rq failures, and I
remember pretty clearly following the failure path recently while
debugging an eject problem, but my memory is pretty unreliable.
Checking... oops, yep, you're right.
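
To illustrate the point that userland only has the results and sense
to go by, here is a minimal userspace sketch of how a caller might
inspect a completed SG_IO header. It uses the real struct sg_io_hdr
from <scsi/sg.h>; the helper names (sg_bytes_transferred,
sg_status_clean, sg_has_sense) are hypothetical and the clamping is a
defensive assumption, not something the thread specifies. The count
is "bytes the HBA transferred", which is not necessarily "valid bytes".

```c
#include <assert.h>
#include <scsi/sg.h>

/* Bytes reported as transferred: requested length minus residual.
 * Out-of-range residuals are clamped defensively (an assumption). */
static unsigned int sg_bytes_transferred(const struct sg_io_hdr *h)
{
    assert(h != 0);
    if (h->resid <= 0)
        return h->dxfer_len;
    if ((unsigned int)h->resid >= h->dxfer_len)
        return 0;
    return h->dxfer_len - (unsigned int)h->resid;
}

/* Nonzero when the SCSI, host and driver status fields are all zero.
 * The block-layer error is not visible here; SG_IO discards it. */
static int sg_status_clean(const struct sg_io_hdr *h)
{
    return h->status == 0 && h->host_status == 0 && h->driver_status == 0;
}

/* Nonzero when sense data was written to the caller's sense buffer. */
static int sg_has_sense(const struct sg_io_hdr *h)
{
    return h->sb_len_wr > 0;
}
```

A caller would run these on the header after ioctl(fd, SG_IO, &hdr)
returns 0, and decide from status, sense and the transferred count
what to do with the data buffer.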

>> I'm still reluctant to do it because...
>>
>> * Its definition still isn't clear (well, at least to me) and if it's
>>   defined as the number of valid bytes on request success and the
>>   number of bytes HBA transferred on request failure, I don't think
>>   it's all that useful.
> 
> It's not valid bytes in either case ... it's number transferred.  One
> can infer from a successful SCSI status code that number transferred ==
> valid bytes, but I'd rather we didn't say that.
> 
>> * Seen from userland, residue count on request failure has never been
>>   guaranteed and there doesn't seem to be any valid in kernel user.
> 
> But that's the point ... we don't define for userland what request
> failure is very well.
> 
>> * It would be extra code explicitly setting the residue count to full
>>   on failure path.  If it's something necessary, full residue count on
>>   failure needs to be made default.  If not, it will only add more
>>   confusion.
> 
> OK, so if what you're asking is that we can regard the residue as
> invalid if SG_IO itself returns an error, then I can agree ... but not
> if blk_end_request() returns error, because that error gets ignored by
> SG_IO.

I mistakenly thought that rq failure would cause an error return from
SG_IO.  Sorry about that.  There is still a problem, though.  The buffer
for a bounced SG_IO request is copied back on failure, but when a
bounced kernel PC request fails, the data is not copied back in
bio_copy_kern_endio().  This is what would break Boaz's code.

So, it seems what we should do is:

1. Always copy back the bounced buffer, whether the request failed or
   not.  I'm not sure whether resid_len should be taken into account
   while copying back, given that resid_len isn't properly implemented
   in some drivers.

2. Restore the original behavior of setting resid_len to full on
   request issue and audit the affected code paths.
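
The two steps above can be sketched as a small userspace model.  This
is not kernel code: struct model_rq and the model_* functions are
hypothetical names made up for illustration, and honor_resid is a
switch that stands in for the open question of whether resid_len
should limit the copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the request fields discussed here. */
struct model_rq {
    size_t data_len;   /* bytes requested */
    size_t resid_len;  /* bytes NOT transferred */
    int error;         /* block-layer completion error, 0 on success */
};

/* Step 2: on issue, the residual starts out as the full length. */
static void model_issue(struct model_rq *rq, size_t len)
{
    rq->data_len = len;
    rq->resid_len = len;
    rq->error = 0;
}

/* A driver that implements residuals updates resid_len on completion. */
static void model_complete(struct model_rq *rq, size_t transferred, int error)
{
    assert(transferred <= rq->data_len);
    rq->resid_len = rq->data_len - transferred;
    rq->error = error;
}

/* Step 1: copy the bounce buffer back whether or not rq->error is set.
 * Returns the number of bytes copied. */
static size_t model_copy_back(const struct model_rq *rq, void *orig,
                              const void *bounce, int honor_resid)
{
    size_t n = honor_resid ? rq->data_len - rq->resid_len : rq->data_len;

    memcpy(orig, bounce, n);
    return n;
}
```

Note that in this model a driver that never calls model_complete()
leaves resid_len at full, so honoring it copies nothing back even on
success, which is one way to see the concern in item 1.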

How does that sound?

Thanks.

-- 
tejun