[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <4A0A68F2.1040600@kernel.org>
Date: Wed, 13 May 2009 15:30:10 +0900
From: Tejun Heo <tj@...nel.org>
To: James Bottomley <James.Bottomley@...senPartnership.com>
CC: FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>,
bharrosh@...asas.com, axboe@...nel.dk,
linux-kernel@...r.kernel.org, jeff@...zik.org,
linux-ide@...r.kernel.org, linux-scsi@...r.kernel.org,
bzolnier@...il.com, petkovbb@...glemail.com,
sshtylyov@...mvista.com, mike.miller@...com, Eric.Moore@....com,
stern@...land.harvard.edu, zaitcev@...hat.com,
Geert.Uytterhoeven@...ycom.com, sfr@...b.auug.org.au,
grant.likely@...retlab.ca, paul.clements@...eleye.com,
tim@...erelk.net, jeremy@...source.com, adrian@...en.demon.co.uk,
oakad@...oo.com, dwmw2@...radead.org, schwidefsky@...ibm.com,
ballabio_dario@....com, davem@...emloft.net, rusty@...tcorp.com.au,
Markus.Lidel@...dowconnect.com, dgilbert@...erlog.com,
djwong@...ibm.com
Subject: Re: [PATCH 03/11] block: add rq->resid_len
Hello, James.
James Bottomley wrote:
>> Shouldn't those be request successful w/ sense data? Please note that
>> the term "error" in this context means failure of block layer request
>> not SCSI layer CHECK SENSE.
>
> Heh, well, this is where we get into interpretations. For SG_IO
> requests, we have three separate ways of returning error. The error
> return to the block layer, the results return and the sense code. The
> error to block is a somewhat later addition to the layer, so not all
> cases are handled right or agreed (for instance we just altered BLOCK_PC
> with recovered error from -EIO to no error). So hopefully we've just
> decided that deferred and current but recovered all fall into the no
> error to block, but results and sense to the user.
>
> Note that the error to block is basically discarded from SG_IO before we
> return to the user, so the user *only* has results and sense to go by,
> thus the concept of residual not valid on error to block is something
> the user can't check. That's why a consistent definition in all cases
> (i.e. the amount of data the HBA transferred) is the correct one and
> allows userspace to make the determination of what it should do based on
> the returns it gets.
Okay, I was thinking SG_IO will return error for rq failures and I
remember pretty clearly following the failure path recently while
debugging eject problem but my memory is pretty unreliable.
Checking... oops, yeap, you're right.
>> I'm still reluctant to do it because...
>>
>> * Its definition still isn't clear (well, at least to me) and if it's
>> defined as the number of valid bytes on request success and the
>> number of bytes HBA transferred on request failure, I don't think
>> it's all that useful.
>
> It's not valid bytes in either case ... it's number transferred. One
> can infer from a successful SCSI status code that number transferred ==
> valid bytes, but I'd rather we didn't say that.
>
>> * Seen from userland, residue count on request failure has never been
>> guaranteed and there doesn't seem to be any valid in kernel user.
>
> But that's the point ... we don't define for userland what request
> failure is very well.
>
>> * It would be extra code explicitly setting the residue count to full
>> on failure path. If it's something necessary, full residue count on
>> failure needs to be made default. If not, it will only add more
>> confusion.
>
> OK, so if what you're asking is that we can regard the residue as
> invalid if SG_IO itself returns an error, then I can agree ... but not
> if blk_end_request() returns error, because that error gets ignored by
> SG_IO.
I was confused that rq failure would cause error return from SG_IO.
Sorry about that. There still is a problem tho. Buffer for a bounced
SG_IO request is copied back on failure but when a bounced kernel PC
request fails, the data is not copied back in bio_copy_kern_endio().
This is what would break Boaz's code.
So, it seems what we should do is
1. Always copy back bounced buffer whether the request failed or not.
Whether resid_len should be considered while copying back, I'm not
sure about given that resid_len isn't properly implemented in some
drivers.
2. Revert the original behavior of setting resid_len to full on
request issue and audit the affected code paths.
How does it sound?
Thanks.
--
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists