Message-ID: <c6f4014e-d199-d5e8-515c-5ffcd9946c80@gmail.com>
Date: Thu, 12 Jan 2023 13:49:14 -0800
From: Bart Van Assche <bart.vanassche@...il.com>
To: Al Viro <viro@...iv.linux.org.uk>,
Christoph Hellwig <hch@...radead.org>
Cc: David Howells <dhowells@...hat.com>,
Matthew Wilcox <willy@...radead.org>,
Jens Axboe <axboe@...nel.dk>, Jan Kara <jack@...e.cz>,
Jeff Layton <jlayton@...nel.org>,
Logan Gunthorpe <logang@...tatee.com>,
linux-fsdevel@...r.kernel.org, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org,
"Martin K. Petersen" <martin.petersen@...cle.com>,
Douglas Gilbert <dgilbert@...erlog.com>
Subject: Re: [PATCH v5 3/9] iov_iter: Use IOCB/IOMAP_WRITE if available rather
than iterator direction
On 1/12/23 09:37, Al Viro wrote:
> On Thu, Jan 12, 2023 at 06:08:14AM -0800, Christoph Hellwig wrote:
>> On Thu, Jan 12, 2023 at 10:31:01AM +0000, David Howells wrote:
>>>> And use the information in the request for this one (see patch below),
>>>> and then move this patch first in the series, add an explicit direction
>>>> parameter in the gup_flags to the get/pin helper and drop iov_iter_rw
>>>> and the whole confusing source/dest information in the iov_iter entirely,
>>>> which is a really nice big tree wide cleanup that removes redundant
>>>> information.
>>>
>>> Fine by me, but Al might object as I think he wanted the internal checks. Al?
>>
>> I'm happy to have another discussion, but the fact the information in
>> the iov_iter is 98% redundant and various callers got it wrong and
>> away is a pretty good sign that we should drop this information. It
>> also nicely simplified the API.
>
> I have no problem with getting rid of iov_iter_rw(), but I would really like to
> keep ->data_source. If nothing else, any place getting direction wrong is
> a trouble waiting to happen - something that is currently dealing only with
> iovec and bvec might be given e.g. a pipe.
>
> Speaking of which, I would really like to get rid of the kludge /dev/sg is
> pulling - right now from-device requests there do the following:
> * copy the entire destination in (and better hope that nothing is mapped
> write-only, etc.)
> * form a request + bio, attach the pages with the destination copy to it
> * submit
> * copy the damn thing back to destination after the completion.
> The reason for that is (quoted in commit ecb554a846f8)
>
> ====
> The semantics of SG_DXFER_TO_FROM_DEV were:
> - copy user space buffer to kernel (LLD) buffer
> - do SCSI command which is assumed to be of the DATA_IN
> (data from device) variety. This would overwrite
> some or all of the kernel buffer
> - copy kernel (LLD) buffer back to the user space.
>
> The idea was to detect short reads by filling the original
> user space buffer with some marker bytes ("0xec" it would
> seem in this report). The "resid" value is a better way
> of detecting short reads but that was only added this century
> and requires co-operation from the LLD.
> ====
>
> IOW, we can't tell how much do we actually want to copy out, unless the SCSI driver
> in question is recent enough. Note that the above had been written in 2009, so
> it might not be an issue these days.
>
> Do we still have SCSI drivers that would not set the residual on bypass requests
> completion? Because I would obviously very much prefer to get rid of that
> copy in-overwrite-copy out thing there - given the accurate information about
> the transfer length it would be easy to do.

(+Martin and Doug)

I'm not sure that we still need the double copy in the sg driver. It
seems unlikely to me that any user space software relies on finding
"0xec" in bytes that did not originate from a SCSI device.
Additionally, SCSI drivers that do not support residuals should be
a thing of the past.
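
To make the difference concrete, here is a minimal user-space sketch of
the two approaches. It is not the sg driver code: fake_device_read(),
xfer_double_copy(), xfer_with_resid() and BOUNCE_SIZE are hypothetical
names, and memcpy() merely stands in for the user/kernel copies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BOUNCE_SIZE 64	/* hypothetical bounce buffer size */

/*
 * Hypothetical device: writes at most 'avail' bytes of 0xAB into kbuf
 * and returns the residual, i.e. the number of bytes NOT transferred.
 */
static size_t fake_device_read(unsigned char *kbuf, size_t len, size_t avail)
{
	size_t n = avail < len ? avail : len;

	memset(kbuf, 0xAB, n);
	return len - n;
}

/*
 * Copy in, overwrite, copy out: the whole buffer is copied back, so
 * bytes the device did not touch keep whatever the caller put there.
 */
static void xfer_double_copy(unsigned char *ubuf, size_t len, size_t avail)
{
	unsigned char kbuf[BOUNCE_SIZE];

	assert(len <= BOUNCE_SIZE);
	memcpy(kbuf, ubuf, len);			/* copy in */
	(void)fake_device_read(kbuf, len, avail);	/* resid ignored */
	memcpy(ubuf, kbuf, len);			/* copy everything out */
}

/*
 * Residual based: no copy in, and only the bytes the device actually
 * transferred (len - resid) are copied out.
 */
static size_t xfer_with_resid(unsigned char *ubuf, size_t len, size_t avail)
{
	unsigned char kbuf[BOUNCE_SIZE];
	size_t resid;

	assert(len <= BOUNCE_SIZE);
	resid = fake_device_read(kbuf, len, avail);
	memcpy(ubuf, kbuf, len - resid);
	return resid;
}
```

In both variants the tail of the caller's buffer is left unchanged after
a short read. The second variant achieves that with a single copy, but
only if the residual reported by the driver can be trusted.
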

Others may be better qualified to comment on this topic.

Bart.