Message-Id: <1200002054.3141.103.camel@localhost.localdomain>
Date: Thu, 10 Jan 2008 15:54:14 -0600
From: James Bottomley <James.Bottomley@...senPartnership.com>
To: Pete Wyckoff <pw@....edu>
Cc: FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>, tomof@....org,
deepakrc@...il.com, linux-scsi@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] bsg : Add support for io vectors in bsg

On Thu, 2008-01-10 at 16:46 -0500, Pete Wyckoff wrote:
> James.Bottomley@...senPartnership.com wrote on Thu, 10 Jan 2008 14:55 -0600:
> > On Thu, 2008-01-10 at 15:43 -0500, Pete Wyckoff wrote:
> > > fujita.tomonori@....ntt.co.jp wrote on Wed, 09 Jan 2008 09:11 +0900:
> > > > On Tue, 8 Jan 2008 17:09:18 -0500
> > > > Pete Wyckoff <pw@....edu> wrote:
> > > > > I took another look at the compat approach, to see if it is feasible
> > > > > to keep the compat handling somewhere else, without the use of #ifdef
> > > > > CONFIG_COMPAT and size-comparison code inside bsg.c. I don't see how.
> > > > > The use of iovec is within a write operation on a char device. It's
> > > > > not amenable to a compat_sys_ or a .compat_ioctl approach.
> > > > >
> > > > > I'm partial to #1 because the use of architecture-independent fields
> > > > > matches the rest of struct sg_io_v4. But if you don't want to have
> > > > > another iovec type in the kernel, could we do #2 but just return
> > > > > -EINVAL if the need for compat is detected? I.e. change
> > > > > dout_iovec_count to dout_iovec_length and do the math?
> > > >
> > > > If you are ok with removing the write/read interface and just have
> > > > ioctl, we could handle compat stuff like others do. But I think
> > > > that you (OSD people) really want to keep the write/read
> > > > interface. Sorry, I think that there is no workaround to support iovec
> > > > in bsg.
> > >
> > > I don't care about read/write in particular. But we do need some
> > > way to launch asynchronous SCSI commands, and currently read/write
> > > are the only way to do that in bsg. The reason is to keep multiple
> > > spindles busy at the same time.
> >
> > Won't multi-threading the ioctl calls achieve the same effect? Or do
> > you trip over BKL there?
>
> There's no BKL on (new) ioctls anymore, at least. A thread per
> device would be feasible perhaps. But if you want any sort of
> pipelining out of the device, esp. in the remote iSCSI case, you
> need to have a good number of commands outstanding to each device.
> So a thread per command per device. Typical iSCSI queue depth of
> 128 times 16 devices for a small setup is a lot of threads.

I was actually thinking of a thread per outstanding command.

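For scale, the figures Pete quotes can be worked through directly. The following is only a back-of-the-envelope sketch: the queue depth and device count come from the message above, while the 8 MiB stack reservation per thread is an assumption (a common Linux default, not something stated in this thread), and the function names are hypothetical.

```c
/* Figures quoted in the thread: iSCSI queue depth and device count. */
enum { ISCSI_QUEUE_DEPTH = 128, NUM_DEVICES = 16 };

/* Assumption (not from the thread): 8 MiB of stack reserved per thread. */
#define ASSUMED_STACK_MIB 8UL

/* With one blocking ioctl per thread, every outstanding command needs
 * its own thread, so the count is queue depth times devices. */
static unsigned long threads_needed(unsigned long queue_depth,
                                    unsigned long devices)
{
    return queue_depth * devices;
}

/* Virtual address space reserved for thread stacks, in MiB.
 * This is a reservation, not resident memory. */
static unsigned long stack_reservation_mib(unsigned long threads,
                                           unsigned long stack_mib)
{
    return threads * stack_mib;
}
```

Under these assumptions, threads_needed(ISCSI_QUEUE_DEPTH, NUM_DEVICES) gives 2048 threads, and stack_reservation_mib(2048, ASSUMED_STACK_MIB) gives 16384 MiB (16 GiB) of reserved address space for the "small setup" described.
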
> The pthread/pipe latency overhead is not insignificant for fast
> storage networks too.
>
> > > How about these new ioctls instead of read/write:
> > >
> > > SG_IO_SUBMIT - start a new blk_execute_rq_nowait()
> > > SG_IO_TEST - complete and return a previous req
> > > SG_IO_WAIT - wait for a req to finish, interruptibly
> > >
> > > Then old write users will instead do ioctl SUBMIT. Read users will
> > > do TEST for non-blocking fd, or WAIT for blocking. And SG_IO could
> > > be implemented as SUBMIT + WAIT.
> > >
> > > Then we can do compat_ioctl and convert up iovecs out-of-line before
> > > calling the normal functions.
> > >
> > > Let me know if you want a patch for this.
> >
> > Really, the thought of re-inventing yet another async I/O interface
> > isn't very appealing.
>
> I'm fine with read/write, except Tomo is against handling iovecs
> because of the compat complexity with struct iovec being different
> on 32- vs 64-bit. There is a standard way to do "compat" ioctl that
> hides this handling in a different file (not bsg.c), which is the
> only reason I'm even considering these ioctls. I don't care about
> compat setups per se.
>
> Is there another async I/O mechanism? Userspace builds the CDBs,
> just needs some way to drop them in SCSI ML. BSG is almost perfect
> for this, but doesn't do iovec, leading to lots of memcpy.

No, it's just that async interfaces in Linux have a long and fairly
unhappy history.

James
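
The compat concern quoted above (struct iovec being laid out differently for 32-bit and 64-bit callers) can be illustrated in userspace. This is a rough sketch, not the bsg or kernel compat code: struct iovec32 and widen_iovecs() are hypothetical names invented here, and the 32-bit layout shown (a 32-bit pointer followed by a 32-bit length) is what a 32-bit process would be expected to write.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Hypothetical mirror of an iovec as a 32-bit process lays it out:
 * 32-bit pointer, then 32-bit length, 8 bytes per entry. */
struct iovec32 {
    uint32_t iov_base;
    uint32_t iov_len;
};

/* Widen n 32-bit entries into native struct iovec entries.
 * On a 64-bit build the native entry is larger, so the array cannot
 * simply be reinterpreted in place; each field is converted. */
static void widen_iovecs(const struct iovec32 *in, struct iovec *out,
                         size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i].iov_base = (void *)(uintptr_t)in[i].iov_base;
        out[i].iov_len = in[i].iov_len;
    }
}
```

A conversion of this kind is what a compat path would have to run before handing the vector to the normal code, which is the extra handling the thread is trying to keep out of bsg.c.
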