Message-ID: <20080926101954.GW2677@kernel.dk>
Date: Fri, 26 Sep 2008 12:19:56 +0200
From: Jens Axboe <jens.axboe@...cle.com>
To: Alan Cox <alan@...rguk.ukuu.org.uk>
Cc: marty <martyleisner@...oo.com>, linux-kernel@...r.kernel.org,
martin.leisner@...ox.com
Subject: Re: disk IO directly from PCI memory to block device sectors

On Fri, Sep 26 2008, Alan Cox wrote:
> On Fri, 26 Sep 2008 11:11:35 +0200
> Jens Axboe <jens.axboe@...cle.com> wrote:
>
> > On Fri, Sep 26 2008, Alan Cox wrote:
> > > > What I'm looking is for a more generic/driver independent way of sticking
> > > > contents of PCI ram onto a disk.
> > >
> > > Ermm seriously why not have a userspace task with the PCI RAM mmapped
> > > and just use write() like normal sane people do ?
> >
> > To avoid the fault and copy, I would assume.
>
> It's a write to a raw partition so with O_DIRECT you won't have to copy
> and MAP_POPULATE will premap the object if even the first write wants to
> occur without faulting overhead.

You are still going through get_user_pages() for each write. As I would
imagine the writes would generally be large, the hit would not be too
bad (but it's still there).

Depending on the hardware, it may or may not be a big deal. But the path
from device to disk is definitely a lot longer and more complex with the
mmap/write approach.
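
For reference, a minimal user-space sketch of the mmap/write approach
discussed above. It assumes the caller has already opened the source
(hypothetically the PCI device's mmap-able node) and the destination
(hypothetically a raw partition opened with O_DIRECT); the function name
write_mapped_region() is made up for illustration. Whether a given
driver's mapping can actually be used as an O_DIRECT source depends on how
the driver implements mmap, which this thread does not settle.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map `len` bytes of src_fd and write them to dst_fd at its current
 * file offset. Returns 0 on success, -1 on error with errno set.
 *
 * Assumption: if dst_fd was opened with O_DIRECT, `len` and the file
 * offset must satisfy the device's alignment requirements.
 */
int write_mapped_region(int src_fd, size_t len, int dst_fd)
{
    /* MAP_POPULATE asks the kernel to pre-fault the mapping, so the
     * first write() does not pay the faulting overhead. */
    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED | MAP_POPULATE,
                     src_fd, 0);
    if (map == MAP_FAILED)
        return -1;

    const char *p = map;
    size_t left = len;
    int ret = 0;

    while (left > 0) {
        /* With O_DIRECT the copy is avoided, but every write() still
         * pins the user pages via get_user_pages(). */
        ssize_t n = write(dst_fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            ret = -1;
            break;
        }
        if (n == 0) {
            errno = EIO;
            ret = -1;
            break;
        }
        p += n;
        left -= (size_t)n;
    }

    /* Preserve the write() error across munmap(). */
    int saved = errno;
    munmap(map, len);
    errno = saved;
    return ret;
}
```

The cost pointed out above sits inside the loop: one trip through
get_user_pages() per write() call, so larger writes amortize it better.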

Another alternative would be using splice: if the PCI device exposed a
char device node, you could support ->splice_read() there, which would
just fill the pages into the pipe buffer. Then change the block device
fops ->splice_write() to go directly to the block device through a bio
instead of using the page cache based generic_file_splice_write(). Such
a change would actually make sense to do if the block device has been
opened with O_DIRECT. And it would get you about the same performance as
doing it in-kernel; the only extra overhead would be two syscalls per
64k (well, probably only one extra syscall, since you probably need an
ioctl/syscall to initiate the in-kernel activity as well). So just about
as free as you could get.
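
A rough sketch of what the user-space side of the splice approach could
look like. It assumes in_fd refers to a char device whose driver
implements ->splice_read() and out_fd to a block device opened with
O_DIRECT; both kernel-side pieces are the proposal above, not something
this code provides. The name splice_copy() and the CHUNK constant are
hypothetical, with CHUNK chosen to match the 64k figure mentioned above.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK (64 * 1024) /* assumed pipe capacity: 64k */

/*
 * Move up to `len` bytes from in_fd to out_fd through a pipe.
 * Returns the number of bytes moved (short on EOF), or -1 on error
 * with errno set.
 */
ssize_t splice_copy(int in_fd, int out_fd, size_t len)
{
    int pfd[2];
    size_t total = 0;
    int failed = 0;

    if (pipe(pfd) < 0)
        return -1;

    while (total < len && !failed) {
        size_t want = len - total < CHUNK ? len - total : CHUNK;

        /* Syscall 1: source -> pipe (the source's ->splice_read()). */
        ssize_t in = splice(in_fd, NULL, pfd[1], NULL, want,
                            SPLICE_F_MOVE);
        if (in < 0) {
            failed = 1;
            break;
        }
        if (in == 0)
            break; /* EOF on the source */

        /* Syscall 2: pipe -> destination (its ->splice_write()).
         * Loop in case the destination takes less than offered. */
        while (in > 0) {
            ssize_t out = splice(pfd[0], NULL, out_fd, NULL,
                                 (size_t)in, SPLICE_F_MOVE);
            if (out <= 0) {
                if (out == 0)
                    errno = EIO;
                failed = 1;
                break;
            }
            in -= out;
            total += (size_t)out;
        }
    }

    /* Preserve the splice() error across close(). */
    int saved = errno;
    close(pfd[0]);
    close(pfd[1]);
    errno = saved;
    return failed ? -1 : (ssize_t)total;
}
```

In the common case each 64k chunk costs the two splice() calls counted
above; the pipe only carries page references between them.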
--
Jens Axboe