[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <47A75B8A.3020503@vlnb.net>
Date: Mon, 04 Feb 2008 21:38:02 +0300
From: Vladislav Bolkhovitin <vst@...b.net>
To: James Bottomley <James.Bottomley@...senPartnership.com>
CC: Bart Van Assche <bart.vanassche@...il.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>,
linux-scsi@...r.kernel.org, scst-devel@...ts.sourceforge.net,
linux-kernel@...r.kernel.org
Subject: Re: Integration of SCST in the mainstream Linux kernel
James Bottomley wrote:
> On Mon, 2008-02-04 at 20:56 +0300, Vladislav Bolkhovitin wrote:
>
>>James Bottomley wrote:
>>
>>>On Mon, 2008-02-04 at 20:16 +0300, Vladislav Bolkhovitin wrote:
>>>
>>>
>>>>James Bottomley wrote:
>>>>
>>>>
>>>>>>>>So, James, what is your opinion on the above? Or the overall SCSI target
>>>>>>>>project simplicity doesn't matter much for you and you think it's fine
>>>>>>>>to duplicate Linux page cache in the user space to keep the in-kernel
>>>>>>>>part of the project as small as possible?
>>>>>>>
>>>>>>>
>>>>>>>The answers were pretty much contained here
>>>>>>>
>>>>>>>http://marc.info/?l=linux-scsi&m=120164008302435
>>>>>>>
>>>>>>>and here:
>>>>>>>
>>>>>>>http://marc.info/?l=linux-scsi&m=120171067107293
>>>>>>>
>>>>>>>Weren't they?
>>>>>>
>>>>>>No, sorry, it doesn't look so for me. They are about performance, but
>>>>>>I'm asking about the overall project's architecture, namely about one
>>>>>>part of it: simplicity. Particularly, what do you think about
>>>>>>duplicating Linux page cache in the user space to have zero-copy cached
>>>>>>I/O? Or can you suggest another architectural solution for that problem
>>>>>>in the STGT's approach?
>>>>>
>>>>>
>>>>>Isn't that an advantage of a user space solution? It simply uses the
>>>>>backing store of whatever device supplies the data. That means it takes
>>>>>advantage of the existing mechanisms for caching.
>>>>
>>>>No, please reread this thread, especially this message:
>>>>http://marc.info/?l=linux-kernel&m=120169189504361&w=2. This is one of
>>>>the advantages of the kernel space implementation. The user space
>>>>implementation has to have data copied between the cache and user space
>>>>buffer, but the kernel space one can use pages in the cache directly,
>>>>without extra copy.
>>>
>>>
>>>Well, you've said it thrice (the bellman cried) but that doesn't make it
>>>true.
>>>
>>>The way a user space solution should work is to schedule mmapped I/O
>>>from the backing store and then send this mmapped region off for target
>>>I/O. For reads, the page gather will ensure that the pages are up to
>>>date from the backing store to the cache before sending the I/O out.
>>>For writes, You actually have to do a msync on the region to get the
>>>data secured to the backing store.
>>
>>James, have you checked how fast is mmaped I/O if work size > size of
>>RAM? It's several times slower comparing to buffered I/O. It was many
>>times discussed in LKML and, seems, VM people consider it unavoidable.
>
>
> Erm, but if you're using the case of work size > size of RAM, you'll
> find buffered I/O won't help because you don't have the memory for
> buffers either.
James, just check and you will see, buffered I/O is a lot faster.
>>So, using mmaped IO isn't an option for high performance. Plus, mmaped
>>IO isn't an option for high reliability requirements, since it doesn't
>>provide a practical way to handle I/O errors.
>
> I think you'll find it does ... the page gather returns -EFAULT if
> there's an I/O error in the gathered region.
Err, to whom return? If you try to read from a mmaped page, which can't
be populated due to I/O error, you will get SIGBUS or SIGSEGV, I don't
remember exactly. It's quite tricky to get back to the faulted command
from the signal handler.
Or do you mean mmap(MAP_POPULATE)/munmap() for each command? Do you
think that such mapping/unmapping is good for performance?
> msync does something
> similar if there's a write failure.
>
>>>You also have to pull tricks with
>>>the mmap region in the case of writes to prevent useless data being read
>>>in from the backing store.
>>
>>Can you be more exact and specify what kind of tricks should be done for
>>that?
>
> Actually, just avoid touching it seems to do the trick with a recent
> kernel.
Hmm, how can one write to an mmaped page and don't touch it?
> James
>
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists