Message-ID: <Pine.LNX.4.64.0807111612270.8297@devserv.devel.redhat.com>
Date: Fri, 11 Jul 2008 16:22:16 -0400 (EDT)
From: Mikulas Patocka <mpatocka@...hat.com>
To: David Miller <davem@...emloft.net>
cc: fujita.tomonori@....ntt.co.jp, sparclinux@...r.kernel.org,
linux-kernel@...r.kernel.org, jens.axboe@...cle.com
Subject: Re: [SUGGESTION]: drop virtual merge accounting in I/O requests
On Fri, 11 Jul 2008, David Miller wrote:
> From: FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>
> Date: Fri, 11 Jul 2008 20:15:52 +0900
>
>> On Fri, 11 Jul 2008 06:52:09 -0400 (EDT)
>> Mikulas Patocka <mpatocka@...hat.com> wrote:
>>
>>> On Fri, 11 Jul 2008, FUJITA Tomonori wrote:
>>>
>>>> Yeah, IOMMUs can't guarantee that. The majority of architectures set
>>>> BIO_VMERGE_BOUNDARY to 0 so they don't hit this, I think.
>>>
>>> Yes, the architectures without IOMMU don't hit this problem.
>>
>> I meant that even if some architectures support IOMMUs, they set
>> BIO_VMERGE_BOUNDARY to 0.
>
> Keep in mind that these settings were added long before
> we supported segment boundary restrictions.
>
> Someone added code to handle segment boundaries, but didn't
> fix any of the block I/O layer infrastructure :-)
>
> Several platforms that have IOMMU but set these values to zero
> actually did so for another reason. They considered being
> required to always merge page-adjacent mappings virtually too
> strong a requirement to meet %100 of the time.

It is broken on Sparc64 even without boundary restrictions: if the
allocator skips over an already allocated entry in the IOMMU table, the
mappings don't get merged either.
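
To illustrate the Sparc64 case, here is a toy Python model, not kernel code. It assumes a hypothetical first-fit allocator that maps one page at a time into the first free IOMMU table entry; the name map_pages and the table layout are made up for the sketch. Pages merge into one DMA segment only when they land in consecutive entries, so a single busy entry in the middle is enough to defeat the merge.

```python
def map_pages(iommu_table, num_pages):
    """Map pages into the first free entries of a toy IOMMU table.

    iommu_table is a list of booleans (True = entry already allocated).
    Returns the resulting DMA segments as (first_entry, length) tuples.
    """
    segments = []
    for _ in range(num_pages):
        # Hypothetical first-fit policy; raises ValueError if the table is full.
        entry = iommu_table.index(False)
        iommu_table[entry] = True
        if segments and segments[-1][0] + segments[-1][1] == entry:
            # Entry directly follows the previous one: extend the segment.
            first, length = segments[-1]
            segments[-1] = (first, length + 1)
        else:
            # A gap (an entry that was already in use) starts a new segment.
            segments.append((entry, 1))
    return segments


# Empty table: three pages merge into one segment.
print(map_pages([False] * 8, 3))                         # [(0, 3)]
# Entry 1 is busy: the allocator skips it and the merge is lost.
print(map_pages([False, True, False, False, False], 3))  # [(0, 1), (2, 2)]
```

In the second call the I/O layer would have counted one virtually merged segment, while the mapping actually produces two.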

I'd just drop it, because these requirements seem to me too brittle to
maintain. It is too easy to introduce a bug here and too hard to check for
it. Basically there are a few independent pieces of code (the I/O layer
and the arch-specific IOMMUs) that attempt to do the same calculation, and
if they differ, the driver will crash. Even if we managed to fix it,
someone would likely break it again after a year or two :-(
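
The hazard of two independent calculations can be sketched the same way. This is again a toy Python model under stated assumptions: PAGE_SIZE, the function names, and the rule that the IOMMU side splits a mapping at every multiple of the boundary are all illustrative, not the real block-layer or IOMMU code. One function plays the I/O layer and predicts the segment count, the other plays the IOMMU and produces the real count, and a driver that sized its scatter-gather table from the prediction breaks whenever the two disagree.

```python
PAGE_SIZE = 4096  # assumed page size for this model only


def predict_hw_segments(num_pages):
    """I/O-layer-style estimate: assume all page-adjacent mappings merge."""
    return 1 if num_pages else 0


def count_iommu_segments(start_addr, num_pages, boundary):
    """IOMMU-style result: no segment may cross a multiple of `boundary`."""
    segments = 0
    addr = start_addr
    remaining = num_pages * PAGE_SIZE
    while remaining:
        room = boundary - (addr % boundary)  # bytes left before the boundary
        chunk = min(room, remaining)
        segments += 1
        addr += chunk
        remaining -= chunk
    return segments


def fits_driver_table(start_addr, num_pages, boundary):
    """True if the real segment count fits the table sized from the estimate."""
    actual = count_iommu_segments(start_addr, num_pages, boundary)
    return actual <= predict_hw_segments(num_pages)


# Four pages well inside a 64k window: both sides agree on one segment.
print(fits_driver_table(0x0000, 4, 0x10000))  # True
# Same request starting one page below a 64k boundary: two real segments.
print(fits_driver_table(0xF000, 4, 0x10000))  # False
```

Nothing forces the two functions to stay in sync, which is the maintenance problem described above.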

Would it mean that the nr_hw_segments entry in bio and request could be
dropped too? Or is it used for some other purpose?

BTW, what's the reason that by default (without any driver intervention)
device DMA is restricted from crossing a 64k boundary?

Mikulas