[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20070611112950.GA18203@araj-amd64.jf.intel.com>
Date: Mon, 11 Jun 2007 04:29:50 -0700
From: Ashok Raj <ashok.raj@...el.com>
To: Andi Kleen <ak@...e.de>
Cc: "Keshavamurthy, Anil S" <anil.s.keshavamurthy@...el.com>,
Christoph Lameter <clameter@....com>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel@...r.kernel.org, gregkh@...e.de, muli@...ibm.com,
asit.k.mallick@...el.com, suresh.b.siddha@...el.com,
arjan@...ux.intel.com, ashok.raj@...el.com, shaohua.li@...el.com,
davem@...emloft.net
Subject: Re: [Intel-IOMMU 02/10] Library routine for pre-allocat pool handling
On Tue, Jun 12, 2007 at 12:25:57AM +0200, Andi Kleen wrote:
>
> > Please advice.
>
> I think the short term only safe option would be to fully preallocate an aperture.
> If it is too small you can try GFP_ATOMIC but it would be just
> a unreliable fallback. For safety you could perhaps have some kernel thread
> that tries to enlarge it in the background depending on current
> use. That would be not 100% guaranteed to keep up with load,
> but would at least keep up if the system is not too busy.
>
> That is basically what your resource pools do, but they seem
> to be unnecessarily convoluted for the task :- after all you
> could just preallocate the page tables and rewrite/flush them without
> having some kind of allocator inbetween, can't you?
Each iommu has multiple domains, where each domain represents an
address space. PCIexpress endpoints can be located on its own domain
for addr protection reasons, and also have its own tag for iotlb cache.
each addr space can be either a 3 or 4 level. So it would be hard to predict
how much to setup ahead of time for each domain/device.
Its not a simple single level table with a small window like the gart case.
Just keeping a pool of page sized pages its easy to respond and use where its
really necessary without having to lock pages down without knowing real demand.
The addr space is plenty, so growing on demand is the best use of memory
available.
> If you make the start value large enough (256+MB?) that might reasonably
> work. How much memory in page tables would that take? Or perhaps scale
> it with available memory or available devices.
>
> In theory it could also be precomputed from the block/network device queue
> lengths etc.; the trouble is just such checks would need to be added to all kinds of
> other odd subsystems that manage devices too. That would be much more work.
>
> Some investigation how to do sleeping block/network submit would be
> also interesting (e.g. replace the spinlocks there with mutexes and see how
> much it affects performance). For networking you would need to keep
> at least a non sleeping path though because packets can be legally
> submitted from interrupt context. If it works out then sleeping
> interfaces to the IOMMU code could be added.
>
> -Andi
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists