linux-kernel - Re: [Intel IOMMU][patch 3/8] Generic hardware support for Intel IOMMU.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 25 Apr 2007 09:17:31 +0800
From:	Shaohua Li <shaohua.li@...el.com>
To:	Andi Kleen <ak@...e.de>
Cc:	Ashok Raj <ashok.raj@...el.com>, linux-kernel@...r.kernel.org,
	akpm@...l.org, gregkh@...e.de, muli@...ibm.com,
	asit.k.mallick@...el.com, suresh.b.siddha@...el.com,
	anil.s.keshavamurthy@...el.com, arjan@...ux.intel.com
Subject: Re: [Intel IOMMU][patch 3/8] Generic hardware support for Intel
	IOMMU.

On Tue, 2007-04-24 at 21:27 +0200, Andi Kleen wrote:
> On Tuesday 24 April 2007 08:03:02 Ashok Raj wrote:
> >
> > +#ifdef CONFIG_DMAR
> > +#ifdef CONFIG_SMP
> > +static void dmar_msi_set_affinity(unsigned int irq, cpumask_t mask)
> 
> 
> Why does it need an own interrupt type?
> 
> > +
> > +config IOVA_GUARD_PAGE
> > +	bool "Enables gaurd page when allocating IO Virtual Address for IOMMU"
> > +	depends on DMAR
> > +
> > +config IOVA_NEXT_CONTIG
> > +	bool "Keeps IOVA allocations consequent between allocations"
> > +	depends on DMAR && EXPERIMENTAL
> 
> Needs reference to Intel and better description
> 
> The file should have a high level description what it is good for etc.
> 
> Need high level overview over what locks protects what and if there
> is a locking order.
> 
> It doesn't seem to enable sg merging? Since you have enough space 
> that should work.
We actually have a patch to do sg merge. In my test, it doesn't have any
performance gain.

> > +static char *fault_reason_strings[] =
> > +{
> > +	"Software",
> > +	"Present bit in root entry is clear",
> > +	"Present bit in context entry is clear",
> > +	"Invalid context entry",
> > +	"Access beyond MGAW",
> > +	"PTE Write access is not set",
> > +	"PTE Read access is not set",
> > +	"Next page table ptr is invalid",
> > +	"Root table address invalid",
> > +	"Context table ptr is invalid",
> > +	"non-zero reserved fields in RTP",
> > +	"non-zero reserved fields in CTP",
> > +	"non-zero reserved fields in PTE",
> > +	"Unknown"
> > +};
> > +
> > +#define MAX_FAULT_REASON_IDX 	(12)
> 
> 
> You got 14 of them. better use ARRAY_SIZE
> 
> > +#define IOMMU_NAME_LEN		(7)
> > +
> > +struct iommu {
> 
> call it intel_iommu or somesuch even when it's private.
> 
> > +static int __init intel_iommu_setup(char *str)
> > +{
> > +	if (!str)
> > +		return -EINVAL;
> > +	while (*str) {
> > +		if (!strncmp(str, "off", 3)) {
> > +			dmar_disabled = 1;
> > +			printk(KERN_INFO"Intel-IOMMU: disabled\n");
> > +		}
> > +		str += strcspn(str, ",");
> > +		while (*str == ',')
> > +			str++;
> > +	}
> > +	return 0;
> > +}
> > +__setup("intel_iommu=", intel_iommu_setup);
> 
> Why can't you just use the normal iommu=off for this? 
iommu=off disable all iommu, intel_iommu=off just disables intel_iommu.
Isn't possible people want to use other iommu like swiotlb?
> > +
> > +#define MIN_PGTABLE_PAGES	(10)
> > +static mempool_t *pgtable_mempool;
> > +#define MIN_DOMAIN_REQ		(20)
> > +static mempool_t *domain_mempool;
> > +#define MIN_DEVINFO_REQ		(20)
> > +static mempool_t *devinfo_mempool;
> 
> Lots of mempools. How much memory does this pin?
> 
> > +
> > +#define alloc_pgtable_page() mempool_alloc(pgtable_mempool, GFP_ATOMIC)
> > +#define free_pgtable_page(vaddr) mempool_free(vaddr, pgtable_mempool)
> > +#define alloc_domain_mem() mempool_alloc(domain_mempool, GFP_ATOMIC)
> > +#define free_domain_mem(vaddr) mempool_free(vaddr, domain_mempool)
> > +#define alloc_devinfo_mem() mempool_alloc(devinfo_mempool, GFP_ATOMIC)
> > +#define free_devinfo_mem(vaddr) mempool_free(vaddr, devinfo_mempool)
> 
> Do we need the macros? Better expand them in the caller.
> 
> > +static void __iommu_flush_cache(struct iommu *iommu, void *addr, int size)
> > +{
> > +	if (!ecap_coherent(iommu->ecap))
> > +		clflush_cache_range(addr, size);
> > +}
> > +
> > +#define iommu_flush_cache_entry(iommu, addr) \
> > +	__iommu_flush_cache(iommu, addr, 8)
> > +#define iommu_flush_cache_page(iommu, addr) \
> > +	__iommu_flush_cache(iommu, addr, PAGE_SIZE_4K)
> 
> Similar.
> 
> And the 8 should be probably something more descriptive (sizeof?)
> 
> > +/* context entry handling */
> > +static struct context_entry * device_to_context_entry(struct iommu *iommu,
> > +		u8 bus, u8 devfn)
> > +{
> > +	struct root_entry *root;
> > +	struct context_entry *context;
> > +	unsigned long phy_addr;
> > +	unsigned long flags;
> > +
> > +	spin_lock_irqsave(&iommu->lock, flags);
> > +	root = &iommu->root_entry[bus];
> > +	if (!root_present(*root)) {
> > +		phy_addr = (unsigned long)alloc_pgtable_page();
> 
> A GFP_ATOMIC mempool is rather useless. mempool only works if it can block
> for someone else freeing memory and if it can't do that it's not failsafe.
> I'm afraid you need to revise the allocation strategy -- best would be
> to somehow move the memory allocations outside the spinlock paths
> and preallocate if possible.
The problem is pci_map_single and friends usually called with interrupt
disabled or spin locked, so we must use GFP_ATOMIC.

> Same problem in other code.
> 
> > +		if (!dma_pte_present(*pte)) {
> > +			tmp = alloc_pgtable_page();
> 
> Please don't name variable tmp. I know some other code does it, but it's
> just bad style imho.
> 
> 
> > +	/* Make sure hardware complete it */
> > +	start_time = jiffies;
> > +	while (1) {
> > +		sts = dmar_readl(iommu->reg, DMAR_GSTS_REG);
> > +		if (sts & DMA_GSTS_RTPS)
> > +			break;
> > +		if (time_after(jiffies, start_time + DMAR_OPERATION_TIMEOUT))
> > +			panic("DMAR hardware is malfunctional, please disable IOMMU\n");
> > +		cpu_relax();
> > +	}
> 
> Could MWAIT/MONITOR be used for this?
MWAIT/MONITOR just works for cached memory. And this is memory mapped io.

Thanks,
Shaohua
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/