linux-kernel - Re: [PATCH 9/9] x86/iommu: use dma_ops_list in get_dma

Message-ID: <20080926123243.GE27928@amd.com>
Date:	Fri, 26 Sep 2008 14:32:43 +0200
From:	Joerg Roedel <joerg.roedel@....com>
To:	Amit Shah <amit.shah@...hat.com>
CC:	linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
	iommu@...ts.linux-foundation.org,
	David Woodhouse <dwmw2@...radead.org>,
	Muli Ben-Yehuda <muli@...ibm.com>,
	Ingo Molnar <mingo@...hat.com>,
	FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>
Subject: Re: [PATCH 9/9] x86/iommu: use dma_ops_list in get_dma_ops

On Fri, Sep 26, 2008 at 04:19:51PM +0530, Amit Shah wrote:
> * On Friday 26 Sep 2008 14:29:24 Joerg Roedel wrote:
> > On Fri, Sep 26, 2008 at 01:26:19PM +0530, Amit Shah wrote:
> > > * On Monday 22 Sep 2008 23:51:21 Joerg Roedel wrote:
> > > > This patch enables stackable dma_ops on x86. To do this, it also
> > > > enables the per-device dma_ops on i386.
> > > >
> > > > Signed-off-by: Joerg Roedel <joerg.roedel@....com>
> > > > ---
> > > >  arch/x86/kernel/pci-dma.c     |   26 ++++++++++++++++++++++++++
> > > >  include/asm-x86/device.h      |    6 +++---
> > > >  include/asm-x86/dma-mapping.h |   14 +++++++-------
> > > >  3 files changed, 36 insertions(+), 10 deletions(-)
> > > >
> > > > diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
> > > > index b990fb6..2e517c2 100644
> > > > --- a/arch/x86/kernel/pci-dma.c
> > > > +++ b/arch/x86/kernel/pci-dma.c
> > > > @@ -82,6 +82,32 @@ void x86_register_dma_ops(struct dma_mapping_ops
> > > > *ops, write_unlock_irqrestore(&dma_ops_list_lock, flags);
> > > >  }
> > > >
> > > > +struct dma_mapping_ops *find_dma_ops_for_device(struct device *dev)
> > > > +{
> > > > +	int i;
> > > > +	unsigned long flags;
> > > > +	struct dma_mapping_ops *entry, *ops = NULL;
> > > > +
> > > > +	read_lock_irqsave(&dma_ops_list_lock, flags);
> > > > +
> > > > +	for (i = 0; i < DMA_OPS_TYPE_MAX; ++i)
> > > > +		list_for_each_entry(entry, &dma_ops_list[i], list) {
> > > > +			if (!entry->device_supported)
> > > > +				continue;
> > > > +			if (entry->device_supported(dev)) {
> > > > +				ops = entry;
> > > > +				goto out;
> > > > +			}
> > > > +		}
> > > > +out:
> > > > +	read_unlock_irqrestore(&dma_ops_list_lock, flags);
> > >
> > > For PVDMA, we want the "native" dma_ops to succeed first, eg, nommu, and
> > > then do our "PV DMA", which is just translating gpa  to hpa and then
> > > program the hardware. This isn't being done here. This can be done by
> > > extending the return type:
> > >
> > > DMA_DEV_NOT_SUPPORTED
> > > DMA_DEV_HANDLED
> > > DMA_DEV_PASS
> > >
> > > Where NOT_SUPPORTED means we should look for the next one in the chain
> > > (current return value 0), DEV_HANDLED means the dma operation has been
> > > handled successfully (current return value 1) and DEV_PASS means fall
> > > back to the next layer and then return back.
> >
> > I am not sure I fully understand what you mean? Why do we need to call
> > nommu handlers first for PVDMA devices?
> 
> For the usual dma_alloc_coherent, dma_map_single, etc. routines. They return 
> the gpa to the driver. We want to intercept this gpa and convert it to the 
> hpa before passing on the value to the driver. So our dma_alloc_coherent will 
> assume the real underlying alloc_coherent has succeeded and then make a 
> hypercall.
> 
> The PV dma_ops routines won't do the usual allocation, etc. that's already 
> done elsewhere.

Ok, the allocation only matters for dma_alloc_coherent. Fujita recently
introduced a generic software-based dma_alloc_coherent which you can
use for that. I think implementing PVDMA as its own dma_ops backend and
multiplexing it using my patches introduces less overhead than an
additional layer over the current dma_ops implementation.
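
A rough userspace model of the separate-backend idea, for illustration
only. The types here are simplified stand-ins for the kernel's, the
priority list replaces the locked dma_ops_list[] from the patch, and
pv_translate_gpa() is a hypothetical placeholder for the gpa-to-hpa
hypercall. The PV backend claims passthrough devices through
device_supported() and translates addresses itself, so no extra layer
sits on top of nommu:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only; none of these are the kernel's definitions. */
struct device {
	bool is_passthrough;	/* assumption: marks a PVDMA device */
};

struct dma_mapping_ops {
	const char *name;
	/* NULL means the lookup never selects this backend. */
	bool (*device_supported)(struct device *dev);
	uint64_t (*map_single)(struct device *dev, uint64_t paddr);
};

/* Hypothetical stand-in for the gpa -> hpa hypercall. */
static uint64_t pv_translate_gpa(uint64_t gpa)
{
	return gpa + 0x100000000ULL;
}

static bool pv_supported(struct device *dev)
{
	return dev->is_passthrough;
}

static uint64_t pv_map_single(struct device *dev, uint64_t paddr)
{
	(void)dev;
	return pv_translate_gpa(paddr);	/* driver gets the hpa directly */
}

static bool nommu_supported(struct device *dev)
{
	(void)dev;
	return true;			/* catch-all backend */
}

static uint64_t nommu_map_single(struct device *dev, uint64_t paddr)
{
	(void)dev;
	return paddr;			/* identity mapping */
}

static struct dma_mapping_ops pv_dma_ops = {
	.name = "pvdma",
	.device_supported = pv_supported,
	.map_single = pv_map_single,
};

static struct dma_mapping_ops nommu_dma_ops = {
	.name = "nommu",
	.device_supported = nommu_supported,
	.map_single = nommu_map_single,
};

/* Priority order: more specific backends come first. */
static struct dma_mapping_ops *dma_ops_list[] = {
	&pv_dma_ops,
	&nommu_dma_ops,
};

/* First backend that claims the device wins, as in the patch. */
static struct dma_mapping_ops *find_dma_ops_for_device(struct device *dev)
{
	size_t n = sizeof(dma_ops_list) / sizeof(dma_ops_list[0]);

	for (size_t i = 0; i < n; ++i) {
		struct dma_mapping_ops *entry = dma_ops_list[i];

		if (entry->device_supported && entry->device_supported(dev))
			return entry;
	}
	return NULL;
}
```

In this model a device with is_passthrough set resolves to pv_dma_ops
and every other device falls through to nommu_dma_ops.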

Two more questions about your approach: What happens if a
dma_alloc_coherent allocation crosses page boundaries and the gpas are
not contiguous in host memory? How will dma masks be handled?
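
To make both questions concrete, here is a small userspace sketch of
the two checks a PVDMA layer would need before handing an hpa to a
driver. Everything in it is an assumption for illustration: the 4 KiB
page size, the function names, and the two toy translation functions
(a real implementation would ask the hypervisor instead):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

typedef uint64_t (*gpa_to_hpa_fn)(uint64_t gpa);

/* Toy translation: guest memory is one linear block in the host. */
static uint64_t xlate_linear(uint64_t gpa)
{
	return gpa + 0x40000000ULL;
}

/* Toy translation: neighbouring guest pages are swapped in the host. */
static uint64_t xlate_scattered(uint64_t gpa)
{
	uint64_t pfn = gpa >> PAGE_SHIFT;

	return ((pfn ^ 1) << PAGE_SHIFT) | (gpa & ~PAGE_MASK);
}

/* True if [gpa, gpa + size) maps to one contiguous host range. */
static bool range_is_host_contiguous(uint64_t gpa, size_t size,
				     gpa_to_hpa_fn xlate)
{
	if (size == 0)
		return true;

	uint64_t first = gpa & PAGE_MASK;
	uint64_t last = (gpa + size - 1) & PAGE_MASK;
	uint64_t base = xlate(first);

	/* Each following guest page must land right after the previous one. */
	for (uint64_t p = first + PAGE_SIZE; p <= last; p += PAGE_SIZE)
		if (xlate(p) != base + (p - first))
			return false;
	return true;
}

/* True if the last byte of the host range is addressable under mask. */
static bool range_fits_dma_mask(uint64_t hpa, size_t size, uint64_t mask)
{
	if (size == 0)
		return true;
	return hpa + size - 1 <= mask;
}
```

A two-page buffer passes the contiguity check under xlate_linear but
fails it under xlate_scattered, which is the case where a single
translated address cannot describe the whole allocation. The mask check
is run on the hpa, since the device's limit applies to the address it
is actually programmed with.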

Joerg

-- 
           |           AMD Saxony Limited Liability Company & Co. KG
 Operating |         Wilschdorfer Landstr. 101, 01109 Dresden, Germany
 System    |                  Register Court Dresden: HRA 4896
 Research  |              General Partner authorized to represent:
 Center    |             AMD Saxony LLC (Wilmington, Delaware, US)
           | General Manager of AMD Saxony LLC: Dr. Hans-R. Deppe, Thomas McCoy
