Message-ID: <20180613164500-mutt-send-email-mst@kernel.org>
Date:   Wed, 13 Jun 2018 16:59:41 +0300
From:   "Michael S. Tsirkin" <mst@...hat.com>
To:     Christoph Hellwig <hch@...radead.org>
Cc:     Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Ram Pai <linuxram@...ibm.com>, robh@...nel.org,
        pawel.moll@....com, Tom Lendacky <thomas.lendacky@....com>,
        aik@...abs.ru, jasowang@...hat.com, cohuck@...hat.com,
        linux-kernel@...r.kernel.org,
        virtualization@...ts.linux-foundation.org, joe@...ches.com,
        "Rustad, Mark D" <mark.d.rustad@...el.com>,
        david@...son.dropbear.id.au, linuxppc-dev@...ts.ozlabs.org,
        elfring@...rs.sourceforge.net,
        Anshuman Khandual <khandual@...ux.vnet.ibm.com>
Subject: Re: [RFC V2] virtio: Add platform specific DMA API translation for
 virito devices
On Wed, Jun 13, 2018 at 12:41:41AM -0700, Christoph Hellwig wrote:
> On Mon, Jun 11, 2018 at 01:29:18PM +1000, Benjamin Herrenschmidt wrote:
> > At the risk of repeating myself, let's just do the first pass which is
> > to switch virtio over to always using the DMA API in the actual data
> > flow code, with a hook at initialization time that replaces the DMA ops
> > with some home cooked "direct" ops in the case where the IOMMU flag
> > isn't set.
> > 
> > This will be equivalent to what we have today but avoids having 2
> > separate code path all over the driver.
> > 
> > Then a second stage, I think, is to replace this "hook" so that the
> > architecture gets a say in the matter.
> 
> I don't think we can actually use dma_direct_ops.  It still allows
> architectures to override parts of the dma setup, which virtio seems
> to blindly assume phys == dma and not cache flushing.
> 
> I think the right way forward is to either add a new
> VIRTIO_F_IS_PCI_DEVICE (or redefine the existing iommu flag if deemed
> possible).
Given this is exactly what happens now, redefining the existing flag seems
possible, but maybe we want a non-PCI-specific name.
>  And then make sure recent qemu always sets it.
I don't think that part is going to happen, sorry.
Hypervisors can set it when they *actually have* a real PCI device.
People emulate systems whose DMA API carries a bunch of overhead that is
only required for real DMA. Your proposal would double that overhead
by first doing the work in the guest and then re-doing it in the host.
I don't think it's justified when 99% of the world doesn't need it.
-- 
MST
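
Editorial sketch, not part of the original message: the "first pass" quoted
above (a single data path that always goes through DMA ops, with the ops
chosen once at initialization time depending on the IOMMU flag) might look
roughly like the following. This is a userspace toy, not kernel code. Every
name prefixed with sketch_ is hypothetical, and the fixed-offset
"translation" is only a stand-in for whatever a platform would really do.
The one real constant is the feature bit number for VIRTIO_F_IOMMU_PLATFORM.

```c
#include <stddef.h>
#include <stdint.h>

/* Feature bit 33 is VIRTIO_F_IOMMU_PLATFORM in the virtio specification. */
#define VIRTIO_F_IOMMU_PLATFORM 33

/* Hypothetical, heavily simplified stand-in for a DMA ops table. */
struct sketch_dma_ops {
    uint64_t (*map)(uint64_t phys);
};

/* "Direct" ops: the device-visible address equals the guest physical one. */
static uint64_t sketch_direct_map(uint64_t phys)
{
    return phys;
}

/* Toy "platform" ops: pretend the platform translates by a fixed offset. */
#define SKETCH_PLATFORM_OFFSET 0x80000000ULL
static uint64_t sketch_platform_map(uint64_t phys)
{
    return phys + SKETCH_PLATFORM_OFFSET;
}

static const struct sketch_dma_ops sketch_direct_ops = {
    .map = sketch_direct_map,
};
static const struct sketch_dma_ops sketch_platform_ops = {
    .map = sketch_platform_map,
};

/* Hypothetical device: negotiated feature bits plus the ops picked at init. */
struct sketch_virtio_dev {
    uint64_t features;
    const struct sketch_dma_ops *dma_ops;
};

/* The init-time hook: decide once, so the data path has no second branch. */
static void sketch_virtio_init_dma(struct sketch_virtio_dev *dev)
{
    if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
        dev->dma_ops = &sketch_platform_ops;
    else
        dev->dma_ops = &sketch_direct_ops;
}

/* The data path always goes through the ops table. */
static uint64_t sketch_virtio_map(const struct sketch_virtio_dev *dev,
                                  uint64_t phys)
{
    return dev->dma_ops->map(phys);
}
```

With the flag clear, sketch_virtio_map() returns the physical address
unchanged, which is the "phys == dma" behaviour discussed in the thread. With
the flag set, the address goes through the toy platform translation. The
objection in the reply is about cost: if the flag were always set, every
mapping would pay for translation in the guest and then again in the host.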