Message-ID: <20180607052306.GA1532@infradead.org>
Date: Wed, 6 Jun 2018 22:23:06 -0700
From: Christoph Hellwig <hch@...radead.org>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: Anshuman Khandual <khandual@...ux.vnet.ibm.com>,
Ram Pai <linuxram@...ibm.com>, robh@...nel.org, aik@...abs.ru,
jasowang@...hat.com, linux-kernel@...r.kernel.org,
virtualization@...ts.linux-foundation.org, hch@...radead.org,
joe@...ches.com, linuxppc-dev@...ts.ozlabs.org,
elfring@...rs.sourceforge.net, david@...son.dropbear.id.au,
cohuck@...hat.com, pawel.moll@....com,
Tom Lendacky <thomas.lendacky@....com>,
"Rustad, Mark D" <mark.d.rustad@...el.com>
Subject: Re: [RFC V2] virtio: Add platform specific DMA API translation for
virito devices
On Thu, May 31, 2018 at 08:43:58PM +0300, Michael S. Tsirkin wrote:
> Pls work on a long term solution. Short term needs can be served by
> enabling the iommu platform in qemu.
So, I spent some time looking at converting virtio to dma ops overrides,
and at the current virtio spec, and the sad truth I have to tell is that
both the spec and the Linux implementation are completely and utterly
fucked up.
Both the flag naming and the implementation imply that DMA API == IOMMU,
which is fundamentally wrong.
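To make this concrete: the decision whether to use the DMA API at all is
keyed purely off the VIRTIO_F_IOMMU_PLATFORM feature bit.  A minimal
sketch of the logic, modeled on drivers/virtio/virtio_ring.c (simplified,
and the exact details vary between kernel versions):

	#include <linux/virtio.h>
	#include <linux/virtio_config.h>
	#include <xen/xen.h>

	static bool vring_use_dma_api(struct virtio_device *vdev)
	{
		/*
		 * Use the DMA API only if the device negotiated
		 * VIRTIO_F_IOMMU_PLATFORM (i.e. the "iommu quirk" is
		 * absent) - so "no IOMMU flag" is taken to mean
		 * "no DMA API at all", which is exactly the problematic
		 * equivalence.
		 */
		if (!virtio_has_iommu_quirk(vdev))
			return true;

		/* Xen guests always need the DMA API for grant mappings. */
		if (xen_domain())
			return true;

		return false;
	}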
The DMA API does a few different things (a short sketch of a driver using
it follows the list):
a) address translation

	This does include IOMMUs, but it also includes random offsets
	between PCI BARs and system memory that we see on various
	platforms.  Worse, some of these offsets might be based on
	banks, e.g. on the broadcom bmips platform.  It also deals
	with bitmasks in physical addresses related to memory encryption
	like AMD SEV.  I'd be really curious how, for example, the
	Intel virtio based NIC is going to work on any of those
	platforms.
b) coherency

	On many architectures DMA is not cache coherent, and we need
	to invalidate and/or write back cache lines before doing DMA.
	Again, I wonder how this is ever going to work with hardware
	based virtio implementations.  Even worse, I think this is
	actually broken at least for VIVT caches even for virtualized
	implementations: e.g. a KVM guest is going to access memory
	using different virtual addresses than qemu, and vhost might
	throw in yet another address space.
c) bounce buffering

	Many DMA implementations can not address all physical memory
	due to addressing limitations.  In such cases we copy the
	DMA memory into a known addressable bounce buffer and DMA
	from there.
d) flushing write combining buffers or similar

	On some hardware platforms we need workarounds, e.g. reading
	from a certain mmio address, to make sure DMA can actually
	see memory written by the host.
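For comparison, a driver that simply goes through the DMA API gets all of
a) through d) handled for it in one place.  A minimal, hypothetical sketch
(the device pointer and buffer here are placeholders, not virtio code):

	#include <linux/dma-mapping.h>

	static int example_dma_transfer(struct device *dev, void *buf, size_t len)
	{
		dma_addr_t dma_addr;

		/*
		 * dma_map_single() applies any address translation (IOMMU,
		 * bus offsets, encryption bits), bounces the buffer if the
		 * device cannot address it, and does the cache maintenance
		 * needed on non-coherent platforms.
		 */
		dma_addr = dma_map_single(dev, buf, len, DMA_FROM_DEVICE);
		if (dma_mapping_error(dev, dma_addr))
			return -ENOMEM;

		/* ... hand dma_addr to the device and wait for completion ... */

		/* Unmapping syncs caches / copies back from a bounce buffer. */
		dma_unmap_single(dev, dma_addr, len, DMA_FROM_DEVICE);
		return 0;
	}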
All of this is bypassed by virtio by default, despite these generally
being platform issues rather than something particular to a given device.
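Concretely, when VIRTIO_F_IOMMU_PLATFORM is not negotiated the ring code
just hands the device raw guest physical addresses instead of going
through any of the above.  Roughly (again simplified from
drivers/virtio/virtio_ring.c, details vary by kernel version):

	static dma_addr_t vring_map_one_sg(const struct vring_virtqueue *vq,
					   struct scatterlist *sg,
					   enum dma_data_direction direction)
	{
		if (!vring_use_dma_api(vq->vq.vdev))
			/*
			 * Bypass: use the guest physical address directly,
			 * skipping translation, cache maintenance and
			 * bounce buffering.
			 */
			return (dma_addr_t)sg_phys(sg);

		return dma_map_page(vring_dma_dev(vq), sg_page(sg), sg->offset,
				    sg->length, direction);
	}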