[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250406114130-mutt-send-email-mst@kernel.org>
Date: Sun, 6 Apr 2025 14:28:08 -0400
From: "Michael S. Tsirkin" <mst@...hat.com>
To: David Woodhouse <dwmw2@...radead.org>
Cc: Christoph Hellwig <hch@...radead.org>, virtio-comment@...ts.linux.dev,
Claire Chang <tientzu@...omium.org>,
linux-devicetree <devicetree@...r.kernel.org>,
Rob Herring <robh+dt@...nel.org>,
Jörg Roedel <joro@...tes.org>,
iommu@...ts.linux-foundation.org, linux-kernel@...r.kernel.org,
graf@...zon.de
Subject: Re: [RFC PATCH 1/3] content: Add VIRTIO_F_SWIOTLB to negotiate use
of SWIOTLB bounce buffers
On Fri, Apr 04, 2025 at 12:15:52PM +0100, David Woodhouse wrote:
> On Fri, 2025-04-04 at 06:37 -0400, Michael S. Tsirkin wrote:
> > On Fri, Apr 04, 2025 at 11:15:33AM +0100, David Woodhouse wrote:
> > > On Fri, 2025-04-04 at 10:27 +0100, David Woodhouse wrote:
> > > > On 4 April 2025 09:32:39 BST, "Michael S. Tsirkin" <mst@...hat.com>
> > > > wrote:
> > > > > On Fri, Apr 04, 2025 at 09:16:44AM +0100, David Woodhouse wrote:
> > > > > > On Fri, 2025-04-04 at 04:09 -0400, Michael S. Tsirkin wrote:
> > > > > > > On Fri, Apr 04, 2025 at 08:50:47AM +0100, David Woodhouse
> > > > > > > wrote:
> > > > > > > > What's annoying is that this should work out of the box
> > > > > > > > *already* with
> > > > > > > > virtio-mmio and a `restricted-dma-pool` — for systems which
> > > > > > > > aren't
> > > > > > > > afflicted by UEFI/ACPI/PCI as their discovery mechanisms.
> > > > > > >
> > > > > > >
> > > > > > > That specifically would be just a driver bugfix then?
> > > > > >
> > > > > > I actually think it works out of the box and there isn't even a
> > > > > > bug to
> > > > > > fix. Haven't tested yet.
> > > > > >
> > > > > > The sad part is that the system does it all automatically *if* it
> > > > > > has
> > > > > > CONFIG_DMA_RESTRICTED_POOL (e.g. Linux) and the driver never even
> > > > > > notices that the dma_ops it's using are the swiotlb ops using the
> > > > > > provided buffer.
> > > > > >
> > > > > > Which is *kind* of nice... except that when on a guest OS which
> > > > > > *isn't*
> > > > > > Linux with CONFIG_DMA_RESTRICTED_POOL, the guest will just ignore
> > > > > > the
> > > > > > `restricted-dma-pool` node and try DMA to system memory anyway,
> > > > > > which
> > > > > > will fail.
> > > > >
> > > > > I mean, it's easy to misconfigure Linux, this is why we love it ;)
> > > > > Why
> > > > > is this such a concern?
> > > >
> > > > Because it's incompatible. In the DT world, perhaps this new *non-
> > > > optional* feature/restriction should have come with a new
> > > > "compatible" string such as "virtio-mmio-restricted-dma".
> > > >
> > > > Adding it without backwards compatibility wasn't ideal.
> > > >
> > > > > > That's why my proposal adds the negotiated VIRTIO_F_SWIOTLB
> > > > > > feature, so
> > > > > > that the device side can refuse, if the guest *isn't* agreeing to
> > > > > > use
> > > > > > the bounce buffer in the situations where it must do so.
> > > > >
> > > > >
> > > > > OTOH then setting this feature and if you make the device force it,
> > > > > you are breaking guests restricted-dma-pool which worked
> > > > > previously, no?
> > > >
> > > > Yes. So a platform offering virtio-mmio with restricted DMA, if the
> > > > driver doesn't accept the offered VIRTIO_F_SWIOTLB, may want to
> > > > accept that negotiation anyway, and *hope* that the driver/OS are
> > > > going to use the buffer anyway.
> > > >
> > > > I just didn't want to make that same mistake again when formalising
> > > > and documenting this, and especially when attempting to extend it to
> > > > PCI.
> > >
> > > Of course, the beauty of the restricted-dma-pool as supported by DT is
> > > that it's a *system* memory buffer, which is actually OK as long as
> > > it's reserved address space and not just part of normal system memory
> > > that an unsuspecting guest might use for general purposes. So the
> > > trusted part of the hypervisor (e.g. pKVM) can *allow* the VMM access
> > > to that space.
> > >
> > > It doesn't *have* to be on-device. That just seemed like the more
> > > natural way to do it for PCI.
> > >
> > > I suppose we *could* allow for the virtio-pci transport to do it the
> > > same way as virtio-mmio though. The VIRTIO_PCI_CAP_SWIOTLB capability¹
> > > could reference a range of system memory space, just like the
> > > `restricted-dma-pool` property does.
> > >
> > > It's a weird abstraction especially for a physical PCI device to do
> > > that because the system memory space is outside its ownership. But in a
> > > physical device it could be writable, and you could consider it the
> > > responsibility of the system firmware to configure it appropriately, in
> > > accordance with the IOMMU and other DMA restrictions of the platform.
> > >
> > > That does solve it for the CoCo case without addressing the P2P staging
> > > case that Christoph mentions, though.
> > >
> > >
> > > ¹ I will rename it, Christoph, if it survives at all. Probably
> > > VIRTIO_F_RESTRICTED_DMA and VIRTIO_PCI_CAP_RESTRICTED_DMA but of course
> > > it depends on the semantics we conclude it should have.
> >
> > OK. So basically, all this does, is a promise by driver to only
> > DMA into a range of memory?
>
> Basically, yes.
>
> > This part, I get. I wouldn't put it in a capability, just in config
> > space then.
>
> Sure... but how? There are some things which are defined to be at fixed
> locations in config space, like the vendor/device IDs, COMMAND, STATUS,
> BARs, etc..
>
> And for the rest of the optional things which might be in config space
> of a given device... isn't that exactly what capabilities are for?
Sorry I am unclear. Not the pci config space. The virtio config space.
After admin_queue_num ?
Powered by blists - more mailing lists