Message-ID: <f54f46399aa2d0066231d95ef9e98526cf217115.camel@infradead.org>
Date: Mon, 07 Apr 2025 08:54:46 +0100
From: David Woodhouse <dwmw2@...radead.org>
To: Christoph Hellwig <hch@...radead.org>
Cc: "Michael S. Tsirkin" <mst@...hat.com>, virtio-comment@...ts.linux.dev,
Claire Chang <tientzu@...omium.org>, linux-devicetree
<devicetree@...r.kernel.org>, Rob Herring <robh+dt@...nel.org>,
Jörg Roedel <joro@...tes.org>,
iommu@...ts.linux-foundation.org, linux-kernel@...r.kernel.org,
graf@...zon.de
Subject: Re: [RFC PATCH 1/3] content: Add VIRTIO_F_SWIOTLB to negotiate use
of SWIOTLB bounce buffers
On Mon, 2025-04-07 at 00:30 -0700, Christoph Hellwig wrote:
> On Fri, Apr 04, 2025 at 12:15:52PM +0100, David Woodhouse wrote:
> > We could achieve that by presenting the device with a completely new
> > PCI device/vendor ID so that old drivers don't match, or in the DT
> > model you could make a new "compatible" string for it. I chose to use a
> > VIRTIO_F_ bit for it instead, which seemed natural and allows the
> > device model (under the influence of the system integrator) to *choose*
> > whether a failure to negotiate that bit is fatal or not.
>
> Stop thinking about devices. Your CoCo VM will have that exact same
> limitation for all devices, because none of them can DMA into random
> memory.

Nah, most of them are just fine because they're actual passthrough PCI
devices behind a proper 2-stage IOMMU.

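To make the negotiation policy from my quoted 4 April text concrete, here is a
minimal sketch in C of how a device model might decide whether a driver's
feature acknowledgement is acceptable. Everything here is hypothetical and for
illustration only: the type and function names are invented, and the feature
bit is passed in as a mask because its number isn't stated in the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result codes; not taken from any spec or implementation. */
enum negotiate_result {
    NEG_OK,                  /* driver accepted the SWIOTLB feature */
    NEG_OK_WITHOUT_SWIOTLB,  /* driver declined it, device tolerates that */
    NEG_FAILED               /* negotiation rejected */
};

/* Hypothetical per-device policy, set by the system integrator. */
struct device_policy {
    uint64_t offered;        /* feature bits the device offers */
    uint64_t swiotlb_mask;   /* mask for the SWIOTLB bit (value assumed) */
    bool swiotlb_mandatory;  /* is failing to negotiate it fatal? */
};

static enum negotiate_result
check_features(const struct device_policy *p, uint64_t driver_acked)
{
    /* A driver must not accept bits the device never offered. */
    if (driver_acked & ~p->offered)
        return NEG_FAILED;

    /* The driver understands the bounce-buffer restriction. */
    if (driver_acked & p->swiotlb_mask)
        return NEG_OK;

    /* Old driver: the device model chooses whether that is fatal. */
    return p->swiotlb_mandatory ? NEG_FAILED : NEG_OK_WITHOUT_SWIOTLB;
}
```

The point of the sketch is only that the fatal/non-fatal decision lives in the
device model's policy, not in the driver.
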
> > So on x86 it might be an e820-reserved region for example.
>
> Hasn't e820 been replaced with something more "elaborate" for UEFI systems
> anyway?

Sure, it's no longer literally a BIOS call with 0xe820 in any registers
but Linux still calls it that.

> > I don't think we want the guest OS just *assuming* that there's usable
> > memory in that e820-reserved region, just because some device says that
> > it's actually capable of DMA to those addresses.
> >
> > So it would probably want a separate object, like the separate
> > `restricted-dma-pool` in DT, which explicitly identifies that range as
> > a DMA bounce-buffer pool. We probably *can* do that even in ACPI with a
> > PRP0001 device today, using a `restricted-dma-pool` compatible
> > property.
> >
> > Then the OS would need to spot this range in the config space, and say
> > "oh, I *do* have a swiotlb pool this device can reach", and use that.
>
> Yes, that's largely how it should work.

The problem in ACPI is matching the device to that SWIOTLB pool. I
think we can expose a `restricted-dma-pool` node via PRP0001 but then
we need to associate a particular device (or set of devices) with that
pool. In DT we do that by referencing it from a `memory-region` property
of the device node itself.

The idea above was that the affected devices would state that they are
only capable of DMA to that range, and that's how we'd match them up.
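
As a rough illustration of that matching idea (not a proposal for actual
kernel code), here is a minimal C sketch. It assumes each device advertises a
single DMA-reachable window and that the OS has a list of candidate
bounce-buffer pools; a pool matches if it lies entirely inside the device's
window. The names `dma_range`, `range_contains` and `match_pool` are made up
for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: a physical address range, used for pools and windows. */
struct dma_range {
    uint64_t base;
    uint64_t size;
};

/* True if 'inner' lies entirely within 'outer', without overflowing. */
static bool range_contains(const struct dma_range *outer,
                           const struct dma_range *inner)
{
    uint64_t off;

    if (inner->size == 0 || inner->base < outer->base)
        return false;

    off = inner->base - outer->base;
    return off <= outer->size && inner->size <= outer->size - off;
}

/*
 * Return the index of the first pool the device can reach in full,
 * or -1 if no pool falls inside the device's DMA window.
 */
static int match_pool(const struct dma_range *dev_window,
                      const struct dma_range *pools, size_t npools)
{
    assert(pools != NULL || npools == 0);

    for (size_t i = 0; i < npools; i++) {
        if (range_contains(dev_window, &pools[i]))
            return (int)i;
    }
    return -1;
}
```

With pools at 0x80000000 and 0xC0000000 (4 MiB each) and a device that says it
can only reach 4 MiB at 0xC0000000, `match_pool()` picks the second pool. A
device whose window covers neither pool gets -1, and the OS would have to
refuse to drive it rather than guess.
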