Message-ID: <c08d3fd2bdae1b0fa629ecd9261a5ca9549ce9aa.camel@infradead.org>
Date: Mon, 07 Apr 2025 11:09:54 +0100
From: David Woodhouse <dwmw2@...radead.org>
To: Christoph Hellwig <hch@...radead.org>
Cc: "Michael S. Tsirkin" <mst@...hat.com>, virtio-comment@...ts.linux.dev, 
 Claire Chang <tientzu@...omium.org>, linux-devicetree
 <devicetree@...r.kernel.org>, Rob Herring <robh+dt@...nel.org>,
 Jörg Roedel <joro@...tes.org>, 
 iommu@...ts.linux-foundation.org, linux-kernel@...r.kernel.org,
 graf@...zon.de
Subject: Re: [RFC PATCH 1/3] content: Add VIRTIO_F_SWIOTLB to negotiate use
 of SWIOTLB bounce buffers

On Mon, 2025-04-07 at 02:05 -0700, Christoph Hellwig wrote:
> On Mon, Apr 07, 2025 at 08:54:46AM +0100, David Woodhouse wrote:
> > On Mon, 2025-04-07 at 00:30 -0700, Christoph Hellwig wrote:
> > > On Fri, Apr 04, 2025 at 12:15:52PM +0100, David Woodhouse wrote:
> > > > We could achieve that by presenting the device with a completely new
> > > > PCI device/vendor ID so that old drivers don't match, or in the DT
> > > > model you could make a new "compatible" string for it. I chose to use a
> > > > VIRTIO_F_ bit for it instead, which seemed natural and allows the
> > > > device model (under the influence of the system integrator) to *choose*
> > > > whether a failure to negotiate such bit is fatal or not.
> > > 
> > > Stop thinking about devices.  Your CoCo VM will have that exact same
> > > limitation for all devices, because none of them can DMA into random
> > > memory.
> > 
> > Nah, most of them are just fine because they're actual passthrough PCI
> > devices behind a proper 2-stage IOMMU.
> 
> Except for all virtual devices.

Yes, that's what I'm saying.

And that's also why it's reasonable to have a solution which handles
this for virtio devices, without necessarily having to handle it for
*arbitrary* emulated PCI devices across the whole system, and without
having to change core concepts of DMA handling across all operating
systems.

This isn't just about Linux guests, and especially not just about Linux
guests running 6.16+ kernels.

A solution which can live in a device driver is a *lot* easier to
actually get into the hands of users. Not just Windows users, but even
the slower-moving Linux distros.

> > > > Then the OS would need to spot this range in the config space, and say
> > > > "oh, I *do* have a swiotlb pool this device can reach", and use that.
> > > 
> > > Yes, that's largely how it should work.
> > 
> > The problem in ACPI is matching the device to that SWIOTLB pool. I
> > think we can expose a `restricted-dma-pool` node via PRP0001 but then
> > we need to associate a particular device (or set of devices) to that
> > pool. In DT we do that by referencing it from a `memory-region` node of
> > the device itself.
> 
> I don't think you actually _need_ to have an explicit device vs pool
> match.  All pools in host memory (assuming there is more than one)
> should be usable for all devices bar actual addressing limits that are
> handled in the dma layer already.  The only things you need are:
> 
>  a) a way to declare one or more pools
>  b) a way to distinguish between devices behind a two stage IOMMU vs not
>     to figure out if they need to use a pool

I'm not averse to that, but it's different to the `restricted-dma-pool`
model that's defined today which has explicit matching. So I'd like to
reconcile them — and preferably literally use PRP0001 to expose
`restricted-dma-pool` even under ACPI.

Maybe it's as simple as a flag/property on the `restricted-dma-pool`
node which declares that it's a 'catch-all', and that *all* devices
which aren't explicitly bound to an IOMMU or other DMA operations (e.g.
explicitly bound to a different restricted-dma-pool) should use it?

That's actually what of_dma_configure_id() does today even with the
existing `restricted-dma-pool` which *is* explicitly matched to a
device; the pool still only gets used if of_iommu_configure() doesn't
find a better option.
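
To make that ordering concrete, here is a minimal standalone sketch of the
decision. It is a model only, not the kernel code: the struct, its fields and
choose_dma_path() are hypothetical names invented for illustration, and the
'catch-all' flag is the proposal above, not something that exists today.

```c
#include <stdbool.h>

/* Hypothetical model (not kernel code): the DMA path chosen for a device. */
enum dma_path {
    DMA_PATH_IOMMU,           /* device is translated by an IOMMU */
    DMA_PATH_RESTRICTED_POOL, /* device must bounce through a restricted pool */
    DMA_PATH_DIRECT           /* no restriction known; plain direct DMA */
};

/* Hypothetical per-device description, as firmware tables might convey it. */
struct dev_desc {
    bool behind_iommu;   /* an IOMMU mapping was found for this device */
    bool has_own_pool;   /* device explicitly references a restricted pool */
    bool catch_all_pool; /* assumed: a system-wide 'catch-all' pool exists */
};

/*
 * The IOMMU wins if present. Only otherwise is a pool considered, whether
 * it is explicitly matched to the device or is the proposed catch-all.
 */
enum dma_path choose_dma_path(const struct dev_desc *d)
{
    if (d->behind_iommu)
        return DMA_PATH_IOMMU;
    if (d->has_own_pool || d->catch_all_pool)
        return DMA_PATH_RESTRICTED_POOL;
    return DMA_PATH_DIRECT;
}
```

In this model a passthrough device behind the IOMMU never touches the pool,
even when a catch-all pool is declared, while a virtio device with no IOMMU
mapping falls through to it.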

But I would also be entirely OK with following the existing model and
having the virtio device itself provide a reference to the restricted-
dma-pool. Either by its address, or by some other handle/reference.

Referencing it by address, which is what Michael and I were discussing,
is simple enough. And in this model it *is* a device restriction — that
virtio device knows full well that it can only perform DMA to that
range of addresses.

And it also allows for a standalone device driver to go check for the
corresponding ACPI device, claim that memory and set up the bounce
buffering for *itself*, in operating systems which don't support it.
Although I suppose a standalone device driver can do the same even in
the 'catch-all' model.

Either way, we'd still ideally want a virtio feature bit to say "don't
touch me if you don't understand my DMA restrictions", to prevent older
drivers (on older operating systems) from failing.

(I know you say that's a 'restriction' not a 'feature', but that's just
fine. That's why it's a *negotiation* not simply the device advertising
optional features which the driver may choose to use, or ignore, as it
sees fit.)
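
As a rough illustration of that negotiation, the device-side check might look
something like the sketch below. Everything in it is an assumption made for
the example: the bit number is a placeholder (not a value from the spec or
the patch), and device_check_features() and the swiotlb_mandatory policy flag
are hypothetical names standing in for whatever the device model does when
the driver writes back its accepted feature set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit number, for illustration only; not an assigned value. */
#define HYPOTHETICAL_SWIOTLB_BIT 50

enum negotiate_result { NEGOTIATE_OK, NEGOTIATE_FAILED };

/*
 * Device-side view of feature negotiation.
 *   offered:           feature bits the device advertised
 *   accepted:          feature bits the driver wrote back
 *   swiotlb_mandatory: system integrator's policy; if true, a driver that
 *                      does not accept the SWIOTLB bit is refused
 */
enum negotiate_result device_check_features(uint64_t offered,
                                            uint64_t accepted,
                                            bool swiotlb_mandatory)
{
    const uint64_t swiotlb = UINT64_C(1) << HYPOTHETICAL_SWIOTLB_BIT;

    /* A driver may only accept bits that were actually offered. */
    if (accepted & ~offered)
        return NEGOTIATE_FAILED;

    /* An old driver ignores the bit; policy decides whether that is fatal. */
    if ((offered & swiotlb) && !(accepted & swiotlb) && swiotlb_mandatory)
        return NEGOTIATE_FAILED;

    return NEGOTIATE_OK;
}
```

With swiotlb_mandatory set, a driver which doesn't know the bit fails
negotiation cleanly instead of attempting DMA the device cannot perform;
with it clear, the same driver is allowed to proceed.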

