Message-ID:
<SN6PR02MB4157FB57619785BAA50B3586D4A6A@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Thu, 4 Dec 2025 03:35:44 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Tianyu Lan <ltykernel@...il.com>, Christoph Hellwig <hch@...radead.org>,
Robin Murphy <robin.murphy@....com>
CC: "kys@...rosoft.com" <kys@...rosoft.com>, "haiyangz@...rosoft.com"
<haiyangz@...rosoft.com>, "wei.liu@...nel.org" <wei.liu@...nel.org>,
"decui@...rosoft.com" <decui@...rosoft.com>, "longli@...rosoft.com"
<longli@...rosoft.com>, "vdso@...bites.dev" <vdso@...bites.dev>, Tianyu Lan
<tiala@...rosoft.com>, "linux-hyperv@...r.kernel.org"
<linux-hyperv@...r.kernel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>
Subject: RE: [RFC PATCH] Drivers: hv: Confidential VMBus exernal memory
support

From: Tianyu Lan <ltykernel@...il.com> Sent: Wednesday, December 3, 2025 6:21 AM
>
> On Sat, Nov 29, 2025 at 1:47 AM Michael Kelley <mhklinux@...look.com> wrote:
> >
> > From: Tianyu Lan <ltykernel@...il.com> Sent: Monday, November 24, 2025 10:29 AM
[snip]
> >
> > Here's my idea for an alternate approach. The goal is to allow use of the
> > swiotlb to be disabled on a per-device basis. A device is initialized for swiotlb
> > usage by swiotlb_dev_init(), which sets dev->dma_io_tlb_mem to point to the
> > default swiotlb memory. For VMBus devices, the calling sequence is
> > vmbus_device_register() -> device_register() -> device_initialize() ->
> > swiotlb_dev_init(). But if vmbus_device_register() could override the
> > dev->dma_io_tlb_mem value and put it back to NULL, swiotlb operations
> > would be disabled on the device. Furthermore, is_swiotlb_force_bounce()
> > would return "false", and the normal DMA functions would not force the
> > use of bounce buffers. The entire code change looks like this:
> >
> > --- a/drivers/hv/vmbus_drv.c
> > +++ b/drivers/hv/vmbus_drv.c
> > @@ -2133,11 +2133,15 @@ int vmbus_device_register(struct hv_device *child_device_obj)
> >  	child_device_obj->device.dma_mask = &child_device_obj->dma_mask;
> >  	dma_set_mask(&child_device_obj->device, DMA_BIT_MASK(64));
> >
> > +	device_initialize(&child_device_obj->device);
> > +	if (child_device_obj->channel->co_external_memory)
> > +		child_device_obj->device.dma_io_tlb_mem = NULL;
> > +
> >  	/*
> >  	 * Register with the LDM. This will kick off the driver/device
> >  	 * binding...which will eventually call vmbus_match() and vmbus_probe()
> >  	 */
> > -	ret = device_register(&child_device_obj->device);
> > +	ret = device_add(&child_device_obj->device);
> >  	if (ret) {
> >  		pr_err("Unable to register child device\n");
> >  		put_device(&child_device_obj->device);
> >
> > I've only compile tested the above since I don't have an environment where
> > I can test Confidential VMBus. You would need to verify whether my thinking
> > is correct and this produces the intended result.
>
> Thanks Michael. I tested it and it seems to hit an issue. Will double check with
> HCL/paravisor team.
>
> We considered such a change before. From Roman's previous patch, it seems to
> need to change phys_to_dma() and force_dma_unencrypted().

In a Hyper-V SEV-SNP VM with a paravisor, I assert that phys_to_dma() and
__phys_to_dma() do the same thing. phys_to_dma() calls dma_addr_encrypted(),
which does __sme_set(). But in a Hyper-V VM using vTOM, sme_me_mask is
always 0, so dma_addr_encrypted() is a no-op. dma_addr_unencrypted() and
dma_addr_canonical() are also no-ops. See include/linux/mem_encrypt.h. So
in a Hyper-V SEV-SNP VM, the DMA layer doesn't change anything related to
encryption when translating between a physical address and a DMA address.
The same is true for a Hyper-V TDX VM with a paravisor.
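
The no-op argument above can be checked with a small userspace model. This is
only a sketch, not kernel code: every model_* name is hypothetical, the mask
helpers imitate what __sme_set() and __sme_clr() do with sme_me_mask, and the
stand-in for __phys_to_dma() assumes an identity mapping with no DMA offset.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t model_phys_addr_t;
typedef uint64_t model_dma_addr_t;

/* Models sme_me_mask. Per the discussion, it is 0 in a Hyper-V vTOM guest. */
static uint64_t model_sme_me_mask = 0;

/* Imitate __sme_set() / __sme_clr(): OR in or mask out the encryption bit. */
static model_dma_addr_t model_sme_set(model_dma_addr_t a)
{
	return a | model_sme_me_mask;
}

static model_dma_addr_t model_sme_clr(model_dma_addr_t a)
{
	return a & ~model_sme_me_mask;
}

/* Stand-in for __phys_to_dma(): identity mapping (assumption: no DMA offset). */
static model_dma_addr_t model_raw_phys_to_dma(model_phys_addr_t p)
{
	return p;
}

/* Stand-in for phys_to_dma(): raw translation plus dma_addr_encrypted(). */
static model_dma_addr_t model_phys_to_dma(model_phys_addr_t p)
{
	return model_sme_set(model_raw_phys_to_dma(p));
}

/* Stand-in for phys_to_dma_unencrypted(): raw translation plus mask clear. */
static model_dma_addr_t model_phys_to_dma_unencrypted(model_phys_addr_t p)
{
	return model_sme_clr(model_raw_phys_to_dma(p));
}

/* With a zero mask all three translations agree; with a nonzero mask they do not. */
static void model_check_noop(void)
{
	model_phys_addr_t p = 0x123456000ULL;

	model_sme_me_mask = 0;
	assert(model_phys_to_dma(p) == model_raw_phys_to_dma(p));
	assert(model_phys_to_dma_unencrypted(p) == model_raw_phys_to_dma(p));

	/* Contrast case: an arbitrary nonzero mask (bit 47 chosen for illustration). */
	model_sme_me_mask = 1ULL << 47;
	assert(model_phys_to_dma(p) != model_raw_phys_to_dma(p));

	model_sme_me_mask = 0;
}
```

Calling model_check_noop() from a test harness exercises both cases. The
contrast case is there only to show that the equivalence depends on the mask
being zero.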

force_dma_unencrypted() will indeed return "true", and it is used in
phys_to_dma_direct(). But both return paths in phys_to_dma_direct() return the
same result because of dma_addr_unencrypted() and dma_addr_encrypted()
being no-ops. Other uses of force_dma_unencrypted() are only in the
dma_alloc_*() paths, but dma_alloc_*() isn't used by VMBus devices because
the device control structures are in the ring buffer, which, as you have noted, is
already handled separately. So for the moment, I don't think the return value
from force_dma_unencrypted() matters.
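
The two return paths of phys_to_dma_direct() can be modeled the same way. Again
this is a hedged userspace sketch rather than the kernel implementation: the
sketch_* names are hypothetical, the force_dma_unencrypted() result is passed in
as a plain boolean, and the raw physical-to-DMA translation is assumed to be the
identity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models sme_me_mask; 0 for the Hyper-V vTOM case described above. */
static uint64_t sketch_me_mask = 0;

/* Models of dma_addr_encrypted() / dma_addr_unencrypted(). */
static uint64_t sketch_dma_addr_encrypted(uint64_t a)
{
	return a | sketch_me_mask;
}

static uint64_t sketch_dma_addr_unencrypted(uint64_t a)
{
	return a & ~sketch_me_mask;
}

/*
 * Shape of phys_to_dma_direct(): pick the unencrypted translation when
 * force_dma_unencrypted() says so, otherwise the encrypted one.
 * The raw translation is assumed to be the identity here.
 */
static uint64_t sketch_phys_to_dma_direct(bool force_unencrypted, uint64_t phys)
{
	if (force_unencrypted)
		return sketch_dma_addr_unencrypted(phys);
	return sketch_dma_addr_encrypted(phys);
}

/* True when the result is independent of the force_dma_unencrypted() value. */
static bool sketch_paths_agree(uint64_t phys)
{
	return sketch_phys_to_dma_direct(true, phys) ==
	       sketch_phys_to_dma_direct(false, phys);
}

/* Self-check: with a zero mask the two paths give the same DMA address. */
static void sketch_check_paths(void)
{
	sketch_me_mask = 0;
	assert(sketch_paths_agree(0x1000ULL));
}
```

With sketch_me_mask left at 0, sketch_paths_agree() holds for any address.
Setting the mask to a nonzero value makes the two paths diverge, which is the
situation that does not arise with vTOM.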

So I'm guessing something else unexpected is happening such that just disabling
the swiotlb on a per-device basis doesn't work. Assuming that Roman's original
patch actually worked, I'm trying to figure out how my idea is different in a way
that has a material effect on things. And if your patch works by going directly to
__phys_to_dma(), it should also work when using phys_to_dma() instead.

I will try a few experiments on a normal Confidential VM (i.e., without Confidential
VMBus) to confirm that my conclusions from reading the code really are correct.
FWIW, I'm looking at the linux-next-20251119 code base.

Michael
>
> >
> > Directly setting dma_io_tlb_mem to NULL isn't great. It would be better
> > to add an exported function swiotlb_dev_disable() to swiotlb code that sets
> > dma_io_tlb_mem to NULL, but you get the idea.
> >
> > Other reviewers may still see this approach as a bit of a hack, but it's a lot
> > less of a hack than introducing Hyper-V specific DMA functions.
> > swiotlb_dev_disable() is conceptually needed for TDISP devices, as TDISP
> > devices must similarly protect confidentiality by not allowing use of the swiotlb.
> > So adding swiotlb_dev_disable() is a step in the right direction, even if the
> > eventual TDISP code does it slightly differently. Doing the disable on a
> > per-device basis is also the right thing in the long run.
> >