Message-ID: <aQyBIohAuxNHV-XI@google.com>
Date: Thu, 6 Nov 2025 11:06:11 +0000
From: Mostafa Saleh <smostafa@...gle.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: Will Deacon <will@...nel.org>, linux-kernel@...r.kernel.org,
kvmarm@...ts.linux.dev, linux-arm-kernel@...ts.infradead.org,
iommu@...ts.linux.dev, maz@...nel.org, oliver.upton@...ux.dev,
joey.gouly@....com, suzuki.poulose@....com, yuzenghui@...wei.com,
catalin.marinas@....com, robin.murphy@....com,
jean-philippe@...aro.org, qperret@...gle.com, tabba@...gle.com,
mark.rutland@....com, praan@...gle.com
Subject: Re: [PATCH v4 15/28] iommu/arm-smmu-v3: Load the driver later in KVM
mode
On Wed, Nov 05, 2025 at 01:12:08PM -0400, Jason Gunthorpe wrote:
> On Wed, Nov 05, 2025 at 04:40:26PM +0000, Mostafa Saleh wrote:
> > However, that didn’t work because, as from Linux perspective the
> > nested driver was bound to all the SMMUs which means that any
> > device that is connected to an SMMUv3 has its dependencies met, which
> > caused those drivers to start probing without IOMMU ops.
>
> ??
>
> What code is doing this?
>
> If a struct device gets a fwspec attached to it then it should not
> permit any driver to probe until iommu_init_device() has
> succeeded. This broadly needs to work to support iommu drivers as
> modules that are loaded by the initrd.
>
> So the general principle of causing devices to not progress should
> already be there and work, if it doesn't then maybe it needs some
> fixing.
>
> I expect iommu_init_device() to fail on devices up until the actual
> iommu driver is loaded. iommu_fwspec_ops() should fail because
> iommu_from_fwnode() will not find fwnode in the iommu_device_list
> until the iommu subsystem driver is bound, the kvm driver cannot
> supply this.
>
> So where do things go wrong for you?

Thanks for the explanation. I had a closer look and indeed I was
confused: iommu_init_device() was failing because of .probe_device().
Because of device_set_node(), both devices now have the same fwnode,
so bus_find_device_by_fwnode(), called from arm_smmu_get_by_fwnode(),
was returning the wrong device.
Using driver_find_device_by_fwnode() instead seems to work, but that
makes me question the reliability of this approach.

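To illustrate the lookup ambiguity described above, here is a minimal userspace sketch, not kernel code. Every name in it (model_fwnode, model_device, model_bus_find_by_fwnode(), model_driver_find_by_fwnode()) is hypothetical, and it assumes a bus-wide lookup returns the first device whose fwnode matches while a per-driver lookup only considers devices bound to that driver:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; none of these are kernel types. */
struct model_fwnode {
	const char *name;
};

struct model_device {
	const char *name;
	const char *driver;			/* bound driver name, NULL if unbound */
	const struct model_fwnode *fwnode;
};

/* Bus-wide lookup: the first device with a matching fwnode wins. */
const struct model_device *
model_bus_find_by_fwnode(const struct model_device *devs, size_t n,
			 const struct model_fwnode *fwnode)
{
	for (size_t i = 0; i < n; i++)
		if (devs[i].fwnode == fwnode)
			return &devs[i];
	return NULL;
}

/* Per-driver lookup: only devices bound to `driver` are considered. */
const struct model_device *
model_driver_find_by_fwnode(const struct model_device *devs, size_t n,
			    const char *driver,
			    const struct model_fwnode *fwnode)
{
	for (size_t i = 0; i < n; i++)
		if (devs[i].driver && strcmp(devs[i].driver, driver) == 0 &&
		    devs[i].fwnode == fwnode)
			return &devs[i];
	return NULL;
}
```

With two devices sharing one fwnode, the bus-wide model returns whichever comes first, while the per-driver model picks the one bound to the named driver. The per-driver variant still depends on the binding state, which is one reason the approach may feel fragile.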
>
> > It seems device links are not the right tool to use.
>
> Yes
>
> > So far, the requirements we need to satisfy are:
> > 1- No driver should bind to the SMMUs before KVM initialises.
>
> Using the above I'd expect a sequence where the KVM SMMU driver loads
> first, it does its bit, then once KVM is happy it creates the actual
> SMMU driver which registers in iommu_device_list and triggers driver
> binding.
>
> This is basically an identical sequence to loading an iommu driver
> from the initrd - just the trigger for the delayed load is the kvm
> creating the device, not udev running.

The SMMUv3 driver built as a module won't be a problem, as modules are
loaded after KVM initialises. The problem is mainly with a built-in
SMMUv3 driver: I don't think there is a way to delay loading of the
driver besides this patch, which registers the driver later in the KVM
case.

>
> > 2- Check if KVM is initialised from the SMMUv3 driver,
> > if not -EPROBE_DEFER (as Will suggested), that will be guarded by the
> > KVM driver macro and cmdline to enable protected mode.
>
> SMMUv3 driver shouldn't even be bound until KVM is ready and it is an
> actual working driver? Do this by not creating the struct device until
> it is ready.
>
> Also Greg will not like if you use platform devices here, use an aux
> device..
>

But I am not sure if it is possible to delay the binding with built-in
drivers.

Also, I had to use platform devices for this, as the KVM driver binds
to the actual SMMUv3 nodes and then duplicates them so the SMMUv3
driver can bind to the duplicate nodes, with the KVM devices as the
parents. This approach seems complicated, on top of the problems
mentioned above.

The other approach would be to keep deferring in the KVM case:

@@ -4454,6 +4454,10 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
 	struct arm_smmu_device *smmu;
 	struct device *dev = &pdev->dev;
+	if (IS_ENABLED(CONFIG_ARM_SMMU_V3_PKVM) && is_protected_kvm_enabled() &&
+	    !static_branch_unlikely(&kvm_protected_mode_initialized))
+		return -EPROBE_DEFER;

That works for me. And if we want to back the KVM driver with a device,
I was thinking we can rely on impl_ops, which has 2 benefits:
1- The SMMUv3 devices can be the parent instead of KVM.
2- The KVM devices can be faux/aux as they are not coming from FW and
   don't need to be on the platform bus.
And this is simpler.

Besides this approach and the one in this patch, I don't see a simple way
of achieving this without adding extra support in the driver model/platform
bus to express such a dependency.
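
For reference, the deferral check in the diff above can be modelled in userspace roughly as follows. This is only a sketch, not kernel code: model_state, model_smmu_probe() and model_probe_with_retries() are hypothetical names, the two booleans stand in for is_protected_kvm_enabled() and the kvm_protected_mode_initialized static key, and the retry loop is a simplified stand-in for the driver core re-running deferred probes.

```c
#include <assert.h>
#include <stdbool.h>

/* Same numeric value as the kernel's EPROBE_DEFER; redefined for userspace. */
#define MODEL_EPROBE_DEFER 517

/* Hypothetical stand-ins for the state checked in the diff. */
struct model_state {
	bool pkvm_enabled;	/* models is_protected_kvm_enabled() */
	bool pkvm_initialized;	/* models kvm_protected_mode_initialized */
};

/* Defer while protected KVM is enabled but has not finished initialising. */
int model_smmu_probe(const struct model_state *st)
{
	if (st->pkvm_enabled && !st->pkvm_initialized)
		return -MODEL_EPROBE_DEFER;
	return 0;
}

/*
 * Retry the probe up to max_attempts times. KVM is marked initialised
 * once more than ready_after attempts have been made. Returns the
 * attempt number on which the probe succeeded, or -1 if it never did.
 */
int model_probe_with_retries(struct model_state *st, int ready_after,
			     int max_attempts)
{
	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		if (attempt > ready_after)
			st->pkvm_initialized = true;
		if (model_smmu_probe(st) == 0)
			return attempt;
	}
	return -1;
}
```

In this model the probe succeeds on the first attempt when protected mode is off, and otherwise keeps deferring until KVM reports that it is initialised.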
Thanks,
Mostafa
> Jason