Message-ID: <23f232a1-f511-d2fe-b1f8-5fd32b3a1a8f@arm.com>
Date: Thu, 17 Mar 2022 13:42:56 +0000
From: Robin Murphy <robin.murphy@....com>
To: Mika Westerberg <mika.westerberg@...ux.intel.com>
Cc: "michael.jamet@...el.com" <michael.jamet@...el.com>,
"linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"andreas.noever@...il.com" <andreas.noever@...il.com>,
"iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
"Limonciello, Mario" <Mario.Limonciello@....com>,
"YehezkelShB@...il.com" <YehezkelShB@...il.com>,
"hch@....de" <hch@....de>
Subject: Re: [PATCH] thunderbolt: Stop using iommu_present()

On 2022-03-17 08:08, Mika Westerberg wrote:
> Hi Robin,
>
> On Wed, Mar 16, 2022 at 07:17:57PM +0000, Robin Murphy wrote:
>> The feeling I'm getting from all this is that if we've got as far as
>> iommu_dma_protection_show() then it's really too late to meaningfully
>> mitigate bad firmware.
>
> Note, these are requirements from Microsoft in order for the system to
> use the "Kernel DMA protection". Because of this, the likelihood of "bad
> firmware" should be quite low since these systems ship with Windows
> installed, so they should get at least some sort of validation that this
> actually works.
>
>> We should be able to detect missing
>> untrusted/external-facing properties as early as nhi_probe(), and if we
>> could go into "continue at your own risk" mode right then *before* anything
>> else happens, it all becomes a lot easier to reason about.
>
> I think what we want is that the DMAR opt-in bit is set in the ACPI
> tables and that we know the full IOMMU translation is happening for the
> devices behind "external facing ports". If that's not the case,
> iommu_dma_protection_show() should return 0, meaning userspace can
> ask the user whether the connected device is allowed to use DMA (e.g.
> whether PCIe is tunneled or not).

Ah, if it's safe to just say "no protection" in the case that we don't
know for sure, that's even better. Clearly I hadn't quite grasped that
aspect of the usage model, thanks for the nudge!

> We do check for the DMAR bit in the Intel IOMMU code, and we also
> check that there actually are PCIe ports marked external facing, but we
> could issue a warning there if that's not the case. Similarly if the user
> explicitly disabled the IOMMU translation. This can be done inside a new
> IOMMU API that does something like the below pseudo-code:
>
> #if IOMMU_ENABLED
> bool iommu_dma_protected(struct device *dev)
> {
> 	if (dmar_platform_optin() /* or the AMD equivalent */) {
> 		if (!iommu_present(...)) /* whatever is needed to check that the full translation is enabled */
> 			dev_warn(dev, "IOMMU protection disabled!");
> 		/*
> 		 * Look for the external facing ports. Should be at
> 		 * least 1 or issue warning.
> 		 */
> 		...
>
> 		return true;
> 	}
>
> 	return false;
> }
> #else
> static inline bool iommu_dma_protected(struct device *dev)
> {
> 	return false;
> }
> #endif
>
> Then we can make iommu_dma_protection_show() call this function.

The problem that I've been trying to nail down here is that
dmar_platform_optin() really doesn't mean much for us - I don't know how
Windows' IOMMU drivers work, but there's every chance they don't work the
same way as ours. The only material effect that dmar_platform_optin()
has for us is to prevent the user from disabling the IOMMU driver
altogether, and thus ensure that iommu_present() is true. Whether or not
we can actually trust the IOMMU driver to provide reliable protection
depends entirely on whether it knows the PCIe ports are external-facing.
If not, we can only *definitely* know what the IOMMU driver will do for
a given endpoint once that endpoint has appeared behind the port and
iommu_probe_device() has decided what its default domain should be, and
as far as I now understand, that's not an option for Thunderbolt since
it can only happen *after* the tunnel has been authorised and created.
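
To make the dependency on external-facing ports concrete, here is a minimal
userspace sketch in C. It is not kernel code and not the eventual fix: the
names struct model_dev, model_dma_protected() and model_protection_show() are
hypothetical, and the parent pointer only stands in for the PCIe hierarchy.
It assumes the rule discussed above: report protection only when the IOMMU
driver is active *and* an upstream port is known to be external-facing, and
report 0 in every "don't know" case.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace model of a PCIe device; not a kernel structure. */
struct model_dev {
	const char *name;
	const struct model_dev *parent;	/* upstream port, NULL at the root */
	bool external_facing;		/* firmware marked this port external-facing */
};

/*
 * Report protection only if the IOMMU driver is active and some upstream
 * port is known to be external-facing. Every other case is treated as
 * "not protected", so userspace can ask the user before authorising a tunnel.
 */
static bool model_dma_protected(const struct model_dev *dev, bool iommu_active)
{
	const struct model_dev *port;

	if (!iommu_active)
		return false;

	for (port = dev->parent; port; port = port->parent) {
		if (port->external_facing)
			return true;
	}

	return false;
}

/* Format the result the way a sysfs attribute would: "0\n" or "1\n". */
static int model_protection_show(char *buf, size_t len,
				 const struct model_dev *dev, bool iommu_active)
{
	return snprintf(buf, len, "%d\n", model_dma_protected(dev, iommu_active));
}
```

The point of the sketch is that the answer is computed from information
available at probe time, before any endpoint exists behind the port, so it
does not depend on what iommu_probe_device() later decides for a tunnelled
device.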

Much as I'm tempted to de-scope back to my IOMMU API cleanup and run
away from the rest of the issue, I think I can crib enough from the
existing code to attempt a reasonably complete fix, so let me give that
a go...

Thanks,
Robin.