Message-ID: <b8d1069c-ebbc-4700-adf8-69810bef6c0a@linux.intel.com>
Date: Thu, 5 Sep 2024 10:49:43 +0800
From: Baolu Lu <baolu.lu@...ux.intel.com>
To: "Tian, Kevin" <kevin.tian@...el.com>, Joerg Roedel <joro@...tes.org>,
Will Deacon <will@...nel.org>, Robin Murphy <robin.murphy@....com>
Cc: baolu.lu@...ux.intel.com, "Saarinen, Jani" <jani.saarinen@...el.com>,
"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [PATCH 1/1] iommu/vt-d: Prevent boot failure with devices
requiring ATS
On 9/4/24 4:17 PM, Tian, Kevin wrote:
>> From: Baolu Lu <baolu.lu@...ux.intel.com>
>> Sent: Wednesday, September 4, 2024 3:50 PM
>>
>> On 2024/9/4 14:49, Tian, Kevin wrote:
>>>> From: Lu Baolu <baolu.lu@...ux.intel.com>
>>>> Sent: Wednesday, September 4, 2024 2:07 PM
>>>>
>>>> SOC-integrated devices on some platforms require their PCI ATS enabled
>>>> for operation when the IOMMU is in scalable mode. Those devices are
>>>> reported via ACPI/SATC table with the ATC_REQUIRED bit set in the Flags
>>>> field.
>>>>
>>>> The PCI subsystem offers the 'pci=noats' kernel command to disable PCI
>>>> ATS on all devices. Using 'pci=noats' with devices that require PCI ATS
>>>> can cause a conflict, leading to boot failure, especially if the device
>>>> is a graphics device.
>>>>
>>>> To prevent this issue, check PCI ATS support before enumerating the
>> IOMMU
>>>> devices. If any device requires PCI ATS, but PCI ATS is disabled by
>>>> 'pci=noats', switch the IOMMU to operate in legacy mode to ensure
>>>> successful booting.
>>>
>>> I guess the reason for switching to legacy mode is that the platform
>>> automatically enables ATS in this mode, as the comment says in
>>> dmar_ats_supported(). This should be explained; otherwise it's unclear
>>> why switching the mode can make ATS work for those devices.
>>
>> Not 'automatically enable ATS,' but hardware provides something that is
>> equivalent to PCI ATS. The ATS capability on the device is still
>> disabled. That's the reason why such device must be an SOC-integrated
>> one.
>
> Well, does that equivalent use the PCI ATS protocol at all (i.e. an
> untranslated request followed by a translated request based on the
> device TLB)?
>
> If yes it's still ATS under the hood.
>
> If not could you elaborate how it works in PCI world?

I'm not a hardware expert, so I can't provide specific details. :-)

Anyway, from the Linux side, if 'pci=noats' is used on a Meteor Lake
platform, the 'lspci' tool shows that PCI ATS is disabled on the device:

  # dmesg
  [...]
  [ 2.419806] pci 0000:00:02.0: DMAR: PCI/ATS not supported, system working in IOMMU legacy mode
  [...]

  # lspci -s 0000:00:02.0 -vv
  00:02.0 VGA compatible controller: Intel Corporation Meteor Lake-M [Intel Graphics] (prog-if 00 [VGA controller])
  [...]
          Capabilities: [200 v1] Address Translation Service (ATS)
                  ATSCap: Invalidate Queue Depth: 00
                  ATSCtl: Enable-, Smallest Translation Unit: 00
  [...]

As for how hardware works, it appears to be transparent to the software.
>
>>
>>>
>>> But then doesn't it break the meaning of 'pci=noats' which means
>>> disabling ATS physically? It's described as "do not use PCIe ATS and
>>> IOMMU device IOTLB" in kernel doc, which is not equivalent to
>>> "leave PCIe ATS to be managed by HW".
>>
>> Therefore, PCI ATS is not used and the semantics of pci=noats are not
>> broken.
>
> I'm not sure the point of noats is to just disable the PCI capability
> while allowing the underlying hw to continue sending ATS protocol...
>
>>
>>> and why would one want to use 'pci=noats' on a platform which
>>> requires ats?
>>
>> We don't recommend that users disable ATS on a platform which has devices
>> that rely on it. But nothing can prevent users from doing so. I am not
>> sure why it is needed. One possible reason that I can think of is about
>> security. Sometimes, people don't trust ATS because it allows devices to
>> access the memory with translated requests directly without any
>> permission check on the IOMMU end.
>>
>
> but this doesn't make sense. If the user doesn't trust ATS and deliberately
> wants to disable ats then it should be followed and whatever usage
> requiring ATS is then broken. The user should decide which is more
> favored between security vs. usage to make the right call.

Yes. That's just one reason that I could think of.

Anyway, for a client platform, we should avoid any boot failure no
matter what kernel parameters the users are using.

Instead, perhaps we could emit a big fat warning to the user, informing
them that a device relies on ATS for functionality, so 'pci=noats'
might not take full effect.
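
To make the idea concrete, here is a rough userspace sketch of the decision
described above: fall back to legacy mode and warn when 'pci=noats' is set and
a device has ATC_REQUIRED. It is not the actual patch and not kernel code; the
types and the choose_mode() helper are hypothetical names used only for
illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not the kernel's real types. */
enum iommu_mode { MODE_SCALABLE, MODE_LEGACY };

struct satc_dev {
	const char *name;	/* PCI address, e.g. "0000:00:02.0" */
	bool atc_required;	/* ATC_REQUIRED bit from the ACPI/SATC Flags field */
};

/*
 * Pick the IOMMU mode before the devices are enumerated.
 * If ATS was disabled with 'pci=noats' and at least one device requires
 * ATS, print a warning and fall back to legacy mode so the boot succeeds.
 */
enum iommu_mode choose_mode(const struct satc_dev *devs, size_t n, bool noats)
{
	if (!noats)
		return MODE_SCALABLE;

	for (size_t i = 0; i < n; i++) {
		if (devs[i].atc_required) {
			fprintf(stderr,
				"WARNING: %s requires ATS but 'pci=noats' is set; "
				"using IOMMU legacy mode\n", devs[i].name);
			return MODE_LEGACY;
		}
	}

	return MODE_SCALABLE;
}
```

With one device marked ATC_REQUIRED and noats set, choose_mode() returns
MODE_LEGACY and prints the warning; in every other case it returns
MODE_SCALABLE.
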
Thanks,
baolu