[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <36ce098c-4862-4490-a3ed-3226c08aa6d1@linux.intel.com>
Date: Wed, 30 Oct 2024 16:32:46 +0800
From: Baolu Lu <baolu.lu@...ux.intel.com>
To: Yi Liu <yi.l.liu@...el.com>, Joel Granados <joel.granados@...nel.org>
Cc: baolu.lu@...ux.intel.com, David Woodhouse <dwmw2@...radead.org>,
Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
Robin Murphy <robin.murphy@....com>, Jason Gunthorpe <jgg@...pe.ca>,
Kevin Tian <kevin.tian@...el.com>, Klaus Jensen <its@...elevant.dk>,
linux-kernel@...r.kernel.org, iommu@...ts.linux.dev,
Klaus Jensen <k.jensen@...sung.com>
Subject: Re: [PATCH v4 2/5] iommu/vt-d: Remove the pasid present check in
prq_event_thread
On 2024/10/30 13:51, Yi Liu wrote:
> On 2024/10/29 13:39, Baolu Lu wrote:
>> On 2024/10/29 13:13, Yi Liu wrote:
>>> On 2024/10/29 11:12, Baolu Lu wrote:
>>>> On 2024/10/28 18:24, Joel Granados wrote:
>>>>> On Mon, Oct 28, 2024 at 03:50:46PM +0800, Yi Liu wrote:
>>>>>> On 2024/10/16 05:08, Joel Granados wrote:
>>>>>>> From: Klaus Jensen<k.jensen@...sung.com>
>>>>>>>
>>>>>>> PASID is not strictly needed when handling a PRQ event; remove
>>>>>>> the check
>>>>>>> for the pasid present bit in the request. This change was not
>>>>>>> included
>>>>>>> in the creation of prq.c to emphasize the change in capability
>>>>>>> checks
>>>>>>> when handing PRQ events.
>>>>>>>
>>>>>>> Signed-off-by: Klaus Jensen<k.jensen@...sung.com>
>>>>>>> Reviewed-by: Kevin Tian<kevin.tian@...el.com>
>>>>>>> Signed-off-by: Joel Granados<joel.granados@...nel.org>
>>>>>> looks like the PRQ draining is missed for the PRI usage. When a pasid
>>>>>> entry is destroyed, it might need to add helper similar to the
>>>>>> intel_drain_pasid_prq() to drain PRQ for the non-pasid usage.
>>>>> These types of user space PRIs (non-pasid, non-svm) are created by
>>>>> making use of iommufd_hwpt_replace_device. Which adds an entry to the
>>>>> pasid_array indexed on IOMMU_NO_PASID (0U) via the following path:
>>>>>
>>>>> iommufd_hwpt_replace_device
>>>>> -> iommufd_fault_domain_repalce_dev
>>>>> -> __fault_domain_replace_dev
>>>>> -> iommu_replace_group_handle
>>>> -> __iommu_group_set_domain
>>>> -> intel_iommu_attach_device
>>>> -> device_block_translation
>>>> -> intel_pasid_tear_down_entry(IOMMU_NO_PASID)
>>>>
>>>> Here a domain is removed from the pasid entry, hence we need to flush
>>>> all page requests that are pending in the IOMMU page request queue or
>>>> the PCI fabric.
>>>>
>>>>> -> xa_reserve(&group->pasid_array, IOMMU_NO_PASID,
>>>>> GFP_KERNEL);
>>>>>
>>>>> It is my understanding that this will provide the needed relation
>>>>> between the device and the prq in such a way that when
>>>>> remove_dev_pasid
>>>>> is called, intel_iommu_drain_pasid_prq will be called with the
>>>>> appropriate pasid value set to IOMMU_NO_PASID. Please correct me if
>>>>> I'm
>>>>> mistaken.
>>>>
>>>> Removing a domain from a RID and a PASID are different paths.
>>>> Previously, this IOMMU driver only supported page requests on PASID
>>>> (non-IOMMU_NO_PASID). It is acceptable that it does not flush the
>>>> PRQ in
>>>> the domain-removing RID path.
>>>>
>>>> With the changes made in this series, the driver now supports page
>>>> requests for RID. It should also flush the PRQ when removing a domain
>>>> from a PASID entry for IOMMU_NO_PASID.
>>>>
>>>>>
>>>>> Does this answer your question? Do you have a specific path that
>>>>> you are
>>>>> looking at where a specific non-pasid drain is needed?
>>>>
>>>> Perhaps we can simply add below change.
>>>>
>>>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>>>> index e860bc9439a2..a24a42649621 100644
>>>> --- a/drivers/iommu/intel/iommu.c
>>>> +++ b/drivers/iommu/intel/iommu.c
>>>> @@ -4283,7 +4283,6 @@ static void
>>>> intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
>>>> intel_iommu_debugfs_remove_dev_pasid(dev_pasid);
>>>> kfree(dev_pasid);
>>>> intel_pasid_tear_down_entry(iommu, dev, pasid, false);
>>>> - intel_drain_pasid_prq(dev, pasid);
>>>> }
>>>>
>>>> static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
>>>> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
>>>> index 2e5fa0a23299..8639f3eb4264 100644
>>>> --- a/drivers/iommu/intel/pasid.c
>>>> +++ b/drivers/iommu/intel/pasid.c
>>>> @@ -265,6 +265,7 @@ void intel_pasid_tear_down_entry(struct
>>>> intel_iommu *iommu, struct device *dev,
>>>> iommu->flush.flush_iotlb(iommu, did, 0, 0,
>>>> DMA_TLB_DSI_FLUSH);
>>>>
>>>> devtlb_invalidation_with_pasid(iommu, dev, pasid);
>>>> + intel_drain_pasid_prq(dev, pasid);
>>>> }
>>>>
>>>> /*
>>>> diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
>>>> index 078d1e32a24e..ff88f31053d1 100644
>>>> --- a/drivers/iommu/intel/svm.c
>>>> +++ b/drivers/iommu/intel/svm.c
>>>> @@ -304,9 +304,6 @@ void intel_drain_pasid_prq(struct device *dev,
>>>> u32 pasid)
>>>> int qdep;
>>>>
>>>> info = dev_iommu_priv_get(dev);
>>>> - if (WARN_ON(!info || !dev_is_pci(dev)))
>>>> - return;
>>>> -
>>>> if (!info->pri_enabled)
>>>> return;
>>>>
>>>> Generally, intel_drain_pasid_prq() should be called if
>>>>
>>>> - a translation is removed from a pasid entry; and
>>>> - PRI on this device is enabled.
>>>
>>> If the @pasid==IOMMU_NO_PASID, PRQ drain should use the iotlb
>>> invalidation
>>> and dev-tlb invalidation descriptors. So extra code change is needed in
>>> intel_drain_pasid_prq(). Or perhaps it's better to have a separate
>>> helper
>>> for draining prq for non-pasid case.
>>
>> According to VT-d spec, section 7.10, "Software Steps to Drain Page
>> Requests & Responses", we can simply replace p_iotlb_inv_dsc and
>> p_dev_tlb_inv_dsc with iotlb_inv_dsc and dev_tlb_inv_dsc. Any
>> significant negative performance impact?
>
> It's not about performance impact. My point is to use iotlb_inv_dsc and
> dev_tlb_inv_dsc for the @pasid==IOMMU_NO_PASID case. The existing
> intel_drain_pasid_prq() only uses p_iotlb_inv_dsc and p_dev_tlb_inv_dsc.
> The way you described in above reply works. But it needs to add if/else
> to use the correct invalidation descriptor. Since the descriptor
> composition has several lines, so just an ask if it's better to have a
> separate helper. :)
The spec says (7.10 Software Steps to Drain Page Requests & Responses):
"
Submit an IOTLB invalidate descriptor (iotlb_inv_dsc or p_iotlb_inv_dsc)
followed by DeviceTLB invalidation descriptor (dev_tlb_inv_dsc or
p_dev_tlb_inv_dsc) targeting the endpoint device. These invalidation
requests can be of any granularity. Per the ordering requirements
described in Section 7.8, older page group responses issued by software
to the endpoint device before step (a) are guaranteed to be received by
the endpoint before the endpoint receives this Device-TLB invalidation
request.
"
The purpose of the cache invalidation requests sent to the device is to
leverage PCI ordering requirements to ensure that all page group
responses are received by the device before it processes the TLB
invalidation request. Therefore, the specification doesn't mandate the
type and granularity of invalidation requests, as long as they are
valid.
Given that the Page Request Interface (PRI) is only supported when the
IOMMU operates in scalable mode, I don't believe we need to make any
changes to the invalidation types at this time.
--
baolu
Powered by blists - more mailing lists