Message-ID: <2ada3a4f-3c2a-f46f-4e39-5c60912cc386@linux.intel.com>
Date: Fri, 10 Jun 2022 14:05:37 +0800
From: Baolu Lu <baolu.lu@...ux.intel.com>
To: "Raj, Ashok" <ashok.raj@...el.com>
Cc: baolu.lu@...ux.intel.com, Joerg Roedel <joro@...tes.org>,
Jason Gunthorpe <jgg@...dia.com>,
Robin Murphy <robin.murphy@....com>,
Kevin Tian <kevin.tian@...el.com>,
Christoph Hellwig <hch@...radead.org>,
Will Deacon <will@...nel.org>,
Joao Martins <joao.m.martins@...cle.com>,
iommu@...ts.linux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCHES 1/2] iommu: Add RCU-protected page free support
On 2022/6/10 01:06, Raj, Ashok wrote:
> On Thu, Jun 09, 2022 at 03:08:10PM +0800, Lu Baolu wrote:
>> The IOMMU page tables are updated using iommu_map/unmap() interfaces.
>> Currently, there is no mandatory requirement for drivers to use locks
>> to ensure concurrent updates to page tables, because it's assumed that
>> overlapping IOVA ranges do not have concurrent updates. Therefore the
>> IOMMU drivers only need to take care of concurrent updates to level
>> page table entries.
>
> The last part doesn't read well..
> s/updates to level page table entries/ updates to page-table entries at the
> same level
>
>>
>> But enabling new features challenges this assumption. For example, the
>> hardware assisted dirty page tracking feature requires scanning page
>> tables in interfaces other than mapping and unmapping. This might result
>> in a use-after-free scenario in which a level page table has been freed
>> by the unmap() interface, while another thread is scanning the next level
>> page table.
>>
>> This adds RCU-protected page free support so that the pages are really
>> freed and reused after a RCU grace period. Hence, the page tables are
>> safe for scanning within a rcu_read_lock critical region. Considering
>> that scanning the page table is a rare case, this also adds a domain
>> flag and the RCU-protected page free is only used when this flat is set.
>
> s/flat/flag
Above updated. Thank you!
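To make the intent concrete, the reader side I have in mind is simply an
RCU read-side critical section around the page table walk. The walker
below is a made-up placeholder for a driver helper, not part of this
patch:

#include <linux/iommu.h>
#include <linux/rcupdate.h>

/*
 * Hypothetical dirty-bit scan.  With the RCU-protected free, a table
 * page retired by a concurrent iommu_unmap() stays readable (although
 * stale) until the grace period ends, so the walk cannot hit a
 * use-after-free.
 */
static void scan_domain_dirty_bits(struct iommu_domain *domain)
{
	rcu_read_lock();
	/* walk_pgtable_dirty_bits() is a made-up driver helper */
	walk_pgtable_dirty_bits(domain);
	rcu_read_unlock();
}
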
>>
>> Signed-off-by: Lu Baolu <baolu.lu@...ux.intel.com>
>> ---
>> include/linux/iommu.h | 9 +++++++++
>> drivers/iommu/iommu.c | 23 +++++++++++++++++++++++
>> 2 files changed, 32 insertions(+)
>>
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index 5e1afe169549..6f68eabb8567 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -95,6 +95,7 @@ struct iommu_domain {
>> void *handler_token;
>> struct iommu_domain_geometry geometry;
>> struct iommu_dma_cookie *iova_cookie;
>> + unsigned long concurrent_traversal:1;
>
> Does this need to be a bitfield? Even though you are needing just one bit
> now, you can probably make have maskbits?
>
As discussed in another reply, I am about to drop the driver opt-in and
wrap this helper around the iommu_iotlb_gather instead.
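
Roughly along these lines (all names below are placeholders for the next
version, nothing final):

#include <linux/iommu.h>
#include <linux/mm.h>
#include <linux/rcupdate.h>

/* RCU callback: actually frees a retired page-table page. */
static void pgtbl_page_free_rcu(struct rcu_head *rcu)
{
	struct page *page = container_of(rcu, struct page, rcu_head);

	__free_pages(page, 0);
}

/*
 * Placeholder sketch only: instead of a per-domain opt-in flag, take
 * the table pages that the unmap path queued on the gather's freelist
 * and defer the free to an RCU grace period, so that a concurrent
 * walker inside an RCU read-side critical section never sees a freed
 * page.
 */
void iommu_put_gathered_pages(struct iommu_iotlb_gather *gather)
{
	struct page *page, *next;

	list_for_each_entry_safe(page, next, &gather->freelist, lru) {
		list_del(&page->lru);
		call_rcu(&page->rcu_head, pgtbl_page_free_rcu);
	}
}
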
>
>> };
>>
>> static inline bool iommu_is_dma_domain(struct iommu_domain *domain)
>> @@ -657,6 +658,12 @@ static inline void dev_iommu_priv_set(struct device *dev, void *priv)
>> dev->iommu->priv = priv;
>> }
>>
>> +static inline void domain_set_concurrent_traversal(struct iommu_domain *domain,
>> + bool value)
>> +{
>> + domain->concurrent_traversal = value;
>> +}
>> +
>> int iommu_probe_device(struct device *dev);
>> void iommu_release_device(struct device *dev);
>>
>> @@ -677,6 +684,8 @@ int iommu_group_claim_dma_owner(struct iommu_group *group, void *owner);
>> void iommu_group_release_dma_owner(struct iommu_group *group);
>> bool iommu_group_dma_owner_claimed(struct iommu_group *group);
>>
>> +void iommu_free_pgtbl_pages(struct iommu_domain *domain,
>> + struct list_head *pages);
>> #else /* CONFIG_IOMMU_API */
>>
>> struct iommu_ops {};
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 847ad47a2dfd..ceeb97ebe3e2 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -3252,3 +3252,26 @@ bool iommu_group_dma_owner_claimed(struct iommu_group *group)
>> return user;
>> }
>> EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed);
>> +
>> +static void pgtble_page_free_rcu(struct rcu_head *rcu)
>
> maybe the names can be consistent? pgtble_ vs pgtbl below.
>
> vote to drop the 'e' :-)
Updated.
>
>> +{
>> + struct page *page = container_of(rcu, struct page, rcu_head);
>> +
>> + __free_pages(page, 0);
>> +}
>> +
>> +void iommu_free_pgtbl_pages(struct iommu_domain *domain,
>> + struct list_head *pages)
>> +{
>> + struct page *page, *next;
>> +
>> + if (!domain->concurrent_traversal) {
>> + put_pages_list(pages);
>> + return;
>> + }
>> +
>> + list_for_each_entry_safe(page, next, pages, lru) {
>> + list_del(&page->lru);
>> + call_rcu(&page->rcu_head, pgtble_page_free_rcu);
>> + }
>> +}
>> --
>> 2.25.1
>>
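
For reference, the caller side I had in mind for this version is
something like the sketch below; the unmap helper is made up purely for
illustration:

#include <linux/iommu.h>
#include <linux/list.h>

/*
 * Illustration only: an unmap path collects the page-table pages it
 * retires on a local list and hands them to the core, which frees them
 * immediately or, when concurrent traversal is enabled, after an RCU
 * grace period via iommu_free_pgtbl_pages().
 */
static size_t example_unmap(struct iommu_domain *domain,
			    unsigned long iova, size_t size)
{
	LIST_HEAD(freelist);
	size_t unmapped;

	/* example_clear_mappings() is a made-up driver helper */
	unmapped = example_clear_mappings(domain, iova, size, &freelist);
	iommu_free_pgtbl_pages(domain, &freelist);

	return unmapped;
}
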
Best regards,
baolu