lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <42623a73-c288-1c0d-7021-93caff4ffb6f@linux.intel.com>
Date:   Tue, 31 May 2022 11:02:06 +0800
From:   Baolu Lu <baolu.lu@...ux.intel.com>
To:     Jason Gunthorpe <jgg@...dia.com>
Cc:     baolu.lu@...ux.intel.com, Joerg Roedel <joro@...tes.org>,
        Kevin Tian <kevin.tian@...el.com>,
        Ashok Raj <ashok.raj@...el.com>,
        Christoph Hellwig <hch@...radead.org>,
        Will Deacon <will@...nel.org>,
        Robin Murphy <robin.murphy@....com>,
        Liu Yi L <yi.l.liu@...el.com>,
        Jacob jun Pan <jacob.jun.pan@...el.com>,
        iommu@...ts.linux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 01/12] iommu/vt-d: Use iommu_get_domain_for_dev() in
 debugfs

On 2022/5/30 20:14, Jason Gunthorpe wrote:
> On Sun, May 29, 2022 at 01:14:46PM +0800, Baolu Lu wrote:
> 
>>  From 1e87b5df40c6ce9414cdd03988c3b52bfb17af5f Mon Sep 17 00:00:00 2001
>> From: Lu Baolu <baolu.lu@...ux.intel.com>
>> Date: Sun, 29 May 2022 10:18:56 +0800
>> Subject: [PATCH 1/1] iommu/vt-d: debugfs: Remove device_domain_lock usage
>>
>> The domain_translation_struct debugfs node is used to dump static
>> mappings of PCI devices. It potentially races with setting new
>> domains to devices and the iommu_map/unmap() interfaces. The existing
>> code tries to use the global spinlock device_domain_lock to avoid the
>> races, but this is problematical as this lock is only used to protect
>> the device tracking lists of the domains.
>>
>> Instead of using an immature lock to cover up the problem, it's better
>> to explicitly restrict the use of this debugfs node. This also makes
>> device_domain_lock static.
> 
> What does "explicitly restrict" mean?

I originally thought about adding restrictions on this interface to a
document. But obviously, this is a naive idea. :-) I went over the code
again. The races exist in two paths:

1. Dump the page table in use while setting a new page table to the
    device.
2. A high-level page table entry has been marked as non-present, but the
    dumping code has walked down to the low-level tables.

For case 1, we can try to solve it by dumping tables while holding the
group->mutex.

For case 2, it is a bit weird. I tried to add a rwsem lock to make the
iommu_unmap() and dumping tables in debugfs exclusive. This does not
work because debugfs may depend on the DMA of the devices to work. It
seems that what we can do is to allow this race, but when we traverse
the page table in debugfs, we will check the validity of the physical
address retrieved from the page table entry. Then, the worst case is to
print some useless information.

The real code looks like this:

 From 3feb0727f9d7095729ef75ab1967270045b3a38c Mon Sep 17 00:00:00 2001
From: Lu Baolu <baolu.lu@...ux.intel.com>
Date: Sun, 29 May 2022 10:18:56 +0800
Subject: [PATCH 1/1] iommu/vt-d: debugfs: Remove device_domain_lock usage

The domain_translation_struct debugfs node is used to dump the DMAR page
tables for the PCI devices. It potentially races with setting domains to
devices and the iommu_unmap() interface. The existing code uses a global
spinlock device_domain_lock to avoid the races, but this is problematical
as this lock is only used to protect the device tracking lists of each
domain.

This replaces device_domain_lock with group->mutex to protect the traverse
of the page tables from setting a new domain and always check the physical
address retrieved from the page table entry before traversing to the next-
level page table.

As a cleanup, this also makes device_domain_lock static.

Signed-off-by: Lu Baolu <baolu.lu@...ux.intel.com>
---
  drivers/iommu/intel/debugfs.c | 42 ++++++++++++++++++++++-------------
  drivers/iommu/intel/iommu.c   |  2 +-
  drivers/iommu/intel/iommu.h   |  1 -
  3 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
index d927ef10641b..e6f4835b8d9f 100644
--- a/drivers/iommu/intel/debugfs.c
+++ b/drivers/iommu/intel/debugfs.c
@@ -333,25 +333,28 @@ static void pgtable_walk_level(struct seq_file *m, 
struct dma_pte *pde,
  			continue;

  		path[level] = pde->val;
-		if (dma_pte_superpage(pde) || level == 1)
+		if (dma_pte_superpage(pde) || level == 1) {
  			dump_page_info(m, start, path);
-		else
-			pgtable_walk_level(m, phys_to_virt(dma_pte_addr(pde)),
+		} else {
+			unsigned long phys_addr;
+
+			phys_addr = (unsigned long)dma_pte_addr(pde);
+			if (!pfn_valid(__phys_to_pfn(phys_addr)))
+				break;
+			pgtable_walk_level(m, phys_to_virt(phys_addr),
  					   level - 1, start, path);
+		}
  		path[level] = 0;
  	}
  }

-static int show_device_domain_translation(struct device *dev, void *data)
+static int __show_device_domain_translation(struct device *dev, void *data)
  {
  	struct device_domain_info *info = dev_iommu_priv_get(dev);
  	struct dmar_domain *domain = info->domain;
  	struct seq_file *m = data;
  	u64 path[6] = { 0 };

-	if (!domain)
-		return 0;
-
  	seq_printf(m, "Device %s @0x%llx\n", dev_name(dev),
  		   (u64)virt_to_phys(domain->pgd));
  	seq_puts(m, 
"IOVA_PFN\t\tPML5E\t\t\tPML4E\t\t\tPDPE\t\t\tPDE\t\t\tPTE\n");
@@ -359,20 +362,27 @@ static int show_device_domain_translation(struct 
device *dev, void *data)
  	pgtable_walk_level(m, domain->pgd, domain->agaw + 2, 0, path);
  	seq_putc(m, '\n');

-	return 0;
+	return 1;
  }

-static int domain_translation_struct_show(struct seq_file *m, void *unused)
+static int show_device_domain_translation(struct device *dev, void *data)
  {
-	unsigned long flags;
-	int ret;
+	struct iommu_group *group;

-	spin_lock_irqsave(&device_domain_lock, flags);
-	ret = bus_for_each_dev(&pci_bus_type, NULL, m,
-			       show_device_domain_translation);
-	spin_unlock_irqrestore(&device_domain_lock, flags);
+	group = iommu_group_get(dev);
+	if (group) {
+		iommu_group_for_each_dev(group, data,
+					 __show_device_domain_translation);
+		iommu_group_put(group);
+	}

-	return ret;
+	return 0;
+}
+
+static int domain_translation_struct_show(struct seq_file *m, void *unused)
+{
+	return bus_for_each_dev(&pci_bus_type, NULL, m,
+				show_device_domain_translation);
  }
  DEFINE_SHOW_ATTRIBUTE(domain_translation_struct);

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 1af4b6562266..cacae8bdaa65 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -314,7 +314,7 @@ static int iommu_skip_te_disable;
  #define IDENTMAP_GFX		2
  #define IDENTMAP_AZALIA		4

-DEFINE_SPINLOCK(device_domain_lock);
+static DEFINE_SPINLOCK(device_domain_lock);
  static LIST_HEAD(device_domain_list);

  /*
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index a22adfbdf870..8a6d64d726c0 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -480,7 +480,6 @@ enum {
  #define VTD_FLAG_SVM_CAPABLE		(1 << 2)

  extern int intel_iommu_sm;
-extern spinlock_t device_domain_lock;

  #define sm_supported(iommu)	(intel_iommu_sm && ecap_smts((iommu)->ecap))
  #define pasid_supported(iommu)	(sm_supported(iommu) &&

Best regards,
baolu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ