linux-kernel - [PATCH v11 0/9] iommu/amd: Use 128-bit cmpxchg operation to update DTE

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241114194436.389961-1-suravee.suthikulpanit@amd.com>
Date: Thu, 14 Nov 2024 19:44:27 +0000
From: Suravee Suthikulpanit <suravee.suthikulpanit@....com>
To: <linux-kernel@...r.kernel.org>, <iommu@...ts.linux.dev>
CC: <joro@...tes.org>, <robin.murphy@....com>, <vasant.hegde@....com>,
	<arnd@...db.de>, <ubizjak@...il.com>, <linux-arch@...r.kernel.org>,
	<jgg@...dia.com>, <kevin.tian@...el.com>, <jon.grimm@....com>,
	<santosh.shukla@....com>, <pandoh@...gle.com>, <kumaranand@...gle.com>,
	Suravee Suthikulpanit <suravee.suthikulpanit@....com>
Subject: [PATCH v11 0/9] iommu/amd: Use 128-bit cmpxchg operation to update DTE

This series modifies current implementation to use 128-bit cmpxchg to
update DTE when needed as specified in the AMD I/O Virtualization
Techonology (IOMMU) Specification.

Please note that I have verified with the hardware designer, and they have
confirmed that the IOMMU hardware has always been implemented with 256-bit
read. The next revision of the IOMMU spec will be updated to correctly
describe this part.  Therefore, I have updated the implementation to avoid
unnecessary flushing.

Also, this has been a long series. I would like to thank several folks who
have helped review and provide suggestions along the way :)

Changes in v11:
* Remove the patch to introduce __READ_ONCE() for 128-bit data type
  since all 128-bit DTE access is currently done under per-DTE spin_lock.
  This is to help avoid complicating __unqual_scalar_typeof() further (Per Arnd).

* Patch 4, 6:
  -  Replace spin_lock/unlock() with spin_lock_irqsave/spin_unlock_irqrestore()
     due to the following dmesg warning:

    =====================================================
    WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
    6.12.0-rc5+ #29 Not tainted
    -----------------------------------------------------
    cc1/145047 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
    ffff93737e255cb8 (&dev_data->dte_lock){+.+.}-{2:2}, at: update_dte256+0x5f/0x1b0
   
    and this task is already holding:
    ffff937371782150 (&domain->lock){-.-.}-{2:2}, at: alloc_pte.constprop.0+0x175/0x5e0
    which would create a new lock dependency:
     (&domain->lock){-.-.}-{2:2} -> (&dev_data->dte_lock){+.+.}-{2:2}
   
    but this new dependency connects a HARDIRQ-irq-safe lock:
     (&domain->lock){-.-.}-{2:2}
   
    ... which became HARDIRQ-irq-safe at:
      __lock_acquire+0x399/0xbb0
      lock_acquire.part.0+0xb0/0x250
      __raw_spin_lock_irqsave+0x49/0x90
      amd_iommu_flush_iotlb_all+0x1b/0x50
      fq_flush_iotlb+0x22/0x40
      queue_iova+0x12d/0x150
      __iommu_dma_unmap+0xc2/0x140
      iommu_dma_unmap_page+0x44/0x90
      dma_unmap_page_attrs+0x202/0x240
      nvme_pci_complete_batch+0xb3/0xd0 [nvme]
      nvme_irq+0x7f/0x90 [nvme]
      __handle_irq_event_percpu+0x81/0x270
      handle_irq_event+0x34/0x70
      handle_edge_irq+0x9f/0x240
      __common_interrupt+0x70/0x140
      common_interrupt+0xb2/0xd0
      asm_common_interrupt+0x22/0x40
      cpuidle_enter_state+0x11d/0x540
      cpuidle_enter+0x29/0x40
      cpuidle_idle_call+0x100/0x170
      do_idle+0x96/0xf0
      cpu_startup_entry+0x25/0x30
      start_secondary+0x11d/0x140
      common_startup_64+0x13e/0x141
   
    to a HARDIRQ-irq-unsafe lock:
     (&dev_data->dte_lock){+.+.}-{2:2}
   
    ... which became HARDIRQ-irq-unsafe at:
    ...
      __lock_acquire+0x399/0xbb0
      lock_acquire.part.0+0xb0/0x250
      _raw_spin_lock+0x34/0x80
      update_dte256+0x5f/0x1b0
      set_dte_entry+0x1d1/0x290
      dev_update_dte+0x53/0x120
      attach_device.isra.0+0x120/0x4f0
      amd_iommu_attach_device+0x83/0xd0
      __iommu_attach_device+0x1d/0xd0
      __iommu_device_set_domain+0x5b/0xb0
      __iommu_group_set_domain_internal+0x68/0x120
      iommu_setup_default_domain+0x204/0x350
      iommu_device_register+0x156/0x250
      iommu_init_pci+0x18f/0x570
      amd_iommu_init_pci+0xcb/0x2b0
      state_next+0x7e5/0x8e0
      amd_iommu_init+0x1f/0x80
      pci_iommu_init+0xe/0x40
      do_one_initcall+0x5f/0x2c0
      do_initcalls+0xb9/0x180
      kernel_init_freeable+0x149/0x230
      kernel_init+0x16/0x1c0
      ret_from_fork+0x2d/0x50
      ret_from_fork_asm+0x1a/0x30

* Patch 4:
  - Introduce helper function iommu_atomic128_set() (Per Uros)
  - In write_dte_update128(), remove unnecessary do-while() loop for
    try-cmpxchg and remove __READ_ONCE() since the function is called
    under DTE spin_lock;

* Patch 6: In get_dte256(), remove __READ_ONCE(), since the 128-bit data read
           is inside DTE spin_lock.

* Patch 8: Remove READ/WRITE_ONCE() from amd_iommu_set_dirty_tracking()
           since called inside DTE spin_lock.

v10: https://lore.kernel.org/lkml/20241113120327.5239-1-suravee.suthikulpanit@amd.com/
v9: https://lore.kernel.org/lkml/20241101162304.4688-1-suravee.suthikulpanit@amd.com/
v8: https://lore.kernel.org/lkml/20241031184243.4184-1-suravee.suthikulpanit@amd.com/
v7: https://lore.kernel.org/lkml/20241031091624.4895-1-suravee.suthikulpanit@amd.com/
v6: https://lore.kernel.org/lkml/20241016051756.4317-1-suravee.suthikulpanit@amd.com/
v5: https://lore.kernel.org/lkml/20241007041353.4756-1-suravee.suthikulpanit@amd.com/
v4: https://lore.kernel.org/lkml/20240916171805.324292-1-suravee.suthikulpanit@amd.com/
v3: https://lore.kernel.org/lkml/20240906121308.5013-1-suravee.suthikulpanit@amd.com/
v2: https://lore.kernel.org/lkml/20240829180726.5022-1-suravee.suthikulpanit@amd.com/
v1: https://lore.kernel.org/lkml/20240819161839.4657-1-suravee.suthikulpanit@amd.com/

Thanks,
Suravee

Suravee Suthikulpanit (9):
  iommu/amd: Misc ACPI IVRS debug info clean up
  iommu/amd: Disable AMD IOMMU if CMPXCHG16B feature is not supported
  iommu/amd: Introduce struct ivhd_dte_flags to store persistent DTE
    flags
  iommu/amd: Introduce helper function to update 256-bit DTE
  iommu/amd: Modify set_dte_entry() to use 256-bit DTE helpers
  iommu/amd: Introduce helper function get_dte256()
  iommu/amd: Modify clear_dte_entry() to avoid in-place update
  iommu/amd: Lock DTE before updating the entry with WRITE_ONCE()
  iommu/amd: Remove amd_iommu_apply_erratum_63()

 drivers/iommu/amd/amd_iommu.h       |   4 +-
 drivers/iommu/amd/amd_iommu_types.h |  41 ++-
 drivers/iommu/amd/init.c            | 229 +++++++++--------
 drivers/iommu/amd/iommu.c           | 378 +++++++++++++++++++++-------
 4 files changed, 440 insertions(+), 212 deletions(-)

-- 
2.34.1