Message-Id: <20231228001646.587653-6-haifeng.zhao@linux.intel.com>
Date: Wed, 27 Dec 2023 19:16:46 -0500
From: Ethan Zhao <haifeng.zhao@...ux.intel.com>
To: bhelgaas@...gle.com,
	baolu.lu@...ux.intel.com,
	dwmw2@...radead.org,
	will@...nel.org,
	robin.murphy@....com,
	lukas@...ner.de
Cc: linux-pci@...r.kernel.org,
	iommu@...ts.linux.dev,
	linux-kernel@...r.kernel.org
Subject: [RFC PATCH v9 5/5] iommu/vt-d: don't loop for timeout ATS Invalidation request forever

When an ATS Invalidation request times out, qi_submit_sync() restarts
and retries the request indefinitely until it completes. This blocks
other invalidation threads, such as the fq_timer, from issuing their own
invalidation requests and causes a system lockup like the following:

[exception RIP: native_queued_spin_lock_slowpath+92]
RIP: ffffffffa9d1025c RSP: ffffb202f268cdc8 RFLAGS: 00000002
RAX: 0000000000000101 RBX: ffffffffab36c2a0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffab36c2a0
RBP: ffffffffab36c2a0 R8: 0000000000000001 R9: 0000000000000000
R10: 0000000000000010 R11: 0000000000000018 R12: 0000000000000000
R13: 0000000000000004 R14: ffff9e10d71b1c88 R15: ffff9e10d71b1980
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#12 [ffffb202f268cdc8] native_queued_spin_lock_slowpath at ffffffffa9d1025c
#13 [ffffb202f268cdc8] do_raw_spin_lock at ffffffffa9d121f1
#14 [ffffb202f268cdd8] _raw_spin_lock_irqsave at ffffffffaa51795b
#15 [ffffb202f268cdf8] iommu_flush_dev_iotlb at ffffffffaa20df48
#16 [ffffb202f268ce28] iommu_flush_iova at ffffffffaa20e182
#17 [ffffb202f268ce60] iova_domain_flush at ffffffffaa220e27
#18 [ffffb202f268ce70] fq_flush_timeout at ffffffffaa221c9d
#19 [ffffb202f268cea8] call_timer_fn at ffffffffa9d46661
#20 [ffffb202f268cf08] run_timer_softirq at ffffffffa9d47933
#21 [ffffb202f268cf98] __softirqentry_text_start at ffffffffaa8000e0
#22 [ffffb202f268cff0] asm_call_sysvec_on_stack at ffffffffaa60114f
--- ---
(for the rest of the exception trace, see the hotplug case of an
ATS-capable device)

If an endpoint device simply does not respond to the ATS Invalidation
request, but is not gone, it can bring down the whole system. To avoid
this, don't retry a timed-out ATS Invalidation request forever.

Signed-off-by: Ethan Zhao <haifeng.zhao@...ux.intel.com>
---
 drivers/iommu/intel/dmar.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
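
For illustration only, the retry decision this change introduces can be
sketched as a small user-space C program. Everything named sketch_* is
hypothetical and is not kernel code: the enum values are placeholders,
not the real QI descriptor type encodings, and the "device" is simulated
as one that never answers, so every submission returns -EAGAIN. The
max_tries bound exists only so the simulation terminates.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for descriptor types; values are illustrative. */
enum sketch_qi_type {
	SKETCH_QI_IOTLB,	/* some non-ATS invalidation */
	SKETCH_QI_DIOTLB,	/* device-TLB (ATS) invalidation */
	SKETCH_QI_DEIOTLB,	/* second ATS invalidation type in the patch */
};

/* Same shape as the patched condition: restart on -EAGAIN, except for ATS. */
static bool sketch_should_restart(int rc, enum sketch_qi_type type)
{
	return rc == -EAGAIN &&
	       type != SKETCH_QI_DIOTLB && type != SKETCH_QI_DEIOTLB;
}

/*
 * Simulated submission to a device that never responds. Returns the last
 * error code and stores the number of attempts made in *attempts.
 */
static int sketch_submit(enum sketch_qi_type type, int max_tries, int *attempts)
{
	int rc;

	assert(max_tries > 0);
	*attempts = 0;
	do {
		rc = -EAGAIN;	/* simulated timeout on every attempt */
		(*attempts)++;
	} while (sketch_should_restart(rc, type) && *attempts < max_tries);

	return rc;
}
```

Under these assumptions, a simulated ATS type gives up after a single
attempt and hands -EAGAIN back to the caller, while a non-ATS type keeps
retrying until the artificial max_tries bound is reached.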

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 76903a8bf963..206ab0b7294f 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1457,7 +1457,7 @@ int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
 	reclaim_free_desc(qi);
 	raw_spin_unlock_irqrestore(&qi->q_lock, flags);
 
-	if (rc == -EAGAIN)
+	if (rc == -EAGAIN && type != QI_DIOTLB_TYPE && type != QI_DEIOTLB_TYPE)
 		goto restart;
 
 	if (iotlb_start_ktime)
-- 
2.31.1

