lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BN9PR11MB527674172BBA9BDC49A004D08C7D2@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Tue, 30 Jan 2024 08:43:58 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Ethan Zhao <haifeng.zhao@...ux.intel.com>, "Liu, Yi L"
	<yi.l.liu@...el.com>, "baolu.lu@...ux.intel.com" <baolu.lu@...ux.intel.com>,
	"bhelgaas@...gle.com" <bhelgaas@...gle.com>, "robin.murphy@....com"
	<robin.murphy@....com>, "jgg@...pe.ca" <jgg@...pe.ca>
CC: "dwmw2@...radead.org" <dwmw2@...radead.org>, "will@...nel.org"
	<will@...nel.org>, "lukas@...ner.de" <lukas@...ner.de>,
	"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>
Subject: RE: [PATCH v12 5/5] iommu/vt-d: improve ITE fault handling if target
 device isn't present

> From: Ethan Zhao <haifeng.zhao@...ux.intel.com>
> Sent: Tuesday, January 30, 2024 4:16 PM
> 
> On 1/30/2024 2:22 PM, Tian, Kevin wrote:
> > Here we need consider two situations.
> >
> > One is that the device is not bound to a driver or bound to a driver
> > which doesn't do active work to the device when it's removed. In
> > that case one may observe the timeout situation only in the removal
> > path as the stack dump in your patch02 shows.
> 
> When iommu_bus_notifier() got called for hotplug removal cases to
> flush devTLB (ATS invalidation), driver was already unloaded.
> whatever safe removal or surprise removal. so in theory no active
> driver working there.
> 
> pciehp_ist()
>   pciehp_disable_slot()
>    remove_board()
>     pciehp_unconfigure_device()
>      pci_stop_and_remove_bus_device()
>       pci_stop_bus_device()--->here unload driver
>       pci_remove_bus_device()->here qi_flush_dev_iotlb() got called.

yes, so patch02 can fix this case.

> 
> >
> > patch02 can fix that case by checking whether the device is present
> > to skip sending the invalidation requests. So the logic being discussed
> > here doesn't matter.
> >
> > The 2nd situation is more tricky. The device might be bound to
> > a driver which is doing active work to the device with in-fly
> > ATS invalidation requests. In this case in-fly requests must be aborted
> > before the driver can be detached from the removed device. Conceptually
> > a device is removed from the bus only after its driver is detached.
> 
> Some tricky situations:
> 
> 1. The ATS invalidation request is issued from driver driver, while it is
> in handling, device is removed. this momment, the device instance still
> exists in the bus list. yes, if searching it by BDF, could get it.

it's searchable between the point where the device is removed and the
point where the driver is unloaded:

        CPU0                                CPU1
  (Driver is active)                    (pciehp handler)
  qi_submit_sync()                      pciehp_ist()
    ...                                   ...
    loop for completion() {               pciehp_unconfigure_device()
      ...                                   pci_dev_set_disconnected()
      if (ITE) {                            ...
        //find pci_dev from sid             pci_remove_bus_device()
        if (pci_dev_is_connected())           device_del()
          break;                                bus_remove_device()
      }                                           device_remove_driver()
      ..                                            //wait for driver unload
    }                                               
    ..
    return;

                                                  BUS_NOTIFY_REMOVED_DEVICE;
                                              list_del(&dev->bus_list);

(I didn’t draw the full calling stack on the right hand side)

> 
> 2. The ATS invalidation request is issued from iommu_bus_notifier()
> for surprise removal reason, as shown in above calltrace, device was
> already removed from bus list. if searching it by BDF, return NULL.
> 
> 3. The ATS invlidation request is issued from iommu_bus_notifier()
> for safe removal, when is in handling, device is removed or link
> is down. also as #2, device was already removed from bus list.
> if searching it by BDF. got NULL.
> ...
> 
> so, searching device by BDF, only works for the ATS invalidation
> request is from device driver.
> 

anything related to bus notifier has been fixed by patch02. 

the remaining logic is really for fixing the race invalidation from
device driver. 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ