Date:   Mon, 6 Feb 2023 03:48:52 +0000
From:   "Tian, Kevin" <kevin.tian@...el.com>
To:     Baolu Lu <baolu.lu@...ux.intel.com>,
        Jacob Pan <jacob.jun.pan@...ux.intel.com>,
        LKML <linux-kernel@...r.kernel.org>,
        "iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
        Joerg Roedel <joro@...tes.org>
CC:     David Woodhouse <dwmw2@...radead.org>,
        "Raj, Ashok" <ashok.raj@...el.com>,
        "Liu, Yi L" <yi.l.liu@...el.com>,
        "stable@...r.kernel.org" <stable@...r.kernel.org>,
        "Kumar, Sanjay K" <sanjay.k.kumar@...el.com>,
        "Robin Murphy" <robin.murphy@....com>
Subject: RE: [PATCH] iommu/vt-d: Avoid superfluous IOTLB tracking in lazy mode

> From: Baolu Lu <baolu.lu@...ux.intel.com>
> Sent: Saturday, February 4, 2023 2:32 PM
> 
> On 2023/2/4 7:04, Jacob Pan wrote:
> > Intel IOMMU driver implements IOTLB flush queue with domain selective
> > or PASID selective invalidations. In this case there's no need to track
> > IOVA page range and sync IOTLBs, which may cause significant
> > performance hit.
> 
> [Add cc Robin]
> 
> If I understand this patch correctly, this might be caused by below
> helper:
> 
> /**
>  * iommu_iotlb_gather_add_page - Gather for page-based TLB invalidation
>  * @domain: IOMMU domain to be invalidated
>  * @gather: TLB gather data
>  * @iova: start of page to invalidate
>  * @size: size of page to invalidate
>  *
>  * Helper for IOMMU drivers to build invalidation commands based on individual
>  * pages, or with page size/table level hints which cannot be gathered if they
>  * differ.
>  */
> static inline void iommu_iotlb_gather_add_page(struct iommu_domain *domain,
>                                                struct iommu_iotlb_gather *gather,
>                                                unsigned long iova, size_t size)
> {
>         /*
>          * If the new page is disjoint from the current range or is mapped at
>          * a different granularity, then sync the TLB so that the gather
>          * structure can be rewritten.
>          */
>         if ((gather->pgsize && gather->pgsize != size) ||
>             iommu_iotlb_gather_is_disjoint(gather, iova, size))
>                 iommu_iotlb_sync(domain, gather);
> 
>         gather->pgsize = size;
>         iommu_iotlb_gather_add_range(gather, iova, size);
> }
> 
> As the comments for iommu_iotlb_gather_is_disjoint() says,
> 
> "...For many IOMMUs, flushing the IOMMU in this case is better
>   than merging the two, which might lead to unnecessary invalidations.
>   ..."
> 
> So, perhaps the right fix for this performance issue is to add
> 
> 	if (!gather->queued)
> 
> in iommu_iotlb_gather_add_page() or iommu_iotlb_gather_is_disjoint()?
> It should benefit other arch's as well.
> 

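For reference, one reading of that suggestion, as a minimal sketch against
the helper quoted above (whether the range tracking at the end should also
be skipped, and whether the check belongs here or inside
iommu_iotlb_gather_is_disjoint(), is left open):

static inline void iommu_iotlb_gather_add_page(struct iommu_domain *domain,
					       struct iommu_iotlb_gather *gather,
					       unsigned long iova, size_t size)
{
	/*
	 * Sketch only: skip the disjoint-range sync when the flush is
	 * already queued (lazy mode), so unmap no longer triggers
	 * synchronous IOTLB invalidations for disjoint pages.
	 */
	if (!gather->queued &&
	    ((gather->pgsize && gather->pgsize != size) ||
	     iommu_iotlb_gather_is_disjoint(gather, iova, size)))
		iommu_iotlb_sync(domain, gather);

	gather->pgsize = size;
	iommu_iotlb_gather_add_range(gather, iova, size);
}
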
There are only two callers of this helper: intel and arm-smmu-v3.

It looks like the other drivers just implement direct flush via
io_pgtable_tlb_add_page(), and their unmap callbacks typically do:

if (!iommu_iotlb_gather_queued(gather))
	io_pgtable_tlb_add_page();

From this angle it follows the same policy as Jacob's patch, i.e. if the
flush is already queued there is no need to go further into the direct-flush
optimization.
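
For example, a leaf-unmap path in an io-pgtable based driver follows roughly
this shape (an illustrative sketch, not any specific driver's code; the
function name and the iop/iova/size parameters stand in for the driver's own
state, and <linux/io-pgtable.h> is assumed):

/* Illustrative sketch of the common unmap pattern in io-pgtable drivers. */
static size_t example_unmap_leaf(struct io_pgtable *iop,
				 struct iommu_iotlb_gather *gather,
				 unsigned long iova, size_t size)
{
	/* ... clear the leaf PTE covering [iova, iova + size) ... */

	/*
	 * Only feed a per-page invalidation into the TLB gather when the
	 * flush is not already queued; in lazy mode the flush queue
	 * handles the invalidation later.
	 */
	if (!iommu_iotlb_gather_queued(gather))
		io_pgtable_tlb_add_page(iop, gather, iova, size);

	return size;
}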
