lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sat, 4 Feb 2023 14:32:08 +0800
From:   Baolu Lu <baolu.lu@...ux.intel.com>
To:     Jacob Pan <jacob.jun.pan@...ux.intel.com>,
        LKML <linux-kernel@...r.kernel.org>, iommu@...ts.linux.dev,
        Joerg Roedel <joro@...tes.org>
Cc:     baolu.lu@...ux.intel.com, David Woodhouse <dwmw2@...radead.org>,
        Raj Ashok <ashok.raj@...el.com>,
        "Tian, Kevin" <kevin.tian@...el.com>, Yi Liu <yi.l.liu@...el.com>,
        stable@...r.kernel.org, Sanjay Kumar <sanjay.k.kumar@...el.com>,
        Robin Murphy <robin.murphy@....com>
Subject: Re: [PATCH] iommu/vt-d: Avoid superfluous IOTLB tracking in lazy mode

On 2023/2/4 7:04, Jacob Pan wrote:
> Intel IOMMU driver implements IOTLB flush queue with domain selective
> or PASID selective invalidations. In this case there's no need to track
> IOVA page range and sync IOTLBs, which may cause significant performance
> hit.

[Add cc Robin]

If I understand this patch correctly, this might be caused by below
helper:

/**
  * iommu_iotlb_gather_add_page - Gather for page-based TLB invalidation
  * @domain: IOMMU domain to be invalidated
  * @gather: TLB gather data
  * @iova: start of page to invalidate
  * @size: size of page to invalidate
  *
  * Helper for IOMMU drivers to build invalidation commands based on 
individual
  * pages, or with page size/table level hints which cannot be gathered 
if they
  * differ.
  */
static inline void iommu_iotlb_gather_add_page(struct iommu_domain *domain,
                                                struct 
iommu_iotlb_gather *gather,
                                                unsigned long iova, 
size_t size)
{
         /*
          * If the new page is disjoint from the current range or is 
mapped at
          * a different granularity, then sync the TLB so that the gather
          * structure can be rewritten.
          */
         if ((gather->pgsize && gather->pgsize != size) ||
             iommu_iotlb_gather_is_disjoint(gather, iova, size))
                 iommu_iotlb_sync(domain, gather);

         gather->pgsize = size;
         iommu_iotlb_gather_add_range(gather, iova, size);
}

As the comments for iommu_iotlb_gather_is_disjoint() says,

"...For many IOMMUs, flushing the IOMMU in this case is better
  than merging the two, which might lead to unnecessary invalidations.
  ..."

So, perhaps the right fix for this performance issue is to add

	if (!gather->queued)

in iommu_iotlb_gather_add_page() or iommu_iotlb_gather_is_disjoint()?
It should benefit other arch's as well.

> 
> This patch adds a check to avoid IOVA gather page and IOTLB sync for
> the lazy path.
> 
> The performance difference on Sapphire Rapids 100Gb NIC is improved by
> the following (as measured by iperf send):

Which test case have you done? Post the real data if you have any.

> 
> w/o this fix~48 Gbits/s. with this fix ~54 Gbits/s
> 
> Cc: <stable@...r.kernel.org>

Again, add a Fixes tag so that people know how far this fix should be
back ported.

> Tested-by: Sanjay Kumar <sanjay.k.kumar@...el.com>
> Signed-off-by: Sanjay Kumar <sanjay.k.kumar@...el.com>
> Signed-off-by: Jacob Pan <jacob.jun.pan@...ux.intel.com>

Best regards,
baolu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ