lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 16 Dec 2020 18:36:00 +0800
From:   Yong Wu <yong.wu@...iatek.com>
To:     Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
        Robin Murphy <robin.murphy@....com>
CC:     Matthias Brugger <matthias.bgg@...il.com>,
        Krzysztof Kozlowski <krzk@...nel.org>,
        Tomasz Figa <tfiga@...gle.com>,
        <linux-mediatek@...ts.infradead.org>,
        <srv_heupstream@...iatek.com>, <linux-kernel@...r.kernel.org>,
        <linux-arm-kernel@...ts.infradead.org>,
        <iommu@...ts.linux-foundation.org>, <yong.wu@...iatek.com>,
        <youlin.pei@...iatek.com>, Nicolas Boichat <drinkcat@...omium.org>,
        <anan.sun@...iatek.com>, <chao.hao@...iatek.com>,
        Greg Kroah-Hartman <gregkh@...gle.com>,
        <kernel-team@...roid.com>
Subject: [PATCH v3 0/7] MediaTek IOMMU improve tlb flush performance in map/unmap

This patchset is to improve tlb flushing performance in iommu_map/unmap
for MediaTek IOMMU.

For iommu_map, currently MediaTek IOMMU use IO_PGTABLE_QUIRK_TLBI_ON_MAP
to do tlb_flush for each a memory chunk. this is so unnecessary. we could
improve it by tlb flushing one time at the end of iommu_map.

For iommu_unmap, currently we have already improve this performance by
gather. But the current gather should take care its granule size. if the
granule size is different, it will do tlb flush and gather again. Our HW
don't care about granule size. thus I gather the range in our file.

After this patchset, we could achieve only tlb flushing once in iommu_map
and iommu_unmap.

Regardless of sg, for each a segment, I did a simple test:
  
  size = 20 * SZ_1M;
  /* the worst case, all are 4k mapping. */
  ret = iommu_map(domain, 0x5bb02000, 0x123f1000, size, IOMMU_READ);
  iommu_unmap(domain, 0x5bb02000, size);

This is the comparing time(unit is us):
              original-time  after-improve
   map-20M    59943           2347
   unmap-20M  264             36

This patchset also flush tlb once in the iommu_map_sg case.

patch [1/7][2/7][3/7] are for map while the others are for unmap.

This patchset base on:
  a) mt8192 iommu v5
https://lore.kernel.org/linux-iommu/20201209080102.26626-1-yong.wu@mediatek.com/T/#t
  b) iommu/io-pgtable: Remove tlb_flush_leaf
https://lore.kernel.org/linux-iommu/160744101816.3622130.16266834943434854326.b4-ty@kernel.org/T/#mc8fbc98bee8bca865d73c873275ab34fed1c25c7

change note:
v3: Refactor the unmap flow suggested by Robin.
     
v2: https://lore.kernel.org/linux-iommu/20201119061836.15238-1-yong.wu@mediatek.com/
    Refactor all the code.
    base on v5.10-rc1.

Yong Wu (7):
  iommu: Move iotlb_sync_map out from __iommu_map
  iommu: Add iova and size as parameters in iotlb_sync_map
  iommu/mediatek: Add iotlb_sync_map to sync whole the iova range
  iommu: Switch gather->end to unsigned long long
  iommu: Allow io_pgtable_tlb ops optional
  iommu/mediatek: Gather iova in iommu_unmap to achieve tlb sync once
  iommu/mediatek: Remove the tlb-ops for v7s

 drivers/iommu/iommu.c      | 24 +++++++++++++++-----
 drivers/iommu/mtk_iommu.c  | 45 +++++++++++++++-----------------------
 drivers/iommu/tegra-gart.c |  3 ++-
 include/linux/io-pgtable.h |  8 ++++---
 include/linux/iommu.h      |  8 ++++---
 5 files changed, 49 insertions(+), 39 deletions(-)

-- 
2.18.0


Powered by blists - more mailing lists