Date:   Fri, 11 Jun 2021 00:37:24 +0000
From:   Krishna Reddy <vdumpa@...dia.com>
To:     Robin Murphy <robin.murphy@....com>,
        Sai Prakash Ranjan <saiprakash.ranjan@...eaurora.org>
CC:     "linux-arm-msm@...r.kernel.org" <linux-arm-msm@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
        Will Deacon <will@...nel.org>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        Thierry Reding <treding@...dia.com>
Subject: RE: [PATCH] iommu/io-pgtable-arm: Optimize partial walk flush for
 large scatter-gather list

> > No, the unmap latency is not just in some test case written, the issue
> > is very real and we have workloads where camera is reporting frame
> > drops because of this unmap latency in the order of 100s of milliseconds.
> > And hardware team recommends using ASID based invalidations for
> > anything larger than 128 TLB entries. So yes, we have taken note of
> > impacts here before going this way and hence feel more inclined to
> > make this qcom specific if required.

Seems like the real issue here is not the unmap API latency itself, but the high number of
back-to-back SMMU TLB invalidate register writes, which reduces ISO BW to the camera and
causes the overflow. Isn't it?
Even the Tegra186 SoC has a similar issue, and the HW team recommended rate-limiting the number of
back-to-back SMMU TLB invalidate register writes. The subsequent Tegra194 SoC has a dedicated SMMU for
ISO clients to avoid the impact of TLB invalidates from NISO clients on ISO BW.
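For illustration, rate-limiting here could be as simple as backing off after every N invalidate
writes in the flush path. A rough sketch only (the register offset, burst size and delay below are
placeholder values, not Tegra's actual numbers):

/*
 * Illustrative sketch, not the Tegra implementation: throttle
 * back-to-back TLB invalidate register writes so that ISO clients are
 * not starved of bandwidth while a large range is being invalidated.
 */
#include <linux/delay.h>
#include <linux/io.h>

#define EXAMPLE_TLBIVA_REG	0x0	/* hypothetical invalidate-by-VA register offset */
#define EXAMPLE_BURST		16	/* writes allowed before backing off (assumed) */
#define EXAMPLE_BACKOFF_US	5	/* pause between bursts (assumed) */

static void example_rate_limited_inv(void __iomem *base, unsigned long iova,
				     size_t size, size_t granule)
{
	unsigned long addr = iova;
	unsigned int count = 0;

	while (addr < iova + size) {
		/* One invalidate-by-VA write per granule-sized region */
		writel_relaxed(addr, base + EXAMPLE_TLBIVA_REG);
		addr += granule;

		/* Back off every EXAMPLE_BURST writes to leave bus headroom */
		if (++count % EXAMPLE_BURST == 0)
			udelay(EXAMPLE_BACKOFF_US);
	}
}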

>> Thinking some more, I
>> wonder if the Tegra folks might have an opinion to add here, given 
>> that their multiple-SMMU solution was seemingly about trying to get 
>> enough TLB and pagetable walk bandwidth in the first place?

While it is good to reduce the number of TLB invalidate register writes, arbitrarily flushing all TLB entries
at context granularity can have a negative impact on active traffic and BW. I don't have much data on the
possible impact at this point. Can the flushing at context granularity be made a quirk rather than the default behavior?
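Something along these lines is what I have in mind (sketch only; the quirk name and bit are
placeholders and not from the posted patch):

/*
 * Hypothetical opt-in quirk (name and bit are made up for illustration):
 * implementations that prefer a single context/ASID invalidation for
 * partial walk flushes set it; everyone else keeps the per-range flush.
 */
#include <linux/bits.h>
#include <linux/io-pgtable.h>

#define IO_PGTABLE_QUIRK_FLUSH_ALL_ON_WALK	BIT(7)	/* hypothetical */

static void example_flush_partial_walk(struct io_pgtable *iop,
				       unsigned long iova, size_t size,
				       size_t granule)
{
	if (iop->cfg.quirks & IO_PGTABLE_QUIRK_FLUSH_ALL_ON_WALK)
		/* One ASID/context invalidation instead of many TLBIVAs */
		io_pgtable_tlb_flush_all(iop);
	else
		/* Default: VA-range based partial walk flush */
		io_pgtable_tlb_flush_walk(iop, iova, size, granule);
}

Opting in via cfg->quirks would keep the current behaviour as the default and let the qcom
implementation (or anyone who has measured a win) enable it explicitly.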

-KR
