Message-ID: <ZUD0lhStirf8IN8-@hovoldconsulting.com>
Date:   Tue, 31 Oct 2023 13:35:34 +0100
From:   Johan Hovold <johan@...nel.org>
To:     Bjorn Andersson <quic_bjorande@...cinc.com>
Cc:     Rob Clark <robdclark@...il.com>,
        Abhinav Kumar <quic_abhinavk@...cinc.com>,
        Dmitry Baryshkov <dmitry.baryshkov@...aro.org>,
        Sean Paul <sean@...rly.run>,
        Marijn Suijten <marijn.suijten@...ainline.org>,
        David Airlie <airlied@...il.com>,
        Daniel Vetter <daniel@...ll.ch>,
        Bjorn Andersson <andersson@...nel.org>,
        Kuogee Hsieh <quic_khsieh@...cinc.com>,
        linux-arm-msm@...r.kernel.org, dri-devel@...ts.freedesktop.org,
        freedreno@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
        stable@...r.kernel.org, Doug Anderson <dianders@...omium.org>,
        Rob Clark <robdclark@...omium.org>
Subject: Re: [PATCH] drm/msm/dpu: Add missing safe_lut_tbl in sc8280xp catalog

On Mon, Oct 30, 2023 at 04:23:20PM -0700, Bjorn Andersson wrote:
> During USB transfers on the SC8280XP __arm_smmu_tlb_sync() is seen to
> typically take 1-2ms to complete. As expected this results in poor
> performance, something that has been mitigated by proposing running the
> iommu in non-strict mode (boot with iommu.strict=0).
> 
> This turns out to be related to the SAFE logic, and programming the QOS
> SAFE values in the DPU (per suggestion from Rob and Doug) reduces the
> TLB sync time to below 10us, which means significantly less time spent
> with interrupts disabled and a significant boost in throughput.

I ran some tests with a gigabit ethernet adapter to get an idea of how
this performs in comparison to using lazy iommu mode ("non-strict"):

                6.6     6.6-lazy   6.6-dpu   6.6-dpu-lazy
iperf3 recv     114     941        941       941            MBit/s
iperf3 send     124     891        703       940            MBit/s

scp recv        14.6    110        110       111            MB/s
scp send        12.5    98.9       91.5      110            MB/s

This patch on its own improves things considerably (iperf3 send goes
from 124 to 703 MBit/s), but there is still some performance to be
gained by using lazy iommu mode.

Notably, lazy mode with this patch applied appears to saturate the link
in both directions.
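
For a rough sense of the relative gains, the figures above can be
reduced to speedups over plain 6.6. This is only a minimal sketch: the
numbers are copied from the table, and the names (CONFIGS, RESULTS,
speedups) are made up for illustration, not part of any tool used in
the tests.

```python
# Throughput figures copied from the table above
# (iperf3 in MBit/s, scp in MB/s). Column order matches CONFIGS.
CONFIGS = ["6.6", "6.6-lazy", "6.6-dpu", "6.6-dpu-lazy"]
RESULTS = {
    "iperf3 recv": [114, 941, 941, 941],
    "iperf3 send": [124, 891, 703, 940],
    "scp recv": [14.6, 110, 110, 111],
    "scp send": [12.5, 98.9, 91.5, 110],
}


def speedups(values):
    """Return each value divided by the first (plain 6.6) value."""
    baseline = values[0]
    return [v / baseline for v in values]


if __name__ == "__main__":
    for test, values in RESULTS.items():
        ratios = ", ".join(
            f"{cfg}: {ratio:.1f}x"
            for cfg, ratio in zip(CONFIGS, speedups(values))
        )
        print(f"{test:12s} {ratios}")
```

The ratios are computed per row, so the iperf3 and scp units never get
mixed.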

Tested-by: Johan Hovold <johan+linaro@...nel.org>

Johan
