Message-ID: <CAF6AEGs9PLiCZdJ-g42-bE6f9yMR6cMyKRdWOY5m799vF9o4SQ@mail.gmail.com>
Date: Tue, 31 Oct 2023 05:47:50 -0700
From: Rob Clark <robdclark@...il.com>
To: Johan Hovold <johan@...nel.org>
Cc: Bjorn Andersson <quic_bjorande@...cinc.com>,
Abhinav Kumar <quic_abhinavk@...cinc.com>,
Dmitry Baryshkov <dmitry.baryshkov@...aro.org>,
Sean Paul <sean@...rly.run>,
Marijn Suijten <marijn.suijten@...ainline.org>,
David Airlie <airlied@...il.com>,
Daniel Vetter <daniel@...ll.ch>,
Bjorn Andersson <andersson@...nel.org>,
Kuogee Hsieh <quic_khsieh@...cinc.com>,
linux-arm-msm@...r.kernel.org, dri-devel@...ts.freedesktop.org,
freedreno@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
stable@...r.kernel.org, Doug Anderson <dianders@...omium.org>,
Rob Clark <robdclark@...omium.org>
Subject: Re: [PATCH] drm/msm/dpu: Add missing safe_lut_tbl in sc8280xp catalog
On Tue, Oct 31, 2023 at 5:35 AM Johan Hovold <johan@...nel.org> wrote:
>
> On Mon, Oct 30, 2023 at 04:23:20PM -0700, Bjorn Andersson wrote:
> > During USB transfers on the SC8280XP __arm_smmu_tlb_sync() is seen to
> > typically take 1-2ms to complete. As expected this results in poor
> > performance, something that has been mitigated by proposing running the
> > iommu in non-strict mode (boot with iommu.strict=0).
> >
> > This turns out to be related to the SAFE logic, and programming the QOS
> > SAFE values in the DPU (per suggestion from Rob and Doug) reduces the
> > TLB sync time to below 10us, which means significantly less time spent
> > with interrupts disabled and a significant boost in throughput.
>
> I ran some tests with a gigabit ethernet adapter to get an idea of how
> this performs in comparison to using lazy iommu mode ("non-strict"):
>
>                 6.6     6.6-lazy  6.6-dpu   6.6-dpu-lazy
> iperf3 recv     114     941       941       941           MBit/s
> iperf3 send     124     891       703       940           MBit/s
>
> scp recv        14.6    110       110       111           MB/s
> scp send        12.5    98.9      91.5      110           MB/s
>
> This patch in itself indeed improves things quite a bit, but there is
> still some performance that can be gained by using lazy iommu mode.
>
> Notably, lazy mode with this patch applied appears to saturate the link
> in both directions.

Maybe there is still room for SoC-specific udev rules so that DMA masters
without firmware can be configured as "lazy", i.e. like:
https://chromium.googlesource.com/chromiumos/overlays/board-overlays/+/refs/heads/main/baseboard-trogdor/chromeos-base/chromeos-bsp-baseboard-trogdor/files/98-qcom-nonstrict-iommu.rules
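
For illustration only, here is a rough Python sketch of what such a rule
could end up doing: writing "DMA-FQ" (the lazy, flush-queue domain type)
to the "type" attribute of a device's IOMMU group in sysfs. The helper
name set_iommu_group_type and the idea of passing in the device's sysfs
directory are assumptions made for this example, and it is not taken from
the linked rules file. It needs root on a real system, and whether the
kernel accepts the change depends on the kernel version and the device.

```python
from pathlib import Path

# Group types documented for /sys/kernel/iommu_groups/<N>/type.
VALID_TYPES = {"DMA", "DMA-FQ", "identity", "auto"}


def set_iommu_group_type(device_dir, group_type="DMA-FQ"):
    """Request a default domain type for the IOMMU group of one device.

    device_dir is the device's sysfs directory (hypothetical caller input);
    its "iommu_group" entry links to /sys/kernel/iommu_groups/<N>.
    Returns the type reported after the write.
    """
    if group_type not in VALID_TYPES:
        raise ValueError(f"unsupported IOMMU group type: {group_type!r}")

    type_file = Path(device_dir) / "iommu_group" / "type"
    # "DMA-FQ" asks for batched (lazy) TLB invalidation, while "DMA" is strict.
    type_file.write_text(group_type + "\n")
    return type_file.read_text().strip()
```

A caller would pass the sysfs directory of the DMA master in question,
e.g. set_iommu_group_type("/sys/bus/platform/devices/<device>"), where
the device name is left as a placeholder.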

BR,
-R

> Tested-by: Johan Hovold <johan+linaro@...nel.org>
>
> Johan