Message-ID: <ZHZIVJFd-HU_AO2F@gerhold.net>
Date:   Tue, 30 May 2023 21:02:44 +0200
From:   Stephan Gerhold <stephan@...hold.net>
To:     Konrad Dybcio <konrad.dybcio@...aro.org>
Cc:     Andy Gross <agross@...nel.org>,
        Bjorn Andersson <andersson@...nel.org>,
        Michael Turquette <mturquette@...libre.com>,
        Stephen Boyd <sboyd@...nel.org>,
        Georgi Djakov <djakov@...nel.org>,
        Leo Yan <leo.yan@...aro.org>,
        Evan Green <evgreen@...omium.org>,
        Marijn Suijten <marijn.suijten@...ainline.org>,
        linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-clk@...r.kernel.org, linux-pm@...r.kernel.org
Subject: Re: [PATCH 20/20] interconnect: qcom: Divide clk rate by src node
 bus width

On Tue, May 30, 2023 at 06:32:04PM +0200, Konrad Dybcio wrote:
> On 30.05.2023 12:20, Konrad Dybcio wrote:
> > Ever since the introduction of SMD RPM ICC, we've been dividing the
> > clock rate by the wrong bus width. This has resulted in:
> > 
> > - setting wrong (mostly too low) rates, affecting performance
> >   - most often /2 or /4
> >   - things like DDR never hit their full potential
> >   - the rates were only correct if src bus width == dst bus width
> >     for all src, dst pairs on a given bus
> > 
> > - Qualcomm using the same wrong logic in their BSP driver in msm-5.x
> >   that ships in production devices today
> > 
> > - me losing my sanity trying to find this
> > 
> > Resolve it by using dst_qn, if it exists.
> > 
> > Fixes: 5e4e6c4d3ae0 ("interconnect: qcom: Add QCS404 interconnect provider driver")
> > Signed-off-by: Konrad Dybcio <konrad.dybcio@...aro.org>
> > ---
> The problem is deeper.
> 
> Chatting with Stephan (+CC), we tackled a few issues (that I will send
> fixes for in v2):
> 
> 1. qcom_icc_rpm_set() should take per-node (src_qn->sum_avg, dst_qn->sum_avg)
>    and NOT aggregated bw (unless you want ALL of your nodes on a given provider
>    to "go very fast")
> 
> 2. the aggregate bw/clk rate calculation should use the node-specific bus widths
>    and not only the bus width of the src/dst node, otherwise the average bw
>    values will be utterly meaningless
> 

The peak bandwidth / clock rate is wrong as well if you have two paths
with different bus widths on the same bus/NoC. (If anyone is interested
in the details, I can post the specific example I had in the chat; it
shows this more clearly.)
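
To illustrate the problem, here is a minimal sketch. It is not the
example from the chat and not the driver code: struct node, both
function names and all the numbers are made up for illustration. It
only assumes that the clock rate a node needs is its bandwidth divided
by its own bus width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-node data, not the kernel's struct qcom_icc_node. */
struct node {
	uint64_t peak_bw;  /* requested peak bandwidth, bytes per second */
	uint32_t buswidth; /* bytes transferred per clock cycle */
};

/*
 * Aggregate first, divide later: take the max peak bandwidth over all
 * nodes, then divide it by one single bus width.
 */
uint64_t rate_aggregate_first(const struct node *n, size_t count,
			      uint32_t buswidth)
{
	uint64_t max_bw = 0;

	assert(buswidth != 0);
	for (size_t i = 0; i < count; i++)
		if (n[i].peak_bw > max_bw)
			max_bw = n[i].peak_bw;

	return max_bw / buswidth;
}

/*
 * Divide first, aggregate later: convert each node's bandwidth to a
 * clock rate with that node's own bus width, then take the max rate.
 */
uint64_t rate_per_node(const struct node *n, size_t count)
{
	uint64_t max_rate = 0;

	for (size_t i = 0; i < count; i++) {
		assert(n[i].buswidth != 0);
		uint64_t rate = n[i].peak_bw / n[i].buswidth;

		if (rate > max_rate)
			max_rate = rate;
	}

	return max_rate;
}
```

With one node at 400 MB/s on a 4-byte bus and another at 800 MB/s on a
16-byte bus, the nodes need 100 MHz and 50 MHz respectively, so the bus
should run at 100 MHz. Aggregating first gives 200 MHz when divided by
4 and 50 MHz when divided by 16; neither matches.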

> 3. thanks to (1) and (2) qcom_icc_bus_aggregate() can be remodeled to instead
>    calculate the clock rates for the two rpm contexts, which we can then max()
>    and pass on to the ratesetting call
> 

Sounds good.

> 
> ----8<---- Cutting off Stephan's seal of approval, this is my thinking ----
> 
> 4. I *think* Qualcomm really made a mistake in their msm-5.4 driver where they
>    took most of the logic from the current -next state and should have been
>    setting the rate based on the *DST* provider, or at least that's my
>    understanding trying to read the "known good" msm-4.19 driver
>    (which remembers msm-3.0 lol).. Or maybe we should keep src but ensure there's
>    also a final (dst, dst) vote cast:
> 
> provider->inter_set = false // current state upstream
> 
> setting apps_proc<->slv_bimc_snoc
> setting mas_bimc_snoc<->slv_snoc_cnoc
> setting mas_snoc_cnoc<->qhs_sdc2
> 
> 
> provider->inter_set = true // I don't think there's effectively a difference?
> 
> setting apps_proc<->slv_bimc_snoc
> setting slv_bimc_snoc<->mas_bimc_snoc
> setting mas_bimc_snoc<->slv_snoc_cnoc
> setting slv_snoc_cnoc<->mas_snoc_cnoc
> setting mas_snoc_cnoc<->qhs_sdc2
> 

I think with our proposed changes above it no longer matters whether a
node is passed as "src" or "dst". This means in your example above you
just waste additional time setting the bandwidth twice for
slv_bimc_snoc, mas_bimc_snoc, slv_snoc_cnoc and mas_snoc_cnoc.
The final outcome is the same with or without "inter_set".
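
A rough sketch to count this, using the two lists of pairs quoted
above. It assumes that each "setting a<->b" step programs both a and b
from their own per-node votes, as proposed in (1). struct pair and the
helper names are hypothetical and not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NODES 32

/* One "setting src<->dst" step from the traces above. */
struct pair {
	const char *src;
	const char *dst;
};

const struct pair without_inter_set[] = {
	{ "apps_proc", "slv_bimc_snoc" },
	{ "mas_bimc_snoc", "slv_snoc_cnoc" },
	{ "mas_snoc_cnoc", "qhs_sdc2" },
};

const struct pair with_inter_set[] = {
	{ "apps_proc", "slv_bimc_snoc" },
	{ "slv_bimc_snoc", "mas_bimc_snoc" },
	{ "mas_bimc_snoc", "slv_snoc_cnoc" },
	{ "slv_snoc_cnoc", "mas_snoc_cnoc" },
	{ "mas_snoc_cnoc", "qhs_sdc2" },
};

/* Number of distinct nodes that get programmed at least once. */
size_t distinct_nodes(const struct pair *p, size_t count)
{
	const char *seen[MAX_NODES];
	size_t nseen = 0;

	for (size_t i = 0; i < count; i++) {
		const char *names[2] = { p[i].src, p[i].dst };

		for (size_t k = 0; k < 2; k++) {
			size_t j;

			for (j = 0; j < nseen; j++)
				if (strcmp(seen[j], names[k]) == 0)
					break;
			if (j == nseen) {
				assert(nseen < MAX_NODES);
				seen[nseen++] = names[k];
			}
		}
	}

	return nseen;
}

/* Node updates that only repeat an earlier update of the same node. */
size_t redundant_sets(const struct pair *p, size_t count)
{
	return 2 * count - distinct_nodes(p, count);
}
```

Both lists touch the same six nodes. The list with "inter_set" just
does four extra updates, one for each of the four nodes named above.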

Thanks,
Stephan
