[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CABjd4Yz4NbqzZH4Qsed3ias56gcga9K6CmYA+BLDBxtbG915Ag@mail.gmail.com>
Date: Mon, 23 Jun 2025 17:58:19 +0400
From: Alexey Charkov <alchark@...il.com>
To: Heiko Stuebner <heiko@...ech.de>
Cc: Nicolas Frattaroli <nicolas.frattaroli@...labora.com>,
XiaoDong Huang <derrick.huang@...k-chips.com>, Piotr Oniszczuk <piotr.oniszczuk@...il.com>,
Rob Herring <robh@...nel.org>, Krzysztof Kozlowski <krzk+dt@...nel.org>, Conor Dooley <conor+dt@...nel.org>,
devicetree@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-rockchip@...ts.infradead.org, linux-kernel@...r.kernel.org,
Jonas Karlman <jonas@...boo.se>
Subject: Re: [PATCH 1/4] arm64: dts: rockchip: list all CPU supplies on ArmSoM Sige5
On Mon, Jun 23, 2025 at 1:19 PM Alexey Charkov <alchark@...il.com> wrote:
>
> On Sun, Jun 22, 2025 at 5:48 PM Heiko Stuebner <heiko@...ech.de> wrote:
> >
> > Am Samstag, 21. Juni 2025, 23:21:11 Mitteleuropäische Sommerzeit schrieb Alexey Charkov:
> > > On Sat, Jun 21, 2025 at 11:44 PM Heiko Stuebner <heiko@...ech.de> wrote:
> > > >
> > > > Am Samstag, 21. Juni 2025, 21:35:56 Mitteleuropäische Sommerzeit schrieb Alexey Charkov:
> > > > > On Fri, Jun 20, 2025 at 8:02 PM Alexey Charkov <alchark@...il.com> wrote:
> > > > > >
> > > > > > On Wed, Jun 18, 2025 at 6:48 PM Alexey Charkov <alchark@...il.com> wrote:
> > > > > > >
> > > > > > > On Wed, Jun 18, 2025 at 6:06 PM Nicolas Frattaroli
> > > > > > > <nicolas.frattaroli@...labora.com> wrote:
> > > > > > > >
> > > > > > > > Hello,
> > > > > > > >
> > > > > > > > +Cc Jonas Karlman as he is intimately familiar with RK3576 clock shenanigans by now,
> > > > > > > >
> > > > > > > > On Wednesday, 18 June 2025 15:51:45 Central European Summer Time Alexey Charkov wrote:
> > > > > > > > > On Sun, Jun 15, 2025 at 8:00 PM Piotr Oniszczuk
> > > > > > > > > <piotr.oniszczuk@...il.com> wrote:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > Wiadomość napisana przez Alexey Charkov <alchark@...il.com> w dniu 9 cze 2025, o godz. 16:05:
> > > > > > > > > > >
> > > > > > > > > > > On Sun, Jun 8, 2025 at 11:24 AM Piotr Oniszczuk
> > > > > > > > > > > <piotr.oniszczuk@...il.com> wrote:
> > > > > > > > > > >>> Wiadomość napisana przez Alexey Charkov <alchark@...il.com> w dniu 5 cze 2025, o godz. 15:42:
> > > > > > > > > > >>>> Alexey,
> > > > > > > > > > >>>> I see you are using rk3576 board like me (nanopi-m5)
> > > > > > > > > > >>>> Have you on your board correctly working cpu dvfs?
> > > > > > > > > > >>>> I mean: [1][desired clocks reported by kernel sysfs are in pair with [2[]cur clocks?
> > > > > > > > > > >>>> In my case i see mine cpu lives totally on it’s own with dvfs:
> > > > > > > > > > >>>
> > > > > > > > > > >>> Hi Piotr,
> > > > > > > > > > >>>
> > > > > > > > > > >>> I haven't tried to validate actual running frequencies vs. requested
> > > > > > > > > > >>> frequencies, but subjective performance and power consumption seem to
> > > > > > > > > > >>> be in line with what I expect.
> > > > > > > > > > >>
> > > > > > > > > > >> well - my subjective l&f is that - currently - my rk3576 seems „slower" than i.e. 4xA53 h618.
> > > > > > > > > > >
> > > > > > > > > > > In my experience, native compilation of GCC 14 using 8 threads on
> > > > > > > > > > > RK3576 (mainline with passive cooling and throttling enabled): 2 hours
> > > > > > > > > > > 6 minutes, on RK3588 (mainline with passive cooling via Radxa Rock 5B
> > > > > > > > > > > case and throttling enabled but never kicking in): 1 hour 10 minutes
> > > > > > > > > >
> > > > > > > > > > by curiosity i looked randomly on 3576 vs 3588:
> > > > > > > > > > multithread passmark: 3675 (https://www.cpubenchmark.net/cpu.php?cpu=Rockchip+RK3576&id=6213)
> > > > > > > > > > multithread passmark: 4530 (https://www.cpubenchmark.net/cpu.php?cpu=Rockchip+RK3588&id=4906)
> > > > > > > > > >
> > > > > > > > > > assuming 3588 as baseline, 3576 is approx 20% slower on multithread passmark (has ~0,8 comp power of 3588)
> > > > > > > > > > 70 min compile on 3588 should take something like ~86min on 3576.
> > > > > > > > > > In your case 126min compile on 3576 shows 3576 offers 0,55 comp power of 3588.
> > > > > > > > > > Roughly 3576 should do this task in 40min less than you currently see i think
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > Can't see how u-boot would affect CPU speed in Linux, as long as you
> > > > > > > > > > > use comparable ATF images. Do you use the same kernel and dtb in all
> > > > > > > > > > > these cases? Also, what's your thermal setup?
> > > > > > > > > >
> > > > > > > > > > yes. in all cases only change was: uboot & atf
> > > > > > > > > > thermal is based on recent collabora series (+ recent pooling fix for clocks return from throttling)
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Not sure UX is a particularly good measure of CPU performance, as long
> > > > > > > > > > > as you've got a properly accelerated DRM graphics pipeline. More
> > > > > > > > > > > likely 2D/3D and memory.
> > > > > > > > > >
> > > > > > > > > > indeed.
> > > > > > > > > > For quantified look i’m looking on v.simple approach to estimate real clock is http://uob-hpc.github.io/2017/11/22/arm-clock-freq.html
> > > > > > > > > > by curiosity i looked what it reports on a53/a55/a72/a76 and it is surprisingly accurate :-)
> > > > > > > > > > on mine 3576 with collabora uboot+mainline atf is hows 800MHz (and in perf. gov it seems to be constant)
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > There might be some difference in how PVTPLL behaves on RK3576 vs.
> > > > > > > > > > > RK3588. But frankly first I would check if you are using comparable
> > > > > > > > > > > ATF implementations (e.g. upstream TF-A in both cases), kernels and
> > > > > > > > > > > thermal environment :)
> > > > > > > > > >
> > > > > > > > > > all tests: the same 6.15.2 mainline + some collabora patches
> > > > > > > > > >
> > > > > > > > > > diffs were:
> > > > > > > > > > 1.collabora uboot[1] + mainline atf 2.13
> > > > > > > > > > 2.collabora uboot[1] + rockchip rkbin bl31 blob
> > > > > > > > > > 3.vendor uboot (bin dump from friendlyelec ubuntu image)
> > > > > > > > > >
> > > > > > > > > > on 1/2 i see kind of issue with clock values (i.e. perf gov gives constant 800MHz on mainline atf).
> > > > > > > > > > 3 seems to perform better - (i.e. perf gov gives constant 1500MHz so all is snappier/faster)
> > > > > > > > >
> > > > > > > > > There is indeed something weird going on. I've tried running sbc-bench
> > > > > > > > > [1], and even though I observe dynamically varying CPU frequencies
> > > > > > > > > after boot with schedutil governor, once sbc-bench switches the
> > > > > > > > > governor to "performance" and goes through the OPPs in descending
> > > > > > > > > frequency order, the CPUs seem to get stuck at the last applied low
> > > > > > > > > frequency. Even after max frequency gets reverted from 408 MHz to
> > > > > > > > > something higher, even after I switch the governor to something else -
> > > > > > > > > no matter what. Only a reboot gets the higher frequencies 'unstuck'
> > > > > > > > > for me.
> > > > > > > > >
> > > > > > > > > These are all observed at around 55C SoC temperature, so throttling is
> > > > > > > > > not an issue. Regulators are stuck at 950000 uV - way above 700000 uV
> > > > > > > > > that the 408 MHz OPP requires (and power readings seem to match: I'm
> > > > > > > > > getting about 2.3W consumption at 408 MHz in idle vs. normal idle
> > > > > > > > > reading of 1.4W at around 1 GHz).
> > > > > > > > >
> > > > > > > > > Not sure what's going on here, and I don't remember seeing anything
> > > > > > > > > similar on RK3588. Thoughts welcome.
> > > > > > > >
> > > > > > > > This may once again be a "accidentally uses wrong clock IDs" type
> > > > > > > > situation. The other possibility is that we're getting confused
> > > > > > > > between what we think the clock rate is and what SCMI actually set
> > > > > > > > the clock rate to.
> > > > > > > >
> > > > > > > > Things to check is whether the right clock controller (scmi vs cru)
> > > > > > > > and the right clock id (check ATF source for this) is used.
> > > > > > >
> > > > > > > Clock IDs in the kernel seem to match those in ATF, but I've noticed
> > > > > > > what appears to be a buffer overflow in some of the SCMI clock names
> > > > > > > defined in the opensource TF-A (thanks GCC 15 and its zealous
> > > > > > > warnings):
> > > > > >
> > > > > > After some more testing, I tend to confirm what Piotr observed
> > > > > > earlier. Namely, frequency scaling acts weird on any ATF version (be
> > > > > > it binary BL31 or opensource TF-A), as long as mainline u-boot is
> > > > > > used. Using the u-boot binary extracted from the ArmSoM QWRT image
> > > > > > does not lead to "stuck" CPU frequencies when running sbc-bench.
> > > > > >
> > > > > > I'm getting this with the exact same kernel build (6.16-rc1 with some
> > > > > > Sige5 related patches, namely v2 of this series, Nicolas' USB
> > > > > > enablement series and TSADC). The only other difference is that the
> > > > > > binary u-boot doesn't have EFI support, so I had to boot into the
> > > > > > ARM64 uncompressed Image instead of vmlinuz.efi, but those were both
> > > > > > taken from the same build.
> > > > > >
> > > > > > What I'm observing during the sbc-bench run:
> > > > > > - It switches the cpufreq governor from schedutil to performance
> > > > > > - It goes through all CPU OPPs in descending frequency order
> > > > > > --- While it does that when booted using mainline u-boot +
> > > > > > vmlinuz.efi: "hardware limits" line in "cpupower -c 0,4
> > > > > > frequency-info" changes with each OPP change (the max frequency
> > > > > > getting reduced sequentially), then it resets to the initial full
> > > > > > range, but the actual frequency stays stuck at the lowest possible
> > > > > > value
> > > > > > --- While it does that when booted using binary u-boot + Image:
> > > > > > "hardware limits" line in "cpupower -c 0,4 frequency-info" doesn't
> > > > > > change, but the actual frequency gets reduced sequentially. Then after
> > > > > > the iteration over all OPPs is completed it returns to the highest
> > > > > > possible value, and adjusts dynamically based on thermal throttling as
> > > > > > the benchmark progresses
> > > > >
> > > > > Slight correction: it's not the "hardware limits" line, but rather
> > > > > "current policy".
> > > > >
> > > > > Note that booting mainline u-boot in non-EFI mode (using plain Image)
> > > > > doesn't change the results above.
> > > >
> > > > I'm in a similar boat, while trying to make DSI run on the rk3576.
> > > > Andy from Rockchip was able to make it work "just" by using vendor-
> > > > firmware - while using mainline u-boot (with both mainline TF-A
> > > > or vendor TF-A) produces just black output.
> > > >
> > > > I think when I did the mainline u-boot thing I took the "vendor"-code
> > > > from the armbian rk3576 vendor-u-boot ... but that actually may differ
> > > > from what the vendors provided?
> > >
> > > Just tried booting into u-boot built from ArmSoM sources at [1] - same
> > > issues as using mainline. Either I'm doing something stupid building
> > > it (don't know what though), or the binary shipped in ArmSoM images is
> > > indeed different from what the sources are.
> >
> > Can you list the versions you see for the _working_ binaries?
>
> Those are not as detailed as I'd like, as vendor builds seem to
> override the default U-boot versioning rules (stripping the git hash
> information off).
>
> I've identified the following versions for the binary image that worked for me:
>
> DDR init v1.03 (version string "DDR 81dd75088a wesley.yao
> 24/04/16-11:19:41,fwver: v1.03")
> U-boot SPL 2017.09 (no traces of any particular git commit, version
> string "U-Boot SPL 2017.09 (Aug 23 2024 - 00:47:39)")
> U-boot proper 2017.09 (ditto, version string "U-Boot 2017.09 (Aug 23
> 2024 - 00:47:39 +0000) OpenWrt")
> BL31 image v1.04 (version string
> "v2.3():v2.3-749-g30018b771:derrick.huang, fwver: v1.04")
> BL32 image v1.01 (version string "U3.13.0-791-g185dc3c92 #hisping.lin
> (gcc version 10.2.1 20201103 (GNU Toolchain for the A-profile
> Architecture 10.2-2020.11 (arm-10.16))) #2 Tue Apr 16 11:05:25 CST
> 2024 aarch64, fwver: v1.01")
>
> > I.e. u-boot and friends may list someversion-gGITHASH thingy like your
> > OPTEE already does: OP-TEE version: 3.13.0-791-g185dc3c92 .
> > Also possibly the build date.
>
> The build date is later than any commit I see in the ArmSoM repo
> leading up to either of the two tags they have there ("rk3576" and
> "rk3576-6.1-rk3.1", neither of them belonging to any published branch
> there)
>
> > I.e. I'm wondering/hoping if we can match to some git commit.
>
> I've also noticed that ArmSoM SDK references the "boost" binary from
> rkbin (rk3576_boost_v1.00.bin in particular), but it doesn't seem to
> be present in the bootloader binary I'm looking at. The version string
> of the SPL binary in rkbin as included in the SDK is also different
> from what I'm seeing in the actual binary: SDK has "U-Boot SPL
> 2017.09-g57b8ceed1c3-240418 #lxh (Apr 19 2024 - 17:07:22), fwver:
> v1.02".
>
> I'll try and rebuild the vendor u-boot from source with these specific
> versions of rkbin blobs (ddr v1.03, bl31 v1.04 and bl32 v1.01), but
> the changelog in rkbin didn't mention anything that would seem
> relevant to the issue I'm observing :(
Okay, I've bisected this.
TLDR: Linux and u-boot seem to have nothing to do with it, opensource
TF-A doesn't work, binary BL31 starting with v1.09 do not work. BL31
v1.08 and earlier work fine. Tested with vendor u-boot and with
2025.07-rc2, results are similar. OP-TEE presence does not affect the
results.
To test:
1. Build and apply a fresh u-boot-rockchip.bin using the DDR init and
BL31 from rkbin repo [1]
2. Run "watch cpupower -c 0,4 frequency-info" to observe how CPU
frequencies change
3. In a separate terminal, run "bash sbc-bench.sh -c" [2]
4. "Good" behavior:
a. Once the cpufreq governor is switched to "performance" (if it
was anything different at boot), both CPU clusters go up to their max
frequencies
b. While the benchmark runs through the stage where it says
"Checking cpufreq OPP", CPU frequencies consecutively decrease from
the max OPP frequency to the min
c. After the benchmark finishes the stage where it says "Checking
cpufreq OPP", both CPU clusters go up to their max frequencies for the
rest of the benchmarking run
"Bad" behavior:
a. Once the cpufreq governor is switched to "performance" (if it
was anything different at boot), both CPU clusters stay at whatever
frequencies they were at when the switch occurred
b. While the benchmark runs through the stage where it says
"Checking cpufreq OPP", CPU frequencies consecutively decrease from
the "pre-max" OPP frequency to the min (max frequency never gets
applied on either of the clusters!)
c. After the benchmark finishes the stage where it says "Checking
cpufreq OPP", both CPU clusters stay at their min frequencies for the
rest of the benchmarking run
I suspect it has something to do with the topmost OPP somehow becoming
unselectable in ATF starting from binary BL31 v1.09 (and similarly in
opensource TF-A). This time I couldn't reproduce the "permanently
stuck" CPU frequencies though.
Here are the git bisect results (running in rkbin repo):
0b338fd6ae7ad816295a75c80b00968bb1868625 is the first bad commit
commit 0b338fd6ae7ad816295a75c80b00968bb1868625 (HEAD)
Author: XiaoDong Huang <derrick.huang@...k-chips.com>
Date: Fri Jun 28 17:47:55 2024 +0800
rk3576: bl31: update version to v1.09
Build from ATF commit:
286c013b2 plat: rockchip: rk3576: Add support to set low length
update feature:
286c013b2 plat: rockchip: rk3576: Add support to set low length
5fb78cd64 rockchip: uart: switch to loopback mode before
checking busy status
02ff3b0da rockchip: uart: save uart register once
23bd4684e plat: rk3576: support to switch hptimer to 1G
b4833e988 plat: rk3576: remove AUTOCS_EN_COUNT
437ed59cb plat: rockchip: rk3576: support aclk_sencure_s enable
6d8c67d6a plat: rk3576: support poweroff scu1 when cpu4~cpu7 off
Change-Id: I0fcb5afefcc3865cf8c413b4c0ed00234d2675cb
Signed-off-by: XiaoDong Huang <derrick.huang@...k-chips.com>
RKTRUST/RK3576TRUST.ini | 2 +-
bin/rk35/{rk3576_bl31_v1.08.elf => rk3576_bl31_v1.09.elf} | Bin
349032 -> 349032 bytes
doc/release/RK3576_CN.md | 13 +++++++++++++
doc/release/RK3576_EN.md | 14 ++++++++++++++
4 files changed, 28 insertions(+), 1 deletion(-)
rename bin/rk35/{rk3576_bl31_v1.08.elf => rk3576_bl31_v1.09.elf} (82%)
[1] https://github.com/rockchip-linux/rkbin.git
[2] https://github.com/ThomasKaiser/sbc-bench
Derrick, if you could chime in on what might be going on here, that
would be greatly appreciated :)
Best regards,
Alexey
Powered by blists - more mailing lists