[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CALHNRZ9KAv-hL6+6Uiaz2O2odm1rqMnjNxNVPsbCOdqX15KTuw@mail.gmail.com>
Date: Fri, 21 Nov 2025 12:17:56 -0600
From: Aaron Kling <webgeek1234@...il.com>
To: Jon Hunter <jonathanh@...dia.com>
Cc: Krzysztof Kozlowski <krzk@...nel.org>, Rob Herring <robh@...nel.org>, Conor Dooley <conor+dt@...nel.org>,
Thierry Reding <thierry.reding@...il.com>, linux-kernel@...r.kernel.org,
devicetree@...r.kernel.org, linux-tegra@...r.kernel.org
Subject: Re: [PATCH v4 3/5] memory: tegra186-emc: Support non-bpmp icc scaling
On Fri, Nov 21, 2025 at 5:21 AM Jon Hunter <jonathanh@...dia.com> wrote:
>
>
> On 12/11/2025 07:21, Aaron Kling wrote:
> > On Wed, Nov 12, 2025 at 12:18 AM Jon Hunter <jonathanh@...dia.com> wrote:
> >>
> >>
> >> On 11/11/2025 23:17, Aaron Kling wrote:
> >>
> >> ...
> >>
> >>> Alright, I think I've got the picture of what's going on now. The
> >>> standard arm64 defconfig enables the t194 pcie driver as a module. And
> >>> my simple busybox ramdisk that I use for mainline regression testing
> >>> isn't loading any modules. If I set the pcie driver to built-in, I
> >>> replicate the issue. And I don't see the issue on my normal use case,
> >>> because I have the dt changes as well.
> >>>
> >>> So it appears that the pcie driver submits icc bandwidth. And without
> >>> cpufreq submitting bandwidth as well, the emc driver gets a very low
> >>> number and thus sets a very low emc freq. The question becomes... what
> >>> to do about it? If the related dt changes were submitted to
> >>> linux-next, everything should fall into place. And I'm not sure where
> >>> this falls on the severity scale since it doesn't full out break boot
> >>> or prevent operation.
> >>
> >> Where are the related DT changes? If we can get these into -next and
> >> lined up to be merged for v6.19, then that is fine. However, we should
> >> not merge this for v6.19 without the DT changes.
> >
> > The dt changes are here [0].
>
> To confirm, applying the DT changes do not fix this for me. Thierry is
> having a look at this to see if there is a way to fix this.
>
> BTW, I have also noticed that Thierry's memory frequency test [0] is
> also failing on Tegra186. The test simply tries to set the frequency via
> the sysfs and this is now failing. I am seeing ...
>
> memory: emc: - available rates: (* = current)
> memory: emc: - 40800000
> memory: emc: - 68000000
> memory: emc: - 102000000
> memory: emc: - 204000000
> memory: emc: - 408000000
> memory: emc: - 665600000
> memory: emc: - 800000000
> memory: emc: - 1062400000
> memory: emc: - 1331200000
> memory: emc: - 1600000000
> memory: emc: - 1866000000 *
> memory: emc: - testing:
> memory: emc: - 40800000...OSError: [Errno 34] Numerical result out
> of range
Question. Does this test run and pass on jetson-tk1? I based the
tegra210 and tegra186 [0] code on tegra124 [1]. And I don't see a
difference in the flow now. What appears to be happening is that icc
is reporting a high bandwidth, setting the emc min_freq to something
like 1600MHz. Then debugfs is having max_freq set to something low
like 40.8MHz. Then the linked code block fails because the higher of
the min_freqs is greater than the lower of the max_freqs. But if this
same test is run on jetson-tk1, I don't see how it passes. Unless
maybe the t124 actmon is consistently setting min freqs during the
tests.
An argument could be made that any attempt to set debugfs should win a
conflict with icc. That could be done. But if that needs done here,
I'd argue that it needs replicated across all other applicable emc
drivers too.
Aaron
[0] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/memory/tegra/tegra186-emc.c?h=next-20251121#n78
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/memory/tegra/tegra124-emc.c?h=v6.18-rc6#n1066
Powered by blists - more mailing lists