lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <adc7ca5c-69d8-4164-bd89-3381b101576a@nvidia.com>
Date: Thu, 11 Dec 2025 07:46:09 +0000
From: Jon Hunter <jonathanh@...dia.com>
To: Aaron Kling <webgeek1234@...il.com>
Cc: Krzysztof Kozlowski <krzk@...nel.org>, Rob Herring <robh@...nel.org>,
 Conor Dooley <conor+dt@...nel.org>, Thierry Reding
 <thierry.reding@...il.com>, linux-kernel@...r.kernel.org,
 devicetree@...r.kernel.org, linux-tegra@...r.kernel.org
Subject: Re: [PATCH v4 3/5] memory: tegra186-emc: Support non-bpmp icc scaling


On 10/12/2025 22:41, Aaron Kling wrote:
> On Wed, Dec 10, 2025 at 3:24 PM Jon Hunter <jonathanh@...dia.com> wrote:
>>
>>
>> On 10/12/2025 18:32, Aaron Kling wrote:
>>> On Wed, Dec 10, 2025 at 9:04 AM Jon Hunter <jonathanh@...dia.com> wrote:
>>>>
>>>>
>>>> On 10/12/2025 05:06, Aaron Kling wrote:
>>>>
>>>> ...
>>>>
>>>>> Let me try to iterate the potential issues I've seen stated here. If
>>>>> I'm missing anything, please fill in the blanks.
>>>>>
>>>>> 1) If this change is applied without the related dt change and the
>>>>> pcie drvier is loaded, the emc clock can become stuck at the lowest
>>>>> rate. This is caused by the pcie driver providing icc data, but
>>>>> nothing else is. So the very low requested bandwidth results in the
>>>>> emc clock being set very low. I'm not sure there is a 'fix' for this,
>>>>> beyond making sure the dt change is merged to ensure that the cpufreq
>>>>> driver provides bandwidth info, causing the emc driver to select a
>>>>> more reasonable emc clock rate. This is a similar situation to what's
>>>>> currently blocking the tegra210 actmon series. I don't think there is
>>>>> a way for the drivers to know if icc data is missing/wrong. The
>>>>> scaling is doing exactly what it's told based on the icc routing given
>>>>> in the dt.
>>>>
>>>> So this is the fundamental issue with this that must be fixed. We can't
>>>> allow the PCIe driver to slow the system down. I think that Krzysztof
>>>> suggested we need some way to determine if the necessary ICC clients are
>>>> present/registered for ICC to work. Admittedly, I have no idea if there
>>>> is a simple way to do this, but we need something like that.
>>>
>>> I'm not sure I understand how checking clients would work. Is there a
>>> mechanism for the emc driver to know if cpufreq is registered to icc
>>> in a way that works with probe deferrals, but also allows for it to be
>>> optional?
>>
>> I am not sure if such a mechanism exists either, but it seems that we
>> need something like this.
>>
>>> Alternatively if there is not, can we just accept the abi break and
>>> have this and the dt change depend on each other? I know it's not
>>> desirable or the first choice, but if the other option is to rewrite
>>> part of the icc system, then perhaps it should be an option.
>>
>> I am not sure it is an ABI break, but the default performance might be
>> worse. I am not sure if you are proposing a way to enforce the
>> dependency or just saying that there is a dependency. We can't do the
>> latter, but if there is a way for the kernel to check the dependency and
>> make the right choice, then that should work.
> 
> So we can't accept that older dt's will run slower on a newer kernel
> and say that a newer dt is needed for full performance?
> 
> If that's not an option, then I have no idea how to resolve this. I'm
> not greatly knowledgeable about the icc subsystem. I can try to look
> into options, but I'm not greatly optimistic about me finding one. If
> someone could suggest a concept on how to make it work, I could
> implement it. But I'm not even seeing the concept right now.
> 
>>>>> 2) Jon, you report that even with both this change and the related dt
>>>>> change, that the issue is still not fixed. But then posted a log
>>>>> showing that the emc rate is set to max. If the issue is that emc rate
>>>>> is too low, then how can debugfs report that the rate is max? For
>>>>> reference, everything scales as expected for me given this change plus
>>>>> the dt change on both p2771 and p3636+p3509.
>>>>
>>>> To clarify, this broke the boot test on Tegra194 because the boot was
>>>> too slow. However, this also broke the EMC test on Tegra186 because
>>>> setting the frequency from the debugfs failed. So two different failures
>>>> on two different devices. I am guessing the EMC test would also fail on
>>>> Tegra194, but given that it does not boot, we did not get that far.
>>>
>>> So you're saying that even with the dt changes, this change on
>>> tegra194 still does not boot before the regression test framework
>>> times out? If so, I need some more details about this. I have not seen
>>> issues on p2972 or p3518. For example, if I boot to android recovery
>>> where I set the cpufreq governor to performance, I see emc clock rate
>>> set to 2133 MHz and 1600 MHz respectively. And boot time from kernel
>>> start to pixels on display is 15 seconds, give or take a couple
>>> seconds. This is using the boot stack from l4t r32.7.6.
>>
>> Yes. The boot failure here is not a hard boot failure, but the device
>> takes too long to boot and the boot test times out. And no we will not
>> increase the timeout as it is there for a reason. It could well be
>> because the default governor is not set to performance. If you boot with
>> just using the stock 'defconfig' for ARM64 without setting the governor
>> does it take longer?
> 
> So, I checked out next-20251210, then b4 shazam'ed this series and the
> matching dt series,
> 20251021-tegra186-icc-p3-v3-0-68184ee8a89c@...il.com. Then built with
> LLVM=1 ARCH=arm64 make defconfig
> LLVM=1 ARCH=arm64 make -j33 Image nvidia/tegra194-p2972-0000.dtb
> 
> I packaged them into an android boot image using a lightly modified
> copy of Gnurou's bbinitramfs which just drops to a busybox shell. Note
> that this includes no modules, and since the pcie driver is =m in
> defconfig, it is not included. Then I flashed that with the l4t
> r32.7.6 boot stack to p2972. I got the shell on uart after 4.275
> seconds in the kernel. Per sysfs, the cpufreq governor is schedutil
> and all policies are idling at min freq, 115200. And per debugfs, the
> emc clock is 800000000. All this looks to be as expected.
> 
> I have no idea why the regression test setup is timing out. I have not
> seen the issue through any of my testing. On pure mainline as per the
> above paragraph, or with the patches on the android common kernel, as
> per my target use case. I don't know what to do if I can't replicate
> the issue. I don't suppose the flash package for the regression test
> setup is something that could be released?

I thought we already concluded that you did not see this because you did 
not have the PCIe module present in your testing? From the above its 
sounds like you still don't have that driver present and so you don't 
see the issue. I guess I am not surprised by that but I am not sure why 
you are now saying you have no idea why this is timing out? I thought 
this was understood.

-- 
nvpublic


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ