Message-ID: <CALHNRZ_vZm7n-fZSVA1YzUPz0=znX_D6aBZ0nwUyjKdwcrO1=w@mail.gmail.com>
Date: Thu, 18 Dec 2025 13:25:32 -0600
From: Aaron Kling <webgeek1234@...il.com>
To: Jon Hunter <jonathanh@...dia.com>
Cc: Krzysztof Kozlowski <krzk@...nel.org>, Rob Herring <robh@...nel.org>, Conor Dooley <conor+dt@...nel.org>, 
	Thierry Reding <thierry.reding@...il.com>, linux-kernel@...r.kernel.org, 
	devicetree@...r.kernel.org, linux-tegra@...r.kernel.org
Subject: Re: [PATCH v4 3/5] memory: tegra186-emc: Support non-bpmp icc scaling

On Thu, Dec 18, 2025 at 5:12 AM Jon Hunter <jonathanh@...dia.com> wrote:
>
>
> On 17/12/2025 22:44, Aaron Kling wrote:
>
> ...
>
> >> Thanks I added all these on top of next-20251216 (as that is the latest
> >> I have tested) and Tegra194 fails to boot. We always include all the
> >> modules in the rootfs that is being tested. You can see the boot log
> >> here [0]. We are using an NFS rootfs for testing and I see a message
> >> related to the NFS server not responding. I am guessing something is
> >> running too slow again because the only thing I changed was adding your
> >> patches. The test harness reports it is timing out ...
> >>
> >> FAILED: Linux Boot Test 1
> >>          Test Owner(s): N/A
> >>          Execution Time 219.31 sec
> >>          Test TIMEOUT reached. Test did not report results in 120 secs
> >>          Percent passed so far: 0.0
> >
> > Okay, so. Modules are in the rootfs, none get copied to the initramfs?
> > And the rootfs is on nfs? And for this failure, nfs never gets
> > mounted. So... for this case, no modules get loaded, implying that
> > whatever is happening is happening with the built-in drivers. Which
> > means this case isn't pcie related. Are there any modifications to the
> > defconfig? It appears that there must be, to have dwc-eth-dwmac
> > available. I will see if I can trigger anything when using ethernet.
>
> If you look at the boot log you will see ...
>
> [    7.839012] Root device found: nfs
> [    7.908307] Ethernet interface: eth0
> [    7.929765] IP Address: 192.168.99.2
> [    8.173978] Rootfs mounted over nfs
> [    8.306291] Switching from initrd to actual rootfs
>
> So it does mount the rootfs and so the modules would be loaded. I

But the bottom of the log says:
[ 188.360095] nfs: server 192.168.99.1 not responding, still trying

So does it mount nfs and load modules, and *then* fail to talk to the
nfs server? That doesn't make any sense. I also don't see any logs from
driver probes after the rootfs line, and there are sync_state lines
stating that PCIe, among others, isn't available.

> believe that PCIe is definitely loaded because that is what I observed
> before. And yes there are a few modifications to the defconfig that we
> make on top (that have been added over the years for various reasons) ...
>
> CONFIG_ARM64_PMEM=y
> CONFIG_BROADCOM_PHY=y
> CONFIG_DWMAC_DWC_QOS_ETH=y
> CONFIG_EEPROM_AT24=m
> CONFIG_EXTRA_FIRMWARE="nvidia/tegra210/xusb.bin nvidia/tegra186/xusb.bin
> nvidia/tegra194/xusb.bin rtl_nic/rtl8153a-3.fw rtl_nic/rtl8168h-2.fw"
> CONFIG_EXTRA_FIRMWARE_DIR="${KERNEL_FW_DIR}"
> CONFIG_MARVELL_PHY=y
> CONFIG_R8169=y
> CONFIG_RANDOMIZE_BASE=n
> CONFIG_SERIAL_TEGRA_TCU=y
> CONFIG_SERIAL_TEGRA_TCU_CONSOLE=y
> CONFIG_STAGING=y
> CONFIG_STAGING_MEDIA=y
> CONFIG_STMMAC_ETH=y
> CONFIG_STMMAC_PLATFORM=y
> CONFIG_USB_RTL8152=y
> CONFIG_VIDEO_TEGRA=m
> CONFIG_VIDEO_TEGRA_TPG=y
> CONFIG_DWMAC_TEGRA=y

I will incorporate these into a build and see if I get any different results.

> Looking at the boot log I see ...
>
> [    3.854658] cpu cpu0: cpufreq_init: failed to get clk: -2
> [    3.854927] cpu cpu0: cpufreq_init: failed to get clk: -2
> [    3.855218] cpu cpu2: cpufreq_init: failed to get clk: -2
> [    3.858438] cpu cpu2: cpufreq_init: failed to get clk: -2
> [    3.863987] cpu cpu4: cpufreq_init: failed to get clk: -2
> [    3.869741] cpu cpu4: cpufreq_init: failed to get clk: -2
> [    3.875006] cpu cpu6: cpufreq_init: failed to get clk: -2
> [    3.880725] cpu cpu6: cpufreq_init: failed to get clk: -2
> [    3.886018] cpufreq-dt cpufreq-dt: failed register driver: -19
>
> So actually, I am now wondering if this is the problem?

These lines are from cpufreq-dt trying to manage the CPUs directly,
which it's not supposed to do. tegra194-cpufreq is supposed to manage
them. I see these lines as well when things are operating as
expected. The real driver doesn't log anything, but the policies are
visible in sysfs. I did a little bit of digging previously to see if I
could remove the log churn, but was unable to do so. I would have to
double check to be completely sure, but I am fairly certain I saw
these lines before my changes as well. It's something that would be
good to get fixed, but I don't think it's relevant here.
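
For anyone wanting to check both points on a booted board, here is a
minimal sketch in Python. It is not part of the patch series. It decodes
the trailing error codes in the log lines quoted above (-2 is ENOENT,
-19 is ENODEV) and reads which driver owns each cpufreq policy from the
standard sysfs location. The helper names (decode_errno, errnos_in_log,
scaling_drivers) are made up for illustration, and the sysfs base path
is assumed to be /sys/devices/system/cpu/cpufreq.

```python
import errno
import glob
import os
import re


def decode_errno(code):
    """Return the symbolic name for a kernel-style negative errno, e.g. -2."""
    return errno.errorcode.get(abs(code), "UNKNOWN")


def errnos_in_log(text):
    """Map each trailing ': -N' code found in dmesg-style lines to its name."""
    found = re.findall(r":[ \t]*(-\d+)[ \t]*$", text, flags=re.MULTILINE)
    return {code: decode_errno(code) for code in sorted({int(m) for m in found})}


def scaling_drivers(base="/sys/devices/system/cpu/cpufreq"):
    """Return {policy_name: driver_name} for every cpufreq policy under base.

    The base path is an assumption; an empty dict means no policies exist,
    i.e. no cpufreq driver registered successfully.
    """
    result = {}
    pattern = os.path.join(base, "policy*", "scaling_driver")
    for path in sorted(glob.glob(pattern)):
        policy = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            result[policy] = f.read().strip()
    return result


if __name__ == "__main__":
    sample = (
        "[    3.854658] cpu cpu0: cpufreq_init: failed to get clk: -2\n"
        "[    3.886018] cpufreq-dt cpufreq-dt: failed register driver: -19\n"
    )
    print(errnos_in_log(sample))
    print(scaling_drivers())
```

If the policies are listed with a driver other than cpufreq-dt, the
cpufreq-dt messages are only noise from a driver that gave up, which is
consistent with what is described above.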

Aaron
