Message-ID: <acc3c98e-e662-42aa-a1a3-a5ccb1a7b0fc@solid-run.com>
Date: Fri, 24 Oct 2025 21:36:50 +0000
From: Josua Mayer <josua@...id-run.com>
To: Andrew Lunn <andrew@...n.ch>
CC: Gregory Clement <gregory.clement@...tlin.com>, Sebastian Hesselbarth
	<sebastian.hesselbarth@...il.com>, Rob Herring <robh@...nel.org>, Krzysztof
 Kozlowski <krzk+dt@...nel.org>, Conor Dooley <conor+dt@...nel.org>, Frank
 Wunderlich <frank-w@...lic-files.de>, "linux-arm-kernel@...ts.infradead.org"
	<linux-arm-kernel@...ts.infradead.org>, "devicetree@...r.kernel.org"
	<devicetree@...r.kernel.org>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "stable@...r.kernel.org"
	<stable@...r.kernel.org>
Subject: Re: [PATCH v2 3/4] arm64: dts: marvell: cn9132-clearfog: fix
 multi-lane pci x2 and x4 ports

Hi Andrew,

On 18.09.25 at 19:40, Josua Mayer wrote:
> On 18.09.25 at 17:41, Andrew Lunn wrote:
>
>>>>> The mvebu-comphy driver does not currently know how to pass correct
>>>>> lane-count to ATF while configuring the serdes lanes.
>>>> Why not just teach mvebu-comphy to pass the correct lane-count? That
>>>> sounds like the proper fix, and that makes the kernel independent of
>>>> the bootloader.
>>> That would be a feature on the comphy driver, not a bug-fix backported
>>> to stable. The core goal was to fix bugs found in Debian 13.
>> It is not so simple.
>>
>> https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
>>
>>   It must either fix a real bug that bothers people or just add a device ID
>>
>> Crashing at boot would be a real bug that bothers people, not just a
>> new feature.
>>
>> Let's see how big the patch is. If it's 1000 lines of hard to understand
>> code, it will probably be rejected for stable. If it's 100 lines or
>> less, it will likely be accepted.
> I see.
>> It is also hard to argue the DT is wrong. It just describes the
>> hardware. I assume the description is actually correct?
> The x4 port references the comphy lanes as below:
>
> phys = <&cp0_comphy0 0>, <&cp0_comphy1 0>, <&cp0_comphy2 0>, <&cp0_comphy3 0>;
>
> At the time of submitting my patch I was not convinced the above was right or wrong.
> I labeled it wrong because it caused a fault, which I should have noticed much earlier.
>
> The numeric argument after the comphy-lane handle is the port number,
> for those functions that can have multiple ports (e.g. ethernet #2).
>
> This means the above dts links pci port 0 to lanes 0-3, which appears correct.
> Furthermore, lanes 1-3 have no other pci ports, so there is no other configuration to confuse it with.
>
>> The issue is
>> the driver, not the description. Also, i assume this affects all
>> boards using this SoC? Removing the nodes in one board 'fixes' one
>> board. Fixing the driver fixes all boards...
> I did not check whether other boards share a similar description.
>
> Today I found two other dts that reference multiple lanes:
>
> arch/arm64/boot/dts/marvell/armada-8040-puzzle-m801.dts
> arch/arm64/boot/dts/marvell/armada-8040-mcbin.dtsi
>
> In both cases the function is PCI: the first one x2, the second x4.
>
> I will try to look into a more correct solution soon.

I did some extensive testing outside of the Debian world, with v6.12.0, v6.12.48, Marvell v6.1 ...
and made a range of interesting / confusing observations:

1. I was able to reproduce the problem in a self-compiled kernel outside Debian.
This is with arm64 defconfig and some small adaptations.

The system only got stuck during boot when the comphy driver was a module.

In this case there are two suspicious messages in the boot log:

[    2.742966] armada8k-pcie f4600000.pcie: No available PHY
[    3.732084] armada8k-pcie f4600000.pcie: Phy link never came up

The link timeout comes to mind first, which is unexpected as in my testing
there was always a card connected.
This card was detected fine with the comphy driver built in.

The "No available PHY" message likely leads to some bad error handling in the pci probe,
and should be investigated further.

2. When the system got stuck during boot, it was never in the middle of an SMC call to ATF.
I confirmed this by adding locking to the SMC function handler in ATF, and logging
activity to the serial console.

3. I was wrong in claiming that the Linux driver does not know how to configure the lane count:
the comphy driver does indeed pass the port width (as indicated by the num-lanes dt prop)
in the format that ATF expects.

However, ATF does nothing with the PCI lane configuration from the kernel driver.
Instead, any power-on or power-off call from the kernel driver via SMC
is tested for originating from Linux vs. U-Boot. Only when the source is U-Boot
does it perform any configuration ... :

https://github.com/ARM-software/arm-trusted-firmware/blob/master/drivers/marvell/comphy/phy-comphy-cp110.c#L1257

4. The "mode" argument (x2) to the SMC function for comphy lane power-on/setup
differs between Marvell U-Boot and Linux. I found this by dumping both from ATF itself.

In particular, the bits indicating the port number were invalid due to an overflow error
in SolidRun U-Boot (based on Marvell U-Boot).
The mode specified by the kernel driver, however, seemed correct in that regard.

Further, the bits indicating the serdes lane speed are 0 in the Linux driver, and all ones in vendor U-Boot.

As ATF ignores the PCI lane configuration when it originates from the kernel, this has had no impact so far.

5. The port number passed from U-Boot to ATF appears to have no effect.
Fixing it in vendor U-Boot has so far had no apparent impact.
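
For comparing the "mode" words from observation 4, a small host-side helper
along the following lines might be useful. This is only a sketch: the field
names, bit positions and widths in mode_fields[] are placeholders chosen for
illustration and are not taken from ATF, U-Boot or the Linux driver. The real
layout has to be read from those sources before the output means anything.
extract_field() and diff_mode_words() are hypothetical names as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * HYPOTHETICAL field layout of the comphy "mode" word (SMC argument x2).
 * Shifts and widths below are illustrative placeholders only.
 */
struct mode_field {
	const char *name;
	unsigned int shift;
	unsigned int width;
};

static const struct mode_field mode_fields[] = {
	{ "speed",  2, 6 },
	{ "port",   8, 4 },
	{ "mode",  12, 6 },
	{ "width", 18, 4 },
};

#define MODE_FIELD_COUNT (sizeof(mode_fields) / sizeof(mode_fields[0]))

/* Return the 'width'-bit field that starts at bit 'shift' of 'word'. */
static uint32_t extract_field(uint32_t word, unsigned int shift,
			      unsigned int width)
{
	assert(width > 0 && width < 32 && shift + width <= 32);
	return (word >> shift) & ((UINT32_C(1) << width) - 1);
}

/*
 * Print every field whose value differs between the two mode words and
 * return the number of differing fields.
 */
static unsigned int diff_mode_words(uint32_t from_uboot, uint32_t from_linux)
{
	unsigned int differing = 0;

	for (size_t i = 0; i < MODE_FIELD_COUNT; i++) {
		const struct mode_field *f = &mode_fields[i];
		uint32_t u = extract_field(from_uboot, f->shift, f->width);
		uint32_t l = extract_field(from_linux, f->shift, f->width);

		if (u != l) {
			printf("%-6s u-boot=0x%x linux=0x%x\n", f->name,
			       (unsigned int)u, (unsigned int)l);
			differing++;
		}
	}
	return differing;
}
```

Calling diff_mode_words() from a small main() with the two values dumped from
ATF would list only the fields that disagree, which should make differences
such as the port number and lane speed bits easier to spot than comparing raw
hex values.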


To conclude, I think my device-tree patch is not correct and should be replaced
once a better workaround or solution is discovered.
