lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aC-AqDa8cjq2AYeM@linaro.org>
Date: Thu, 22 May 2025 21:53:12 +0200
From: Stephan Gerhold <stephan.gerhold@...aro.org>
To: Krzysztof Kozlowski <krzk@...nel.org>,
	Saravana Kannan <saravanak@...gle.com>
Cc: Rob Herring <robh@...nel.org>, Bjorn Andersson <andersson@...nel.org>,
	Jassi Brar <jassisinghbrar@...il.com>,
	Krzysztof Kozlowski <krzk+dt@...nel.org>,
	Conor Dooley <conor+dt@...nel.org>,
	Michael Turquette <mturquette@...libre.com>,
	Stephen Boyd <sboyd@...nel.org>, linux-arm-msm@...r.kernel.org,
	linux-kernel@...r.kernel.org, devicetree@...r.kernel.org,
	linux-clk@...r.kernel.org, Georgi Djakov <djakov@...nel.org>,
	Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
Subject: Re: [PATCH 1/4] dt-bindings: mailbox: qcom,apcs: Add separate node
 for clock-controller

+Saravana

On Wed, May 21, 2025 at 11:20:40AM +0200, Krzysztof Kozlowski wrote:
> On Wed, May 14, 2025 at 10:12:44PM GMT, Stephan Gerhold wrote:
> > > > > > The mailbox itself does not need any clocks and should probe early to
> 
> ... so probe it early.
> 
> > > > > > unblock the rest of the boot process. The "clocks" are only needed for the
> > > > > > separate clock controller. In Linux, these are already two separate drivers
> > > > > > that can probe independently.
> 
> They can probe later, no problem and DT does not stop that. Linux, not
> DT, controls the ways of probing of devices and their children.
> 
> > > > > > 
> > > > > 
> > > > > Why does this circular dependency need to be broken in the DeviceTree
> > > > > representation?
> > > > > 
> > > > > As you describe, the mailbox probes and register the mailbox controller
> > > > > and it registers the clock controller. The mailbox device isn't affected
> > > > > by the clock controller failing to find rpmcc...
> > > > > 
> > > > 
> > > > That's right, but the problem is that the probe() function of the
> > > > mailbox driver won't be called at all. The device tree *looks* like the
> > > > mailbox depends on the clock, so fw_devlink tries to defer probing until
> > > > the clock is probed (which won't ever happen, because the mailbox is
> > > > needed to make the clock available).
> > > > 
> > > > I'm not sure why fw_devlink doesn't detect this cycle and tries to probe
> > > > them anyway, but fact is that we need to split this up in order to avoid
> > > > warnings and have the supplies/consumers set up properly. Those device
> > > > links are created based on the device tree and not the drivers.
> > > 
> > > Does "post-init-providers" providers solve your problem?
> > > 
> > 
> > I would expect that it does, but it feels like the wrong solution to the
> > problem to me. The clock is not really a post-init provider: It's not
> > consumed at all by the mailbox and needed immediately to initialize the
> > clock controller. The real problem in my opinion is that we're
> > describing two essentially distinct devices/drivers in a single device
> > node, and there is no way to distinguish that.
> > 
> > By splitting up the two distinct components into separate device tree
> > nodes, the relation between the providers/consumers is clearly
> > described.
> 
> You can split devices without splitting the nodes. I do not see reason
> why the DT is the problem here.
> 

The Linux drivers for this particular mailbox/clock controller already
work exactly the way you propose. They are split into two devices that
can probe independently.

The problem is outside of the drivers, because fw_devlink in Linux
blocks probing until all resources specified in the device tree nodes
become available. fw_devlink has no knowledge that the mailbox described
by this peculiar device tree node does not actually need the clocks:

	apcs1_mbox: mailbox@...1000 {
		compatible = "qcom,msm8939-apcs-kpss-global", "syscon";
		reg = <0x0b011000 0x1000>;
		#mbox-cells = <1>;
		clocks = <&a53pll_c1>, <&gcc GPLL0_VOTE>, <&rpmcc RPM_SMD_XO_CLK_SRC>;
		clock-names = "pll", "aux", "ref";
		#clock-cells = <0>;
	};

Without device-specific quirks in fw_devlink, the fact that these clocks
are only used by an unrelated clock controller only becomes clear if we
split the device tree node like I propose in this series:

	apcs1_mbox: mailbox@...1000 {
		compatible = "qcom,msm8939-apcs-kpss-global", "syscon";
		reg = <0x0b011000 0x1000>;
		#mbox-cells = <1>;

		apcs1_clk: clock-controller {
			clocks = <&a53pll_c1>, <&gcc GPLL0_VOTE>, <&rpmcc RPM_SMD_XO_CLK_SRC>;
			clock-names = "pll", "aux", "ref";
			#clock-cells = <0>;
		};
	};

It is easy to say that the problem is in Linux (and not the DT), but
unless you are suggesting to remove fw_devlink from Linux, or to add
more device-specific quirks to the generic fw_devlink code, I'm only
aware of the following two options to make this work (both already
discussed in this email thread):

 1. post-init-providers (as suggested by Rob):

		post-init-providers = <&a53pll_c1>, <&gcc>, <&rpmcc>;

    To repeat my previous email: IMHO this is a crude workaround for
    this situation. The clock is not really a post-init provider: It's
    not consumed at all by the mailbox and needed immediately to
    initialize the clock controller.

    With this approach, there are no device links created for the
    clocks, so we don't get the proper probe/suspend ordering that
    fw_devlink normally provides.

 2. Split up device tree node (this patch series): With this approach,
    the mailbox can probe early and the clock controller child device
    gets the expected consumer/supplier device links to the clocks. IMHO
    this is the cleanest solution to go for.

@Saravana: Is there any other option that I missed? Or perhaps you have
any other suggestions how we should handle this?

To summarize the series and previous emails, the dependency cycle that
was in msm8939.dtsi before commit d92e9ea2f0f9 ("arm64: dts: qcom:
msm8939: revert use of APCS mbox for RPM") is:

  1. The clock controller inside &apcs1_mbox needs
     clocks = <&rpmcc RPM_SMD_XO_CLK_SRC>.
  2. &rpmcc is a child of remoteproc &rpm
  3. remoteproc &rpm needs mboxes = <&apcs1_mbox 8>;

This is not a real dependency cycle, the clocks in the mailbox@ node are
not needed for the mailbox. They are only used and needed for the clock
controller child device that makes use of the same device tree node.

At runtime this cycle currently results in none of the devices probing:

[   13.281637] platform remoteproc: deferred probe pending: qcom-rpm-proc: Failed to register smd-edge
[   13.296257] platform b011000.mailbox: deferred probe pending: platform: supplier b016000.clock not ready
[   13.308397] platform b016000.clock: deferred probe pending: platform: wait for supplier /remoteproc/smd-edge/rpm-requests/clock-controller

Thanks,
Stephan

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ