lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGETcx9dTokpRGwZrE2t0LfVOOFxGn83O60KOnt7D+D2-YMx+g@mail.gmail.com>
Date: Tue, 22 Oct 2024 09:07:50 -0700
From: Saravana Kannan <saravanak@...gle.com>
To: Tomi Valkeinen <tomi.valkeinen@...asonboard.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>, "Rafael J. Wysocki" <rafael@...nel.org>, 
	Linux Kernel List <linux-kernel@...r.kernel.org>, Aradhya Bhatia <aradhya.bhatia@...ux.dev>, 
	"dri-devel@...ts.freedesktop.org" <dri-devel@...ts.freedesktop.org>, Devarsh Thakkar <devarsht@...com>
Subject: Re: fw_devlinks preventing a panel driver from probing

On Tue, Oct 22, 2024 at 12:51 AM Tomi Valkeinen
<tomi.valkeinen@...asonboard.com> wrote:
>
> Hi,
>
> On 22/10/2024 02:29, Saravana Kannan wrote:
> > Hi Tomi,
> >
> > Sorry it took a while to get back.
> >
> > On Mon, Sep 16, 2024 at 4:52 AM Tomi Valkeinen
> > <tomi.valkeinen@...asonboard.com> wrote:
> >>
> >> Hi,
> >>
> >> We have an issue where two devices have dependencies to each other,
> >> according to drivers/base/core.c's fw_devlinks, and this prevents them
> >> from probing. I've been adding debugging to the core.c, but so far I
> >> don't quite grasp the issue, so I thought to ask. Maybe someone can
> >> instantly say that this just won't work...
> >>
> >> So, we have two devices, DSS (display subsystem) and an LVDS panel. The
> >> DSS normally outputs parallel video from its video ports (VP), but it
> >> has an integrated LVDS block (OLDI, Open LVDS Display Interface). The
> >> OLDI block takes input from DSS's parallel outputs. The OLDI is not
> >> modeled as a separate device (neither in the DT nor in the Linux device
> >> model) as it has no register space, and is controlled fully by the DSS.
> >>
> >> To support dual-link LVDS, the DSS has two OLDI instances. They both
> >> take their input from the same parallel video port, but each OLDI sends
> >> alternate lines forward. So for a dual-link setup the connections would
> >> be like this:
> >>
> >> +-----+-----+         +-------+         +----------+
> >> |     |     |         |       |         |          |
> >> |     | VP1 +----+--->| OLDI0 +-------->|          |
> >> |     |     |    |    |       |         |          |
> >> | DSS +-----+    |    +-------+         |  Panel   |
> >> |     |     |    |    |       |         |          |
> >> |     | VP2 |    +--->| OLDI1 +-------->|          |
> >> |     |     |         |       |         |          |
> >> +-----+-----+         +-------+         +----------+
> >>
> >> As the OLDI is not a separate device, it also does not have an
> >> independent device tree node, but rather it's inside DSS's node. The DSS
> >> parallel outputs are under a normal "ports" node, but OLDI ports are
> >> under "oldi-txes/ports" (see below for dts to clarify this).
> >>
> >> And I think (guess...) this is the root of the issue we're seeing, as it
> >> means the following, one or both of which might be the reason for this
> >> issue:
> >>
> >> - OLDI fwnodes don't have an associated struct device *. I think the
> >> reason is that the OLDI media graph ports are one level too deep in the
> >> hierarchy. So while the DSS ports are associated with the DSS device,
> >> OLDI ports are not.
> >
> > This is the root cause of the issue in some sense. fw_devlink doesn't
> > know that DSS depends on the VP. In the current DT, it only appears as
> > if the OLDI depends on VP. See further below for the fix.
> >
> >>
> >> - The VP ports inside the DSS point to OLDI ports, which are also inside
> >> DSS. So ports from a device point to ports in the same device (and back).
> >>
> >> If I understand the fw_devlink code correctly, in a normal case the
> >> links formed with media graphs are marked as a cycle
> >> (FWLINK_FLAG_CYCLE), and then ignored as far as probing goes.
> >>
> >> What we see here is that when using a single-link OLDI panel, the panel
> >> driver's probe never gets called, as it depends on the OLDI, and the
> >> link between the panel and the OLDI is not a cycle.
> >>
> >> The DSS driver probes, but the probe fails as it requires all the panel
> >> devices to have been probed (and thus registered to the DRM framework)
> >> before it can finish its setup.
> >>
> >> With dual-link, probing does happen and the drivers work. But I believe
> >> this is essentially an accident, in the sense that the first link
> >> between the panel and the OLDI still blocks the probing, but the second
> >> links allows the driver core to traverse the devlinks further, causing
> >> it to mark the links to the panel as FWLINK_FLAG_CYCLE (or maybe it only
> >> marks one of those links, and that's enough).
> >>
> >> If I set fw_devlink=off as a kernel parameter, the probing proceeds
> >> successfully in both single- and dual-link cases.
> >>
> >> Now, my questions is, is this a bug in the driver core, a bug in the DT
> >> bindings, or something in between (DT is fine-ish, but the structure is
> >> something that won't be supported by the driver core).
> >>
> >> And a follow-up question, regardless of the answer to the first one:
> >> which direction should I go from here =).
> >>
> >> The device tree data (simplified) for this is as follows, first the
> >> dual-link case, then the single-link case:
> >>
> >> /* Dual-link */
> >>
> >> dss: dss@...00000 {
> >>          compatible = "ti,am625-dss";
> >>
> >>          oldi-txes {
> >>                  oldi0: oldi@0 {
> >>                          oldi0_ports: ports {
> >>                                  port@0 {
> >>                                          oldi_0_in: endpoint {
> >>                                                  remote-endpoint = <&dpi0_out0>;
> >>                                          };
> >>                                  };
> >>
> >>                                  port@1 {
> >>                                          oldi_0_out: endpoint {
> >>                                                  remote-endpoint = <&lcd_in0>;
> >>                                          };
> >>                                  };
> >>                          };
> >>                  };
> >>
> >>                  oldi1: oldi@1 {
> >>                          oldi1_ports: ports {
> >>                                  port@0 {
> >>                                          oldi_1_in: endpoint {
> >>                                                  remote-endpoint = <&dpi0_out1>;
> >>                                          };
> >>                                  };
> >>
> >>                                  port@1 {
> >>                                          oldi_1_out: endpoint {
> >>                                                  remote-endpoint = <&lcd_in1>;
> >>                                          };
> >>                                  };
> >>                          };
> >>                  };
> >>          };
> >>
> >>          dss_ports: ports {
> >>                  port@0 {
> >>                          dpi0_out0: endpoint@0 {
> >>                                  remote-endpoint = <&oldi_0_in>;
> >>                          };
> >>                          dpi0_out1: endpoint@1 {
> >>                                  remote-endpoint = <&oldi_1_in>;
> >>                          };
> >>                  };
> >>          };
> >> };
> >>
> >> display {
> >>          compatible = "microtips,mf-101hiebcaf0", "panel-simple";
> >
> > In here, add this new property that I added some time back.
> >
> > post-init-providers = <&oldi-txes>;
>
> Thanks! This helps:
>
> post-init-providers = <&oldi0>;
>
> or for dual-link:
>
> post-init-providers = <&oldi0>, <&oldi1>;
>
> > This tells fw_devlink that VP doesn't depend on this node for
> > initialization/probing. This property is basically available to break
> > cycles in DT and mark one of the edges of the cycles as "not a real
> > init dependency".
> >
> > You should do the same for the single link case too.
>
> While this helps, it's not very nice... Every new DT overlay that uses
> OLDI display needs to have these.

Actually, taking a closer look at the DT and assuming I am visualizing
it correctly in my head, fw_devlink should notice the cycle between
oldi-txes and display and shouldn't block display from probing. Can
you check the cycle detection code and see where it's bailing out
early and not marking the fwnode link with the CYCLE flag?

__fw_devlink_relax_cycles() is where you want to look. There are a
bunch of debug log messages inside it and around where it's called
from.

Also, can you check debugfs/devices_deferred, or the
"wait_for_supplier" file under /sys/devices/..../display/ to make sure
it's actually stuck on oldi-txes? Just to make cure it's not some
other corner case that's triggered by oldi-txes?

> I'm still confused about why this is needed. OF graphs are _always_
> two-way links. Doesn't that mean that OF graphs never can be used for
> dependencies, as they go both ways?

Good question :) Yes, they'll always be ignored as cycles. But with
post-init-providers, it's actually better to add it so that cycles are
broken and better ordering is enforced. See my talk at LPC referee
talk about fw_devlink to see all the benefits you get from it. :)

> If so, shouldn't we just always
> ignore all OF graphs for dependency checking?

There are cases when two devices A and B have remote-endpoints between
them and ALSO have for example a gpio dependency between them. Where
the gpio is the "post-init-supplier". If we don't parse
remote-endpoint and mark the cycles, cases like these don't get to
probe.

-Saravana

>
>   Tomi
>
> >
> > Hope that helps. Let me know.
> >
> > Thanks,
> > Saravana
> >
> >>
> >>          ports {
> >>                  port@0 {
> >>                          lcd_in0: endpoint {
> >>                                  remote-endpoint = <&oldi_0_out>;
> >>                          };
> >>                  };
> >>
> >>                  port@1 {
> >>                          lcd_in1: endpoint {
> >>                                  remote-endpoint = <&oldi_1_out>;
> >>                          };
> >>                  };
> >>          };
> >> };
> >>
> >>
> >> /* Single-link */
> >>
> >> dss: dss@...00000 {
> >>          compatible = "ti,am625-dss";
> >>
> >>          oldi-txes {
> >>                  oldi0: oldi@0 {
> >>                          oldi0_ports: ports {
> >>                                  port@0 {
> >>                                          oldi_0_in: endpoint {
> >>                                                  remote-endpoint = <&dpi0_out0>;
> >>                                          };
> >>                                  };
> >>
> >>                                  port@1 {
> >>                                          oldi_0_out: endpoint {
> >>                                                  remote-endpoint = <&lcd_in0>;
> >>                                          };
> >>                                  };
> >>                          };
> >>                  };
> >>          };
> >>
> >>          dss_ports: ports {
> >>                  port@0 {
> >>                          dpi0_out0: endpoint@0 {
> >>                                  remote-endpoint = <&oldi_0_in>;
> >>                          };
> >>                  };
> >>          };
> >> };
> >>
> >> display {
> >>          compatible = "microtips,mf-101hiebcaf0", "panel-simple";
> >>
> >>          ports {
> >>                  port@0 {
> >>                          lcd_in0: endpoint {
> >>                                  remote-endpoint = <&oldi_0_out>;
> >>                          };
> >>                  };
> >>          };
> >> };
> >>
> >>    Tomi
> >>
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ