lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 28 Sep 2021 03:56:15 +0300
From:   Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
To:     Saravana Kannan <saravanak@...gle.com>
Cc:     Rob Herring <robh+dt@...nel.org>,
        Frank Rowand <frowand.list@...il.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        devicetree@...r.kernel.org, linux-kernel@...r.kernel.org,
        Stephen Boyd <sboyd@...nel.org>,
        linux-arm-msm <linux-arm-msm@...r.kernel.org>
Subject: Re: [PATCH] Revert "of: property: fw_devlink: Add support for
 remote-endpoint"

[Adding Stephen and linux-arm-msm to the CC list, missed on the patch Cc 
list]

On 28/09/2021 00:58, Saravana Kannan wrote:
> On Mon, Sep 27, 2021 at 1:48 PM Dmitry Baryshkov
> <dmitry.baryshkov@...aro.org> wrote:
>>
>> Since the commit f7514a663016 ("of: property: fw_devlink: Add support
>> for remote-endpoint") Linux kernel started parsing and adding devlinks
>> for the remote-endpoint properties. However this brings more harm than
>> good.
>>
>> For all the remote-endpoints in the graph two links are created. Thus
>> each and every remote-endpoint ends up in the cyclic graph (instead of
>> the original intent of catching a cycle of graph + non-graph link):
> 
> Yes, I'm well aware of this. I even called this out in the commit
> text. This creating of cycles and then catching and relaxing it is
> intentional.
> https://lore.kernel.org/all/20210330185056.1022008-1-saravanak@google.com/

What would be the reason two always create a cycle which gives no 
additional information? Maybe I'm just misunderstanding this piece of code.

Regarding your commit message. Even if there is a non-remote-endpoint 
dependency, it will be hidden by the remote-endpoint cycle.

And another consequence of remote-endpoint loops.

Consider this part part of dmesg. One warning is correct (real cyclic 
dependency). Others are remote-endpoint spam. Can you spot, which ones?

[    7.032225] platform 1d87000.phy: Fixing up cyclic dependency with 
1d84000.ufshc
[   21.760326] platform c440000.spmi:pmic@2:typec@...0: Fixing up cyclic 
dependency with c440000.spmi:pmic@2:pmic-tcpm
[   21.944849] platform c440000.spmi:pmic@2:pdphy@...0: Fixing up cyclic 
dependency with c440000.spmi:pmic@2:pmic-tcpm
[   23.541968] platform a600000.usb: Fixing up cyclic dependency with 
c440000.spmi:pmic@2:pmic-tcpm
[   30.354170] i2c 5-002b: Fixing up cyclic dependency with hdmi-out


>>
>> [    0.381057] OF: remote-endpoint linking /soc@...eniqup@...000/i2c@...000/hdmi-bridge@2b to /soc@...dss@...0000/dsi@...4000/ports/port@...ndpoint
>> [    0.394421] OF: remote-endpoint linking /soc@...eniqup@...000/i2c@...000/hdmi-bridge@2b to /hdmi-out/port/endpoint
>> [    0.407007] OF: remote-endpoint linking /soc@...hy@...9000 to /soc@...pmi@...0000/pmic@...mic-tcpm/connector/ports/port@...ndpoint@0
>> [    0.419648] OF: remote-endpoint linking /soc@...sb@...8800/usb@...0000 to /soc@...pmi@...0000/pmic@...mic-tcpm/ports/port@...ndpoint@0
>> [    0.432578] OF: remote-endpoint linking /soc@...ci@...f000/i2c-bus@...am1@c0 to /soc@...amss@...a000/ports/port@...ndpoint
>> [    0.444450] OF: remote-endpoint linking /soc@...amss@...a000 to /soc@...ci@...f000/i2c-bus@...am1@...port/endpoint
>> [    0.455292] OF: remote-endpoint linking /soc@...dss@...0000/mdp@...1000 to /soc@...dss@...0000/dsi@...4000/ports/port@...ndpoint
>> [    0.467210] OF: remote-endpoint linking /soc@...dss@...0000/mdp@...1000 to /soc@...dss@...0000/dsi@...6000/ports/port@...ndpoint
>> [    0.479239] OF: remote-endpoint linking /soc@...dss@...0000/dsi@...4000 to /soc@...dss@...0000/mdp@...1000/ports/port@...ndpoint
>> [    0.491147] OF: remote-endpoint linking /soc@...dss@...0000/dsi@...4000 to /soc@...eniqup@...000/i2c@...000/hdmi-bridge@...ports/port@...ndpoint
>> [    0.504979] OF: remote-endpoint linking /soc@...pmi@...0000/pmic@...ypec@...0 to /soc@...pmi@...0000/pmic@...mic-tcpm/ports/port@...ndpoint
>> [    0.517958] OF: remote-endpoint linking /soc@...pmi@...0000/pmic@...dphy@...0 to /soc@...pmi@...0000/pmic@...mic-tcpm/ports/port@...ndpoint
>> [    0.565326] OF: remote-endpoint linking /hdmi-out to /soc@...eniqup@...000/i2c@...000/hdmi-bridge@...ports/port@...ndpoint
>>
>> Under some conditions the device can become it's own supplier,
>> preventing this device to be probed at all:
> 
> I'm not sure this analysis is correct -- this shouldn't be happening.
> If you go to the device link folder and cat "sync_state_only", I
> expect it to be "1" in this case. Can you confirm that?

It is "1".

> Which means it won't block probing. Yes, the link itself is useless
> and it'll get auto deleted once mdss probes and it's easy to not
> create it in the first place. But this is definitely not your issue.
> 
>> $ ls -l /sys/bus/platform/devices/ae00000.mdss/
>> lrwxrwxrwx    1 root     root             0 Aug  4 15:13 consumer:platform:ae00000.mdss -> ../../../virtual/devlink/platform:ae00000.mdss--platform:ae00000.mdss
>>
>> I think that until of_link can be tought to handle bi-directional links
>> on its own, we should not parse remote-endpoint properties. Thus the
>> aforementioned commit should be reverted.
> 
> Nak. remote-endpoint parsing is working as intended. I don't think the
> analysis is correct.
> 
> Can you please enable the logs in all these functions and attach the
> log so we can see why it's not probing mdss?
> device_link_add
> device_links_check_suppliers
> func fw_devlink_relax_link
> fw_devlink_create_devlink

After doing the analysis, I can confirm that I was too quick regarding 
the mdss links preventing it from being probed. Sorry about that.

It all went up to the DP phy having a link with usb-c-connector. I was 
running the kernel 5.15-rc1, so your tcpm fix is already present. 
However my colleague has disabled the tcpm device (which I did not 
notice). So the driver did not call fw_devlink_purge_absent_suppliers().
The devlink still exists:

[   53.426446] platform 88e9000.phy: probe deferral - wait for supplier 
connector

However it is not present in the sysfs:

root@...m-armv8a:~# ls -l /sys/bus/platform/devices/88e9000.phy/
lrwxrwxrwx    1 root     root             0 Aug  4 15:13 
consumer:platform:a600000.usb -> 
../../../virtual/devlink/platform:88e9000.phy--platform:a600000.usb
lrwxrwxrwx    1 root     root             0 Aug  4 15:13 
consumer:platform:af00000.clock-controller -> 
../../../virtual/devlink/platform:88e9000.phy--platform:af00000.clock-controller
-rw-r--r--    1 root     root          4096 Aug  4 15:13 driver_override
-r--r--r--    1 root     root          4096 Aug  4 15:13 modalias
lrwxrwxrwx    1 root     root             0 Aug  4 15:13 of_node -> 
../../../../firmware/devicetree/base/soc@...hy@...9000
drwxr-xr-x    2 root     root             0 Aug  4 15:13 power
lrwxrwxrwx    1 root     root             0 Aug  4 15:10 subsystem -> 
../../../../bus/platform
lrwxrwxrwx    1 root     root             0 Aug  4 15:13 
supplier:platform:100000.clock-controller -> 
../../../virtual/devlink/platform:100000.clock-controller--platform:88e9000.phy
lrwxrwxrwx    1 root     root             0 Aug  4 15:13 
supplier:platform:18200000.rsc:clock-controller -> 
../../../virtual/devlink/platform:18200000.rsc:clock-controller--platform:88e9000.phy
lrwxrwxrwx    1 root     root             0 Aug  4 15:13 
supplier:platform:18200000.rsc:pm8150-rpmh-regulators -> 
../../../virtual/devlink/platform:18200000.rsc:pm8150-rpmh-regulators--platform:88e9000.phy
-rw-r--r--    1 root     root          4096 Aug  4 15:10 uevent
-r--r--r--    1 root     root          4096 Aug  4 15:13 
waiting_for_supplier

Thus it is not possible to spot this device link without 
CONFIG_DEBUG_DRIVER=y (or any similar debugging technique).

If I re-enabled tcpm device or if I reverted remote-endpoint parsing, DP 
PHY probing would go fine. The DP PHY does not really depend on the 
connector (or TCPM) being present in the system. The driver will 
continue working w/o it. However it does not have a change to declare that.

Furthermore I went back to the original case that caused you to add 
remote-endpoint support. The DSI-eDP bridge and eDP panel using the GPIO 
provided by that bridge. I think the proper fix for the original problem 
was implemented by the commit bf73537f411b ("drm/bridge: ti-sn65dsi86: 
Break GPIO and MIPI-to-eDP bridge into sub-drivers"). It split the 
DSI-eDP bridge driver into functional parts (devices), so that GPIO part 
and eDP parts are independent, thus breaking this cyclic dependency in a 
functional way. The remote-endpoint parsing is no longer necessary in 
this case (Stephen, please correct me if I'm wrong).


I still think that remote endpoint parsing does more harm and noise than 
good and thus should be reverted.

-- 
With best wishes
Dmitry

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ