[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7f10c6c94729dfa48e18f7f4b038403a@walle.cc>
Date: Sat, 09 Jan 2021 17:49:37 +0100
From: Michael Walle <michael@...le.cc>
To: Saravana Kannan <saravanak@...gle.com>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"Rafael J. Wysocki" <rafael@...nel.org>,
stable <stable@...r.kernel.org>,
Android Kernel Team <kernel-team@...roid.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3] driver core: Fix device link device name collision
Am 2021-01-08 18:22, schrieb Saravana Kannan:
> On Fri, Jan 8, 2021 at 12:16 AM Michael Walle <michael@...le.cc> wrote:
>>
>> Am 2021-01-08 02:24, schrieb Saravana Kannan:
>> > The device link device's name was of the form:
>> > <supplier-dev-name>--<consumer-dev-name>
>> >
>> > This can cause name collision as reported here [1] as device names are
>> > not globally unique. Since device names have to be unique within the
>> > bus/class, add the bus/class name as a prefix to the device names used
>> > to
>> > construct the device link device name.
>> >
>> > So the devuce link device's name will be of the form:
>> > <supplier-bus-name>:<supplier-dev-name>--<consumer-bus-name>:<consumer-dev-name>
>> >
>> > [1] -
>> > https://lore.kernel.org/lkml/20201229033440.32142-1-michael@walle.cc/
>> >
>> > Cc: stable@...r.kernel.org
>> > Fixes: 287905e68dd2 ("driver core: Expose device link details in
>> > sysfs")
>> > Reported-by: Michael Walle <michael@...le.cc>
>> > Signed-off-by: Saravana Kannan <saravanak@...gle.com>
>> > ---
>> [..]
>>
>> The changes are missing for the error path and
>> devlink_remove_symlinks(),
>> right?
>
> Removing symlinks doesn't need the name. Just needs the "handle". So
> we are good there.
I don't get it. What is the "handle"? Without the patch below
kernfs_remove_by_name() in sysfs_remove_link will return -ENOENT. With
the patch it will return 0.
And even if it would work, how is this even logical:
snprintf(buf, len, "consumer:%s:%s", dev_bus_name(con), dev_name(con));
ret = sysfs_create_link(&sup->kobj, &link->link_dev.kobj, buf);
if (ret)
goto err_con_dev;
snprintf(buf, len, "supplier:%s:%s", dev_bus_name(sup), dev_name(sup));
ret = sysfs_create_link(&con->kobj, &link->link_dev.kobj, buf);
if (ret)
goto err_sup_dev;
[..]
err_sup_dev:
snprintf(buf, len, "consumer:%s", dev_name(con));
sysfs_remove_link(&sup->kobj, buf);
You call sysfs_create_link("consumer:bus_name:dev_name") but the
corresponding rollback is sysfs_remove_link("consumer:dev_name"), that
is super confusing.
>> diff --git a/drivers/base/core.c b/drivers/base/core.c
>> index 4140a69dfe18..385e16d92874 100644
>> --- a/drivers/base/core.c
>> +++ b/drivers/base/core.c
>> @@ -485,7 +485,7 @@ static int devlink_add_symlinks(struct device
>> *dev,
>> goto out;
>>
>> err_sup_dev:
>> - snprintf(buf, len, "consumer:%s", dev_name(con));
>> + snprintf(buf, len, "consumer:%s:%s", dev_bus_name(con),
>> dev_name(con));
>> sysfs_remove_link(&sup->kobj, buf);
>> err_con_dev:
>> sysfs_remove_link(&link->link_dev.kobj, "consumer");
>> @@ -508,7 +508,9 @@ static void devlink_remove_symlinks(struct device
>> *dev,
>> sysfs_remove_link(&link->link_dev.kobj, "consumer");
>> sysfs_remove_link(&link->link_dev.kobj, "supplier");
>>
>> - len = max(strlen(dev_name(sup)), strlen(dev_name(con)));
>> + len = max(strlen(dev_bus_name(sup)) + strlen(dev_name(sup)),
>> + strlen(dev_bus_name(con)) + strlen(dev_name(con)));
>> + len += strlen(":");
>> len += strlen("supplier:") + 1;
>> buf = kzalloc(len, GFP_KERNEL);
>> if (!buf) {
>> @@ -516,9 +518,9 @@ static void devlink_remove_symlinks(struct device
>> *dev,
>> return;
>> }
>>
>> - snprintf(buf, len, "supplier:%s", dev_name(sup));
>> + snprintf(buf, len, "supplier:%s:%s", dev_bus_name(sup),
>> dev_name(sup));
>> sysfs_remove_link(&con->kobj, buf);
>> - snprintf(buf, len, "consumer:%s", dev_name(con));
>> + snprintf(buf, len, "consumer:%s:%s", dev_bus_name(sup),
>> dev_name(con));
btw this should be dev_bus_name(con).
>> sysfs_remove_link(&sup->kobj, buf);
>> kfree(buf);
>> }
>>
>> With these changes:
>>
>> Tested-by: Michael Walle <michael@...le.cc>
>
> Greg,
>
> I think it's good to pick up this version if you don't see any issues.
Why so fast?
>> This at least make the warning go away.
>
> Phew!
>
>> BUT, there is somesthing strange or at least I don't get it:
>>
>> # find /sys/bus/pci/devices/0000:00:00.0/ -name "consumer\:*"
>> # find /sys/bus/pci/devices/0000:00:00.1/ -name "consumer\:*"
>> /sys/bus/pci/devices/0000:00:00.1/consumer:mdio_bus:0000:00:00.1:04
>> /sys/bus/pci/devices/0000:00:00.1/consumer:mdio_bus:0000:00:00.1
>>
>> enetc0 (0000:00:00.0) has no consumers while enetc1 (0000:00:00.1)
>> has ones. Although both have PHYs connected. Here are the
>> corresonding device tree entries:
>>
>> enetc0:
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts?h=v5.11-rc2#n81
>>
>> enetc1:
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28-var4.dts?h=v5.11-rc2#n21
>>
>> Why is there a link between enetc1 and its PHY (and MDIO bus)
>> but not for enetc0?
>
> So a lot of subtle things could be going on here that make it look
> like it's not working correctly but it's actually working fine. Since
> fw_devlink=permissive is the default mode, all links that are created
> are SYNC_STATE_ONLY links. These links are deleted after their
> consumers probe. So if you really want to see all the "real" links
> persist, try booting with fw_devlink=on. You might have boot issues
> though -- I'm working on that separately [1]. Also, SYNC_STATE_ONLY
> links can be created between the parent of a consumer and the supplier
> (I won't get into the why here) depending on some ordering -- so that
> might be causing some spurious looking links, but they aren't.
>
> Another way to do what you are trying to do is to enable the logs in
> device_link_add() and look at them to see if all the links are created
> as you'd expect.
>
>> btw, here are all links:
>>
>> # ls /sys/class/devlink/
>> pci:0000:00:00.1--mdio_bus:0000:00:00.1
>> platform:5000000.iommu--pci:0000:00:00.0
>> platform:5000000.iommu--pci:0000:00:00.1
>> platform:5000000.iommu--pci:0000:00:00.2
>> platform:5000000.iommu--pci:0000:00:00.3
>> platform:5000000.iommu--pci:0000:00:00.5
>> platform:5000000.iommu--pci:0000:00:00.6
>> platform:5000000.iommu--pci:0001:00:00.0
>> platform:5000000.iommu--pci:0002:00:00.0
>> platform:5000000.iommu--platform:2140000.mmc
>> platform:5000000.iommu--platform:2150000.mmc
>> platform:5000000.iommu--platform:22c0000.dma-controller
>> platform:5000000.iommu--platform:3100000.usb
>> platform:5000000.iommu--platform:3110000.usb
>> platform:5000000.iommu--platform:3200000.sata
>> platform:5000000.iommu--platform:8000000.crypto
>> platform:5000000.iommu--platform:8380000.dma-controller
>> platform:5000000.iommu--platform:f080000.display
>> platform:f1f0000.clock-controller--platform:f080000.display
>> regulator:regulator.0--i2c:0-0050
>> regulator:regulator.0--i2c:1-0057
>> regulator:regulator.0--i2c:2-0050
>> regulator:regulator.0--platform:3200000.sata
>
> As you can see, most of the links that fw_devlink created are gone.
> Because all the consumers probed. Any remaining ones you see here are
> non-SYNC_STATE_ONLY links created by the driver/frameworks or cases
> where consumers haven't probed. My guess is that only the first one is
> of this criteria and it doesn't hurt anything here. Try booting with
> fw_devlink=on and check this list. That'll give you a better idea.
Thanks for explaining. I'll try that.
-michael
Powered by blists - more mailing lists