lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7f10c6c94729dfa48e18f7f4b038403a@walle.cc>
Date:   Sat, 09 Jan 2021 17:49:37 +0100
From:   Michael Walle <michael@...le.cc>
To:     Saravana Kannan <saravanak@...gle.com>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        stable <stable@...r.kernel.org>,
        Android Kernel Team <kernel-team@...roid.com>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3] driver core: Fix device link device name collision

Am 2021-01-08 18:22, schrieb Saravana Kannan:
> On Fri, Jan 8, 2021 at 12:16 AM Michael Walle <michael@...le.cc> wrote:
>> 
>> Am 2021-01-08 02:24, schrieb Saravana Kannan:
>> > The device link device's name was of the form:
>> > <supplier-dev-name>--<consumer-dev-name>
>> >
>> > This can cause name collision as reported here [1] as device names are
>> > not globally unique. Since device names have to be unique within the
>> > bus/class, add the bus/class name as a prefix to the device names used
>> > to
>> > construct the device link device name.
>> >
>> > So the devuce link device's name will be of the form:
>> > <supplier-bus-name>:<supplier-dev-name>--<consumer-bus-name>:<consumer-dev-name>
>> >
>> > [1] -
>> > https://lore.kernel.org/lkml/20201229033440.32142-1-michael@walle.cc/
>> >
>> > Cc: stable@...r.kernel.org
>> > Fixes: 287905e68dd2 ("driver core: Expose device link details in
>> > sysfs")
>> > Reported-by: Michael Walle <michael@...le.cc>
>> > Signed-off-by: Saravana Kannan <saravanak@...gle.com>
>> > ---
>> [..]
>> 
>> The changes are missing for the error path and
>> devlink_remove_symlinks(),
>> right?
> 
> Removing symlinks doesn't need the name. Just needs the "handle". So
> we are good there.

I don't get it. What is the "handle"? Without the patch below
kernfs_remove_by_name() in sysfs_remove_link will return -ENOENT. With
the patch it will return 0.

And even if it would work, how is this even logical:

	snprintf(buf, len, "consumer:%s:%s", dev_bus_name(con), dev_name(con));
	ret = sysfs_create_link(&sup->kobj, &link->link_dev.kobj, buf);
	if (ret)
		goto err_con_dev;
	snprintf(buf, len, "supplier:%s:%s", dev_bus_name(sup), dev_name(sup));
	ret = sysfs_create_link(&con->kobj, &link->link_dev.kobj, buf);
	if (ret)
		goto err_sup_dev;
[..]
err_sup_dev:
	snprintf(buf, len, "consumer:%s", dev_name(con));
	sysfs_remove_link(&sup->kobj, buf);

You call sysfs_create_link("consumer:bus_name:dev_name") but the
corresponding rollback is sysfs_remove_link("consumer:dev_name"), that
is super confusing.

>> diff --git a/drivers/base/core.c b/drivers/base/core.c
>> index 4140a69dfe18..385e16d92874 100644
>> --- a/drivers/base/core.c
>> +++ b/drivers/base/core.c
>> @@ -485,7 +485,7 @@ static int devlink_add_symlinks(struct device 
>> *dev,
>>         goto out;
>> 
>>   err_sup_dev:
>> -       snprintf(buf, len, "consumer:%s", dev_name(con));
>> +       snprintf(buf, len, "consumer:%s:%s", dev_bus_name(con),
>> dev_name(con));
>>         sysfs_remove_link(&sup->kobj, buf);
>>   err_con_dev:
>>         sysfs_remove_link(&link->link_dev.kobj, "consumer");
>> @@ -508,7 +508,9 @@ static void devlink_remove_symlinks(struct device
>> *dev,
>>         sysfs_remove_link(&link->link_dev.kobj, "consumer");
>>         sysfs_remove_link(&link->link_dev.kobj, "supplier");
>> 
>> -       len = max(strlen(dev_name(sup)), strlen(dev_name(con)));
>> +       len = max(strlen(dev_bus_name(sup)) + strlen(dev_name(sup)),
>> +                 strlen(dev_bus_name(con)) + strlen(dev_name(con)));
>> +       len += strlen(":");
>>         len += strlen("supplier:") + 1;
>>         buf = kzalloc(len, GFP_KERNEL);
>>         if (!buf) {
>> @@ -516,9 +518,9 @@ static void devlink_remove_symlinks(struct device
>> *dev,
>>                 return;
>>         }
>> 
>> -       snprintf(buf, len, "supplier:%s", dev_name(sup));
>> +       snprintf(buf, len, "supplier:%s:%s", dev_bus_name(sup),
>> dev_name(sup));
>>         sysfs_remove_link(&con->kobj, buf);
>> -       snprintf(buf, len, "consumer:%s", dev_name(con));
>> +       snprintf(buf, len, "consumer:%s:%s", dev_bus_name(sup),
>> dev_name(con));

btw this should be dev_bus_name(con).

>>         sysfs_remove_link(&sup->kobj, buf);
>>         kfree(buf);
>>   }
>> 
>> With these changes:
>> 
>> Tested-by: Michael Walle <michael@...le.cc>
> 
> Greg,
> 
> I think it's good to pick up this version if you don't see any issues.

Why so fast?


>> This at least make the warning go away.
> 
> Phew!
> 
>> BUT, there is somesthing strange or at least I don't get it:
>> 
>> # find /sys/bus/pci/devices/0000:00:00.0/ -name "consumer\:*"
>> # find /sys/bus/pci/devices/0000:00:00.1/ -name "consumer\:*"
>> /sys/bus/pci/devices/0000:00:00.1/consumer:mdio_bus:0000:00:00.1:04
>> /sys/bus/pci/devices/0000:00:00.1/consumer:mdio_bus:0000:00:00.1
>> 
>> enetc0 (0000:00:00.0) has no consumers while enetc1 (0000:00:00.1)
>> has ones. Although both have PHYs connected. Here are the
>> corresonding device tree entries:
>> 
>> enetc0:
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28.dts?h=v5.11-rc2#n81
>> 
>> enetc1:
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm64/boot/dts/freescale/fsl-ls1028a-kontron-sl28-var4.dts?h=v5.11-rc2#n21
>> 
>> Why is there a link between enetc1 and its PHY (and MDIO bus)
>> but not for enetc0?
> 
> So a lot of subtle things could be going on here that make it look
> like it's not working correctly but it's actually working fine. Since
> fw_devlink=permissive is the default mode, all links that are created
> are SYNC_STATE_ONLY links. These links are deleted after their
> consumers probe. So if you really want to see all the "real" links
> persist, try booting with fw_devlink=on. You might have boot issues
> though -- I'm working on that separately [1]. Also, SYNC_STATE_ONLY
> links can be created between the parent of a consumer and the supplier
> (I won't get into the why here) depending on some ordering -- so that
> might be causing some spurious looking links, but they aren't.
> 
> Another way to do what you are trying to do is to enable the logs in
> device_link_add() and look at them to see if all the links are created
> as you'd expect.
> 
>> btw, here are all links:
>> 
>> # ls /sys/class/devlink/
>> pci:0000:00:00.1--mdio_bus:0000:00:00.1
>> platform:5000000.iommu--pci:0000:00:00.0
>> platform:5000000.iommu--pci:0000:00:00.1
>> platform:5000000.iommu--pci:0000:00:00.2
>> platform:5000000.iommu--pci:0000:00:00.3
>> platform:5000000.iommu--pci:0000:00:00.5
>> platform:5000000.iommu--pci:0000:00:00.6
>> platform:5000000.iommu--pci:0001:00:00.0
>> platform:5000000.iommu--pci:0002:00:00.0
>> platform:5000000.iommu--platform:2140000.mmc
>> platform:5000000.iommu--platform:2150000.mmc
>> platform:5000000.iommu--platform:22c0000.dma-controller
>> platform:5000000.iommu--platform:3100000.usb
>> platform:5000000.iommu--platform:3110000.usb
>> platform:5000000.iommu--platform:3200000.sata
>> platform:5000000.iommu--platform:8000000.crypto
>> platform:5000000.iommu--platform:8380000.dma-controller
>> platform:5000000.iommu--platform:f080000.display
>> platform:f1f0000.clock-controller--platform:f080000.display
>> regulator:regulator.0--i2c:0-0050
>> regulator:regulator.0--i2c:1-0057
>> regulator:regulator.0--i2c:2-0050
>> regulator:regulator.0--platform:3200000.sata
> 
> As you can see, most of the links that fw_devlink created are gone.
> Because all the consumers probed. Any remaining ones you see here are
> non-SYNC_STATE_ONLY links created by the driver/frameworks or cases
> where consumers haven't probed. My guess is that only the first one is
> of this criteria and it doesn't hurt anything here. Try booting with
> fw_devlink=on and check this list. That'll give you a better idea.

Thanks for explaining. I'll try that.

-michael

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ