lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 14 Jan 2021 08:36:32 +0100
From:   Marek Szyprowski <m.szyprowski@...sung.com>
To:     Saravana Kannan <saravanak@...gle.com>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Android Kernel Team <kernel-team@...roid.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Jisheng Zhang <Jisheng.Zhang@...aptics.com>,
        Kevin Hilman <khilman@...libre.com>,
        John Stultz <john.stultz@...aro.org>,
        Nicolas Saenz Julienne <nsaenzjulienne@...e.de>,
        Marc Zyngier <maz@...nel.org>,
        Linux Samsung SOC <linux-samsung-soc@...r.kernel.org>,
        Krzysztof Kozlowski <krzk@...nel.org>,
        Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>
Subject: Re: [PATCH v1 5/5] driver core: Set fw_devlink=on by default

Hi Saravana,

On 13.01.2021 20:23, Saravana Kannan wrote:
> On Tue, Jan 12, 2021 at 11:04 PM Marek Szyprowski
> <m.szyprowski@...sung.com> wrote:
>> On 12.01.2021 21:51, Saravana Kannan wrote:
>>> On Mon, Jan 11, 2021 at 11:11 PM Marek Szyprowski
>>> <m.szyprowski@...sung.com> wrote:
>>>> On 11.01.2021 22:47, Saravana Kannan wrote:
>>>>> On Mon, Jan 11, 2021 at 6:18 AM Marek Szyprowski
>>>>> <m.szyprowski@...sung.com> wrote:
>>>>>> On 11.01.2021 12:12, Marek Szyprowski wrote:
>>>>>>> On 18.12.2020 04:17, Saravana Kannan wrote:
>>>>>>>> Cyclic dependencies in some firmware was one of the last remaining
>>>>>>>> reasons fw_devlink=on couldn't be set by default. Now that cyclic
>>>>>>>> dependencies don't block probing, set fw_devlink=on by default.
>>>>>>>>
>>>>>>>> Setting fw_devlink=on by default brings a bunch of benefits (currently,
>>>>>>>> only for systems with device tree firmware):
>>>>>>>> * Significantly cuts down deferred probes.
>>>>>>>> * Device probe is effectively attempted in graph order.
>>>>>>>> * Makes it much easier to load drivers as modules without having to
>>>>>>>>       worry about functional dependencies between modules (depmod is still
>>>>>>>>       needed for symbol dependencies).
>>>>>>>>
>>>>>>>> If this patch prevents some devices from probing, it's very likely due
>>>>>>>> to the system having one or more device drivers that "probe"/set up a
>>>>>>>> device (DT node with compatible property) without creating a struct
>>>>>>>> device for it.  If we hit such cases, the device drivers need to be
>>>>>>>> fixed so that they populate struct devices and probe them like normal
>>>>>>>> device drivers so that the driver core is aware of the devices and their
>>>>>>>> status. See [1] for an example of such a case.
>>>>>>>>
>>>>>>>> [1] -
>>>>>>>> https://protect2.fireeye.com/v1/url?k=68f5d8ba-376ee1f5-68f453f5-0cc47a30d446-324e64700545ab93&q=1&e=fb455b9e-c8c7-40d0-8e3c-d9d3713d519b&u=https%3A%2F%2Flore.kernel.org%2Flkml%2FCAGETcx9PiX%3D%3DmLxB9PO8Myyk6u2vhPVwTMsA5NkD-ywH5xhusw%40mail.gmail.com%2F
>>>>>>>> Signed-off-by: Saravana Kannan <saravanak@...gle.com>
>>>>>>> This patch landed recently in linux next-20210111 as commit
>>>>>>> e590474768f1 ("driver core: Set fw_devlink=on by default"). Sadly it
>>>>>>> breaks Exynos IOMMU operation, what causes lots of devices being
>>>>>>> deferred and not probed at all. I've briefly checked and noticed that
>>>>>>> exynos_sysmmu_probe() is never called after this patch. This is really
>>>>>>> strange for me, as the SYSMMU controllers on Exynos platform are
>>>>>>> regular platform devices registered by the OF code. The driver code is
>>>>>>> here: drivers/iommu/exynos-iommu.c, example dts:
>>>>>>> arch/arm/boot/dts/exynos3250.dtsi (compatible = "samsung,exynos-sysmmu").
>>>>>> Okay, I found the source of this problem. It is caused by Exynos power
>>>>>> domain driver, which is not platform driver yet. I will post a patch,
>>>>>> which converts it to the platform driver.
>>>>> Thanks Marek! Hopefully the debug logs I added were sufficient to
>>>>> figure out the reason.
>>>> Frankly, it took me a while to figure out that device core waits for the
>>>> power domain devices. Maybe it would be possible to add some more debug
>>>> messages or hints? Like the reason of the deferred probe in
>>>> /sys/kernel/debug/devices_deferred ?
>>> There's already a /sys/devices/.../<device>/waiting_for_supplier file
>>> that tells you if the device is waiting for a supplier device to be
>>> added. That file goes away once the device probes. If the file has 1,
>>> then it's waiting for the supplier device to be added (like your
>>> case). If it's 0, then the device is just waiting on one of the
>>> existing suppliers to probe. You can find the existing suppliers
>>> through /sys/devices/.../<device>/supplier:*/supplier. Also, flip
>>> these dev_dbg() to dev_info() if you need more details about deferred
>>> probing.
>> Frankly speaking I doubt that anyone will find those. Even experienced
>> developer might need some time to figure it out.
>>
>> I expect that such information will be at least in the mentioned
>> /sys/kernel/debug/devices_deferred file. We already have infrastructure
>> for putting the deferred probe reason there, see dev_err_probe()
>> function. Even such a simple change makes the debugging this issue much
>> easier:
>>
>> diff --git a/drivers/base/core.c b/drivers/base/core.c
>> index cd8e518fadd6..ceb5aed5a84c 100644
>> --- a/drivers/base/core.c
>> +++ b/drivers/base/core.c
>> @@ -937,12 +937,13 @@ int device_links_check_suppliers(struct device *dev)
>>           mutex_lock(&fwnode_link_lock);
>>           if (dev->fwnode && !list_empty(&dev->fwnode->suppliers) &&
>>               !fw_devlink_is_permissive()) {
>> -               dev_dbg(dev, "probe deferral - wait for supplier %pfwP\n",
>> +               ret = dev_err_probe(dev, -EPROBE_DEFER,
>> +                       "probe deferral - wait for supplier %pfwP\n",
>> list_first_entry(&dev->fwnode->suppliers,
>>                           struct fwnode_link,
>>                           c_hook)->supplier);
>>                   mutex_unlock(&fwnode_link_lock);
>> -               return -EPROBE_DEFER;
>> +               return ret;
>>           }
>>           mutex_unlock(&fwnode_link_lock);
>>
>> @@ -955,9 +956,9 @@ int device_links_check_suppliers(struct device *dev)
>>                   if (link->status != DL_STATE_AVAILABLE &&
>>                       !(link->flags & DL_FLAG_SYNC_STATE_ONLY)) {
>>                           device_links_missing_supplier(dev);
>> -                       dev_dbg(dev, "probe deferral - supplier %s not
>> ready\n",
>> +                       ret = dev_err_probe(dev, -EPROBE_DEFER,
>> +                               "probe deferral - supplier %s not ready\n",
>>                                   dev_name(link->supplier));
>> -                       ret = -EPROBE_DEFER;
>>                           break;
>>                   }
>>                   WRITE_ONCE(link->status, DL_STATE_CONSUMER_PROBE);
>>
>>
>> After such change:
>>
>> # cat /sys/kernet/debug/devices_deferred
> Sweet! I wasn't aware of this file at all.
>
> However, on a side note, one of my TODO items is to not add devices to
> the deferred probe list if they'll never probe yet (due to suppliers
> not having probed). On a board I tested on, it cut down really_probe()
> calls by 75%! So the probe attempt itself effectively happens in graph
> order (which I think is pretty cool). So that's going to conflict with
> this file. I'll have to see what to do about that.
>
> Thanks for this pointer. Let me sit on this for 2 weeks and see how I
> can incorporate your suggestion while allowing for the above. And then
> I'll send out a patch. Does that work?

Fine for me.

Even if you want to change the core not to probe devices that miss their 
suppliers (what's good imho), the 'devices_deferred' file might still 
contain all of them. For user it is just a list of devices that are not 
yet available in the system with the optional reasons for that.

Best regards

-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland

Powered by blists - more mailing lists