[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f68f8bcc-7543-5957-0e17-2cc797898ec0@samsung.com>
Date: Wed, 13 Jul 2022 08:52:57 +0200
From: Marek Szyprowski <m.szyprowski@...sung.com>
To: Saravana Kannan <saravanak@...gle.com>
Cc: Russell King <linux@...linux.org.uk>,
Philipp Zabel <p.zabel@...gutronix.de>,
Rob Herring <robh@...nel.org>,
Ulf Hansson <ulf.hansson@...aro.org>,
John Stultz <john.stultz@...aro.org>,
Linus Walleij <linus.walleij@...aro.org>,
Sudeep Holla <sudeep.holla@....com>,
Nicolas Saenz Julienne <nsaenzjulienne@...e.de>,
Geert Uytterhoeven <geert+renesas@...der.be>,
Kefeng Wang <wangkefeng.wang@...wei.com>,
kernel-team@...roid.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4] amba: Remove deferred device addition
Hi Saravana,
On 12.07.2022 21:38, Saravana Kannan wrote:
> On Tue, Jul 12, 2022 at 5:34 AM Marek Szyprowski
> <m.szyprowski@...sung.com> wrote:
>> On 12.07.2022 14:25, Marek Szyprowski wrote:
>>> On 05.07.2022 10:39, Saravana Kannan wrote:
>>>> The uevents generated for an amba device need PID and CID information
>>>> that's available only when the amba device is powered on, clocked and
>>>> out of reset. So, if those resources aren't available, the information
>>>> can't be read to generate the uevents. To workaround this requirement,
>>>> if the resources weren't available, the device addition was deferred and
>>>> retried periodically.
>>>>
>>>> However, this deferred addition retry isn't based on resources becoming
>>>> available. Instead, it's retried every 5 seconds and causes arbitrary
>>>> probe delays for amba devices and their consumers.
>>>>
>>>> Also, maintaining a separate deferred-probe like mechanism is
>>>> maintenance headache.
>>>>
>>>> With this commit, instead of deferring the device addition, we simply
>>>> defer the generation of uevents for the device and probing of the device
>>>> (because drivers needs PID and CID to match) until the PID and CID
>>>> information can be read. This allows us to delete all the amba specific
>>>> deferring code and also avoid the arbitrary probing delays.
>>>>
>>>> Cc: Rob Herring <robh@...nel.org>
>>>> Cc: Ulf Hansson <ulf.hansson@...aro.org>
>>>> Cc: John Stultz <john.stultz@...aro.org>
>>>> Cc: Saravana Kannan <saravanak@...gle.com>
>>>> Cc: Linus Walleij <linus.walleij@...aro.org>
>>>> Cc: Sudeep Holla <sudeep.holla@....com>
>>>> Cc: Nicolas Saenz Julienne <nsaenzjulienne@...e.de>
>>>> Cc: Geert Uytterhoeven <geert+renesas@...der.be>
>>>> Cc: Marek Szyprowski <m.szyprowski@...sung.com>
>>>> Cc: Kefeng Wang <wangkefeng.wang@...wei.com>
>>>> Cc: Russell King <linux@...linux.org.uk>
>>>> Signed-off-by: Saravana Kannan <saravanak@...gle.com>
>>>> ---
>>>>
>>>> v1 -> v2:
>>>> - Dropped RFC tag
>>>> - Complete rewrite to not use stub devices.
>>>>
>>>> v2 -> v3:
>>>> - Flipped the if() condition for hard-coded periphids.
>>>> - Added a stub driver to handle the case where all amba drivers are
>>>> modules loaded by uevents.
>>>> - Cc Marek after I realized I forgot to add him.
>>>>
>>>> v3 -> v4:
>>>> - Finally figured out and fixed the issue reported by Kefeng (bus match
>>>> can't return an error other than -EPROBE_DEFER).
>>>> - I tested the patch on "V2P-CA15" on qemu
>>>> - Marek tested v3, but that was so long ago and the rebase wasn't clean,
>>>> so I didn't include the tested-by.
>>>>
>>>> Marek/Kefeng,
>>>>
>>>> Mind giving a Tested-by?
>>>
>>> Yes, it looks that it still works fine.
>>>
>>> I've tested it by changing the Exynos power domain driver to
>>> initialize from late_initcall. This in turn lead me to a bug in
>>> generic pm_domains code in __genpd_dev_pm_attach(), which returns -2
>>> if the pm domain driver is not yet registered. After fixing that, I've
>>> successfully observed the deferred probe of PL330 driver on Exynos
>>> 4210 based boards both with this patch and without (with the old timer
>>> based code).
> Thanks for testing it again Marek! I was hoping you'll hit the crash
> that Sudeep was hitting and it would give me some more clues.
>
> Sudeep,
>
> This makes me think the issue you are seeing is related to your
> hardware drivers. Can you look into those please? I'm leaning towards
> merging this amba clean up and adding delays (say 1ms) to your
> clock/power domain drivers to avoid the crash you are seeing. And then
> you can figure out the actual delays needed and update it.
>
>> While preparing a fix for the above issue in genpd I found that it has
>> been introduced by your commit 5a46079a9645 ("PM: domains: Delete usage
>> of driver_deferred_probe_check_state()"). I didn't analyze it enough,
>> but it looks that something is missing there if we are trying to probe
>> amba device. I assume that returning -EPROBE_DEFER unconditionally there
>> is also not a valid solution?
> Yeah, the unconditionally returning -EPROBE_DEFER wouldn't work
> because if the supplier is optional but not present, the consumer
> driver would never stop waiting for it. I'm looking into issues
> similar to the one you saw in other threads [1]. The problem always
> boils down to the supplier device's DT node not having "compatible"
> property and therefore fw_devlink creating the device link between the
> consumer and the supplier's parent.
>
> Basically if the drivers/DT are implemented "properly", you would
> never get to the failure case (-2) if the driver is actually present.
Well, I don't get what do you mean by not having the proper 'comaptible'
property. Both affected devices (amba's pl330 and its power domain) have
compatible strings: 'arm,pl330' and 'samsung,exynos4210-pd', but the
devlinks doesn't help. Is it related to the custom device addition code
in the amba bus?
> I have some other ideas on how to get these to work better (not sure
> if it'll be for 100% of the cases), but until I get those ideas sorted
> out, I might do a partial revert of the change you mentioned.
>
> [1] - https://lore.kernel.org/lkml/4799738.LvFx2qVVIh@steina-w/
> ...
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Powered by blists - more mailing lists