Message-ID: <CAPDyKFooYFVrzLEqOtwb02iyEf+c6qPB8+Us1--Y-oXbJVG+SQ@mail.gmail.com>
Date: Tue, 15 Jul 2025 13:32:46 +0200
From: Ulf Hansson <ulf.hansson@...aro.org>
To: Jon Hunter <jonathanh@...dia.com>
Cc: Marek Szyprowski <m.szyprowski@...sung.com>, Saravana Kannan <saravanak@...gle.com>,
Stephen Boyd <sboyd@...nel.org>, linux-pm@...r.kernel.org,
"Rafael J . Wysocki" <rafael@...nel.org>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Michael Grzeschik <m.grzeschik@...gutronix.de>, Bjorn Andersson <andersson@...nel.org>,
Abel Vesa <abel.vesa@...aro.org>, Peng Fan <peng.fan@....nxp.com>,
Tomi Valkeinen <tomi.valkeinen@...asonboard.com>, Johan Hovold <johan@...nel.org>,
Maulik Shah <maulik.shah@....qualcomm.com>, Michal Simek <michal.simek@....com>,
Konrad Dybcio <konradybcio@...nel.org>, Thierry Reding <thierry.reding@...il.com>,
Hiago De Franco <hiago.franco@...adex.com>, Geert Uytterhoeven <geert@...ux-m68k.org>,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
"linux-tegra@...r.kernel.org" <linux-tegra@...r.kernel.org>
Subject: Re: [PATCH v3 21/24] pmdomain: core: Leave powered-on genpds on until late_initcall_sync

On Tue, 15 Jul 2025 at 12:28, Jon Hunter <jonathanh@...dia.com> wrote:
>
> Hi Ulf,
>
> On 10/07/2025 15:54, Ulf Hansson wrote:
> > On Thu, 10 Jul 2025 at 14:26, Marek Szyprowski <m.szyprowski@...sung.com> wrote:
> >>
> >> On 01.07.2025 13:47, Ulf Hansson wrote:
> >>> Powering-off a genpd that was on during boot, before all of its consumer
> >>> devices have been probed, is certainly prone to problems.
> >>>
> >>> As a step to improve this situation, let's prevent these genpds from being
> >>> powered-off until genpd_power_off_unused() gets called, which is a
> >>> late_initcall_sync().
> >>>
> >>> Note that this still doesn't guarantee that all the consumer devices have
> >>> been probed before we allow the genpds to be powered off. Yet, this should
> >>> be a step in the right direction.
> >>>
> >>> Suggested-by: Saravana Kannan <saravanak@...gle.com>
> >>> Tested-by: Hiago De Franco <hiago.franco@...adex.com> # Colibri iMX8X
> >>> Tested-by: Tomi Valkeinen <tomi.valkeinen@...asonboard.com> # TI AM62A,Xilinx ZynqMP ZCU106
> >>> Signed-off-by: Ulf Hansson <ulf.hansson@...aro.org>
> >>
> >> This change has a side effect on some Exynos-based boards that have a
> >> display and whose bootloader is configured to set up a splash screen on
> >> it. Since today's linux-next, those boards fail to boot because of an
> >> IOMMU page fault.
> >
> > Thanks for reporting, let's try to fix this as soon as possible then.
> >
> >>
> >> This happens because the display controller is enabled and configured to
> >> perform the scanout from the splash-screen buffer until the respective
> >> driver resets it in its probe() function. This however doesn't work
> >> with the IOMMU, which is probed earlier than the display controller
> >> driver, which in turn causes an IOMMU page fault once the IOMMU
> >> driver gets attached. This worked before applying this patch, because
> >> the power domain of the display controller was simply turned off early,
> >> effectively resetting the display controller.
> >
> > I can certainly try to help to find a solution, but I believe I need
> > some more details of what is happening.
> >
> > Perhaps you can point me to some relevant DTS file to start with?
> >
> >>
> >> This has been discussed a bit recently:
> >> https://lore.kernel.org/all/544ad69cba52a9b87447e3ac1c7fa8c3@disroot.org/
> >> and I can add a workaround for this issue in the bootloaders of those
> >> boards, but this is something that has to be somehow addressed in a
> >> generic way.
> >
> > It kind of sounds like there is a missing power-domain not being
> > described in DT for the IOMMU, but I might have understood the whole
> > thing wrong.
> >
> > Let's see if we can work something out in the next few days, otherwise
> > we need to find another way to let some genpds for these platforms
> > opt out from this new behaviour.
>
> Have you found any resolution for this? I have also noticed a boot
> regression on one of our Tegra210 boards and bisect is pointing to this
> commit. I don't see any particular crash, but a hang on boot.

Thanks for reporting!

For Exynos we opt-out from the behaviour by enforcing a sync_state of
all PM domains upfront [1], which means before any devices get
attached.

Even if that defeats the purpose of the $subject series, this was one
way forward that solved the problem. When the boot-ordering problem
(that's how I understood the issue) for Exynos gets resolved, we
should be able to drop the hack, at least that's the idea.

>
> If there is any debug we can enable to see which pmdomain is the problem
> let me know.

There aren't many debug prints in genpd that I think make much sense
to enable, but you can always give it a try. Since you are hanging,
obviously you can't look at the genpd debugfs data...

Note that the interesting PM domains are those that are powered-on
when calling pm_genpd_init(). As a start, I would add some debug
prints in () to see which PM domains are relevant to track.
Potentially you could then try to power them off and register them
accordingly with genpd, one by one, to see which of them is causing
the problem.
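
To illustrate the kind of debug print I have in mind, here is a minimal
userspace sketch, not kernel code. The struct and function names
(model_genpd, model_report_powered_on) are made up for illustration; in the
kernel the equivalent would presumably be a pr_info() keyed on the is_off
argument passed to pm_genpd_init().

```c
/*
 * Userspace model only. "model_genpd" and "model_report_powered_on" are
 * hypothetical stand-ins, not the kernel's genpd structures or API.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct model_genpd {
	const char *name;
	bool is_off;	/* power state reported at registration time */
};

/*
 * Print every domain that is registered as powered-on and return how many
 * there are. These are the domains worth tracking when bisecting a hang.
 */
static int model_report_powered_on(const struct model_genpd *pds, int n)
{
	int count = 0;

	assert(pds != NULL || n == 0);

	for (int i = 0; i < n; i++) {
		if (!pds[i].is_off) {
			printf("genpd %s: registered powered-on\n", pds[i].name);
			count++;
		}
	}
	return count;
}
```

The domains reported this way are the candidates to power off before
registration, one at a time, until the hang goes away.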

Another option could be to add a new genpd config flag
(GENPD_FLAG_DONT_STAY_ON or something along those lines) that tells
genpd not to set genpd->stay_on in pm_genpd_init(). Then
tegra_powergate_add() would have to set GENPD_FLAG_DONT_STAY_ON for
those genpds that really need it.
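
Roughly, the flag idea could behave like the userspace model below. This is
only a sketch under assumptions: the flag does not exist today, its bit
value is arbitrary, and all the model_* names are hypothetical stand-ins for
the genpd internals. It assumes stay_on is set for domains that are
powered-on at init and cleared once the late_initcall_sync() step has run.

```c
/*
 * Userspace model only. The flag, the struct and the helpers below are
 * hypothetical; they mimic the proposed behaviour, not the real genpd code.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag, bit value chosen arbitrarily for this sketch. */
#define MODEL_GENPD_FLAG_DONT_STAY_ON	(1U << 0)

struct model_genpd {
	unsigned int flags;
	bool stay_on;	/* keep the domain powered until late init */
};

/* Model of the init step: only powered-on domains without the flag stay on. */
static void model_genpd_init(struct model_genpd *genpd, bool is_off)
{
	assert(genpd != NULL);
	genpd->stay_on = !is_off &&
			 !(genpd->flags & MODEL_GENPD_FLAG_DONT_STAY_ON);
}

/* Model of the power-off check: a domain marked stay_on must not go down. */
static bool model_may_power_off(const struct model_genpd *genpd)
{
	return !genpd->stay_on;
}

/* Model of the late_initcall_sync() step that lifts the restriction. */
static void model_power_off_unused(struct model_genpd *genpd)
{
	genpd->stay_on = false;
}
```

A provider that knows a domain must be allowed to power off early would set
the flag before registering it; all other domains keep the new default.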

Kind regards
Uffe

[1] https://lore.kernel.org/all/20250711114719.189441-1-ulf.hansson@linaro.org/