Message-ID: <20251007-charming-successful-foxhound-1ca192@penduick>
Date: Tue, 7 Oct 2025 17:09:46 +0200
From: Maxime Ripard <mripard@...nel.org>
To: Luca Ceresoli <luca.ceresoli@...tlin.com>
Cc: Andrzej Hajda <andrzej.hajda@...el.com>,
Neil Armstrong <neil.armstrong@...aro.org>, Robert Foss <rfoss@...nel.org>,
Laurent Pinchart <Laurent.pinchart@...asonboard.com>, Jonas Karlman <jonas@...boo.se>,
Jernej Skrabec <jernej.skrabec@...il.com>, Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Thomas Zimmermann <tzimmermann@...e.de>, David Airlie <airlied@...il.com>,
Simona Vetter <simona@...ll.ch>, Hui Pu <Hui.Pu@...ealthcare.com>,
Thomas Petazzoni <thomas.petazzoni@...tlin.com>, dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org,
Dmitry Baryshkov <lumag@...nel.org>
Subject: Re: [PATCH 2/2] drm/bridge: ti-sn65dsi83: protect device resources on unplug

On Mon, Sep 15, 2025 at 04:51:56PM +0200, Luca Ceresoli wrote:
> Hi Maxime,
>
> thanks for the feedback, this discussion is getting very interesting!
>
> On Mon, 15 Sep 2025 14:03:17 +0200
> Maxime Ripard <mripard@...nel.org> wrote:
>
> > > > I'm still confused why it's so important that in your example
> > > > xyz_disable must be called after drm_bridge_unplug.
> > >
> > > Let me clarify with an example.
> > >
> > > As I wrote in another reply, I have moved from a flag
> > > (disable_resources_needed) to a devres action as you had suggested, but
> > > the example here is based on the old flag because it is more explicit,
> > > code would be executed in the same order anyway, and, well, because I
> > > had written the example before the devres action conversion.
> > >
> > > Take these two functions (stripped versions of the actual ones):
> > >
> > > /* Same as proposed, but with _unplug moved at the end */
> > > static void sn65dsi83_remove()
> > > {
> > >     struct sn65dsi83 *ctx = i2c_get_clientdata(client);
> > >
> > >     drm_bridge_remove(&ctx->bridge);
> > >
> > >     /*
> > >      * I moved the following code to a devm action, but keeping it
> > >      * explicit here for the discussion
> > >      */
> > >     if (ctx->disable_resources_needed) {
> > >         sn65dsi83_monitor_stop(ctx);
> > >         regulator_disable(ctx->vcc);
> > >     }
> > >
> > >     drm_bridge_unplug(&ctx->bridge); // At the end!
> > > }
> >
> > First off, why do we need to have drm_bridge_unplug and
> > drm_bridge_remove separate?
> >
> > If we were to mirror drm_dev_enter and drm_dev_unplug, drm_dev_unplug
> > calls drm_dev_unregister itself, and I can't find a reason where we
> > might want to split the two.
>
> I think it could make sense and I'm definitely open to it.
>
> After a quick analysis I have mostly one concern. Calls
> to drm_bridge_add() and drm_bridge_remove() are balanced in current
> code and that's very intuitive. If drm_bridge_unplug() were to call
> drm_bridge_remove(), that symmetry would disappear. Some drivers would
> still need to call drm_bridge_remove() directly (e.g. the DSI host
> drivers which _add/remove() in the DSI attach/detach callbacks), while
> others wouldn't because drm_bridge_unplug() would do that.
>
> What do you think about this?

Which DSI host do you have in mind there? Because it's really not what
we document.

> Another concern I initially had is about drivers whose usage of
> drm_bridge is more complex than the average. Most simple drivers just
> call drm_bridge_remove() in their .remove callback and that's
> straightforward. I was suspicious about drivers such as
> imx8qxp-pixel-combiner which instantiate multiple bridges, and whether
> they need to do all the drm_bridge_unplug()s before all the
> drm_bridge_remove()s. However I don't think that's a real need because,
> except for probe and removal, operations on bridges happen on a
> per-bridge basis, so each bridge is independent from others, at least
> for the driver I mentioned.

In this particular case, they would be unplugged all at the same time,
right? In which case, we would disable all the bridges starting from the
one in the chain that just got removed, and then we just have to remove
all of them.

All in all, I think it's ok to somewhat break things here: all this was
broken before. If we want to bring some consistency, we will have to
reduce what bridges are allowed to do. Let's figure out something that
works for all reasonable cases (straightforward, component framework,
DSI device, DSI host, and DSI device on another bus), and the hacky
drivers will move eventually.

That's pretty easy to solve with a documentation update :)

We can just further restrict the order in which

> > > static void sn65dsi83_atomic_disable()
> > > {
> > >     if (!drm_bridge_enter(bridge, &idx))
> > >         return;
> > >
> > >     /* These 3 lines will be replaced by devm_release_action() */
> > >     ctx->disable_resources_needed = false;
> > >     sn65dsi83_monitor_stop(ctx);
> > >     regulator_disable(ctx->vcc);
> > >
> > >     drm_bridge_exit(idx);
> > > }
> > >
> > > Here the xyz_disable() in my pseudocode is the sn65dsi83_monitor_stop()
> > > + regulator_disable().
> > >
> > > If sn65dsi83_remove() and sn65dsi83_atomic_disable() were to happen
> > > concurrently, this sequence of events could happen:
> > >
> > > 1. atomic_disable: drm_bridge_enter() -> OK, can go
> > > 2. remove: drm_bridge_remove()
> > > 3. remove: sn65dsi83_monitor_stop()
> > > 4. remove: regulator_disable()
> > > 5. remove: drm_bridge_unplug() -- too late to stop atomic_disable
> >
> > drm_dev_unplug would also get delayed until drm_dev_exit is called,
> > mitigating your issue here.
>
> I don't think I got what you mean. With the above code the regulator
> would still be subject to an en/disable imbalance.

My point was that drm_bridge_remove wouldn't be allowed to execute until
after atomic_disable has called drm_bridge_exit. So we wouldn't have the
sequence of events you described. atomic_disable would disable the
bridge, and then drm_bridge_remove wouldn't have anything to disable
anymore by the time it runs.
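
To make the two orderings concrete, here is a minimal userspace sketch in
C. It is a toy model, not kernel code: struct model, model_enter(),
model_exit() and every other name in it are hypothetical, the regulator is
reduced to a counter, and the wait an unplug call would perform is
represented by running the already-entered section to completion before
the remove path continues. It assumes the teardown marks the bridge
unplugged first, the way drm_dev_unplug does for devices.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, not kernel code. Every name here is hypothetical. */
struct model {
	bool unplugged;        /* set once teardown has started */
	int readers;           /* sections currently between enter and exit */
	int regulator_count;   /* enable count; negative means imbalance */
	bool resources_needed; /* mirrors disable_resources_needed */
};

/* State after a successful enable: regulator enabled exactly once. */
static struct model model_enabled(void)
{
	struct model m = { .regulator_count = 1, .resources_needed = true };
	return m;
}

static bool model_enter(struct model *m)
{
	if (m->unplugged)
		return false;
	m->readers++;
	return true;
}

static void model_exit(struct model *m)
{
	assert(m->readers > 0);
	m->readers--;
}

/* Body of the disable path, executed between enter and exit. */
static void model_disable_body(struct model *m)
{
	m->resources_needed = false;
	m->regulator_count--;
}

/* Resource release done by the remove path. */
static void model_remove_release(struct model *m)
{
	if (m->resources_needed)
		m->regulator_count--;
}

/* Interleaving from the quoted example: unplug is the last remove step. */
static int unplug_last(void)
{
	struct model m = model_enabled();
	bool entered = model_enter(&m);  /* disable: enter succeeds */

	model_remove_release(&m);        /* remove: releases resources */
	m.unplugged = true;              /* remove: unplug, too late */
	if (entered) {
		model_disable_body(&m);  /* disable: releases a second time */
		model_exit(&m);
	}
	return m.regulator_count;
}

/* Teardown starts with unplug and waits for entered sections to exit. */
static int unplug_first(void)
{
	struct model m = model_enabled();
	bool entered = model_enter(&m);  /* disable: enter succeeds */

	m.unplugged = true;              /* remove: new entries now fail */
	if (entered) {                   /* remove waits, disable finishes */
		model_disable_body(&m);
		model_exit(&m);
	}
	assert(m.readers == 0);          /* the modelled wait is over */
	model_remove_release(&m);        /* nothing left to release */
	return m.regulator_count;
}
```

In this model unplug_last() ends with a count of -1, which is the
imbalance from the quoted sequence, while unplug_first() ends at 0.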
> However I realized the problem does not exist when using devres,
> because devres itself takes care of executing each release function only
> once, by means of a spinlock.
>
> I think using devres actually solves my concerns about removal during
> atomic[_post]_disable, but also for the atomic[_pre]_enable and other
> call paths. Also, I think it makes the question of which goes first
> (drm_bridge_unplug() or _remove()) way less relevant.
>
> The concern is probably still valid for drivers which don't use devres.
> However the concern is irrelevant until there is a need for a bridge to
> become hot-pluggable. At that point a driver needs to either move to
> devres or take other actions to avoid incurring the same issue.

I disagree with that statement. We never considered !devres as outdated,
and thus we need to support both. Especially if it's about races we know
about in a code path we might never run.

> I'm going to send soon a v2 with my devres changes so we can continue
> this discussion on actual code.
>
> > > 6. atomic_disable: ctx->disable_resources_needed = false -- too late to stop .remove
> > > 7. atomic_disable: sn65dsi83_monitor_stop() -- twice, maybe no problem
> > > 8. atomic_disable: regulator_disable() -- Twice, en/disable imbalance!
> > >
> > > So there is an excess regulator disable, which is an error. I don't see
> > > how this can be avoided if the drm_bridge_unplug() is called after the
> > > regulator_disable().
> > >
> > > Let me know whether this clarifies the need to _unplug at the beginning
> > > of the .remove function.
> >
> > Another thing that just crossed my mind is why we don't call
> > atomic_disable when we're tearing down the bridge too. We're doing it
> > for the main DRM devices, it would make sense to me to disable the
> > encoder -> bridge -> connector (and possibly CRTC) chain if we remove a
> > bridge automatically.
>
> Uh, interesting idea.
>
> Do you mean something like:
>
> void drm_bridge_unplug(struct drm_bridge *bridge)
> {
>     bridge->unplugged = true;
>     synchronize_srcu(&drm_bridge_unplug_srcu);
>
>     drm_bridge_remove(bridge); // as per discussion above
>
>     drm_atomic_helper_shutdown(bridge->dev);
> }
>
> ?
>
> I'm not sure which is the right call to tear down the pipeline though.

No, the shutdown needs to happen before marking the bridge unplugged,
otherwise you'll never run the disable callbacks.

And we probably shouldn't disable the whole device, just everything from
the CRTC that feeds the bridge.
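
The ordering constraint can be illustrated with a small userspace sketch
in C. Again this is only a toy model under stated assumptions, not kernel
code: struct toy_bridge, toy_atomic_disable() and the two teardown
functions are hypothetical names, and the guard in the disable callback
stands in for a drm_bridge_enter() call that fails once the bridge is
marked unplugged. It does not model which part of the pipeline gets shut
down.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, not kernel code. Every name here is hypothetical. */
struct toy_bridge {
	bool unplugged;  /* once set, guarded callbacks bail out early */
	bool hw_enabled; /* stands in for the monitor and regulator state */
};

/* Disable callback guarded the way the patch guards atomic_disable. */
static void toy_atomic_disable(struct toy_bridge *b)
{
	if (b->unplugged) /* models a failed drm_bridge_enter() */
		return;
	b->hw_enabled = false;
}

/* Order from the quoted proposal: mark unplugged, then shut down. */
static bool teardown_mark_first(void)
{
	struct toy_bridge b = { .hw_enabled = true };

	b.unplugged = true;
	toy_atomic_disable(&b); /* skipped: the bridge is already unplugged */
	return b.hw_enabled;
}

/* Shut down while callbacks can still run, then mark unplugged. */
static bool teardown_shutdown_first(void)
{
	struct toy_bridge b = { .hw_enabled = true };

	assert(!b.unplugged);   /* callbacks are still allowed to run */
	toy_atomic_disable(&b); /* runs and disables the hardware */
	b.unplugged = true;
	return b.hw_enabled;
}
```

teardown_mark_first() leaves the modelled hardware enabled because the
disable callback is skipped, while teardown_shutdown_first() leaves it
disabled.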
Maxime