[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <usrnztprl7rjnm2t5bo72k6dsgfpy6zw6ezd33akua5zc3qsfu@iafn2r2mlcjo>
Date: Mon, 4 Sep 2023 17:18:02 +0200
From: Maxime Ripard <mripard@...nel.org>
To: "Russell King (Oracle)" <linux@...linux.org.uk>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Douglas Anderson <dianders@...omium.org>,
dri-devel@...ts.freedesktop.org, airlied@...il.com,
daniel@...ll.ch, linux-kernel@...r.kernel.org
Subject: Re: [RFT PATCH 01/15] drm/armada: Call drm_atomic_helper_shutdown()
at shutdown time
On Mon, Sep 04, 2023 at 08:55:43AM +0100, Russell King (Oracle) wrote:
> On Mon, Sep 04, 2023 at 09:36:10AM +0200, Maxime Ripard wrote:
> > On Sun, Sep 03, 2023 at 04:53:42PM +0100, Russell King (Oracle) wrote:
> > > On Fri, Sep 01, 2023 at 04:41:12PM -0700, Douglas Anderson wrote:
> > > > Based on grepping through the source code this driver appears to be
> > > > missing a call to drm_atomic_helper_shutdown() at system shutdown
> > > > time. Among other things, this means that if a panel is in use that it
> > > > won't be cleanly powered off at system shutdown time.
> > > >
> > > > The fact that we should call drm_atomic_helper_shutdown() in the case
> > > > of OS shutdown/restart comes straight out of the kernel doc "driver
> > > > instance overview" in drm_drv.c.
> > > >
> > > > This driver was fairly easy to update. The drm_device is stored in the
> > > > drvdata so we just have to make sure the drvdata is NULL whenever the
> > > > device is not bound.
> > >
> > > ... and there I think you have a misunderstanding of the driver model.
> > > Please have a look at device_unbind_cleanup() which will be called if
> > > probe fails, or when the device is removed (in other words, when it is
> > > not bound to a driver.)
> > >
> > > Also, devices which aren't bound to a driver won't have their shutdown
> > > method called (because there is no driver currently bound to that
> > > device.) So, ->probe must have completed successfully, and ->remove
> > > must not have been called for that device.
> > >
> > > So, I think that all these dev_set_drvdata(dev, NULL) that you're
> > > adding are just asking for a kernel janitor to come along later and
> > > remove them because they serve no purpose... so best not introduce
> > > them in the first place.
> >
> > What would that hypothetical janitor clean up exactly? Code making sure
> > that there's no dangling pointer? Doesn't look very wise to me.
>
> How can there be a dangling pointer when the driver core removes the
> pointer for the driver in these cases?
You can still access that pointer from remove after the call to
component_del(). It's unlikely, sure, but the issue is still there.
> If I were to accept the argument that the driver core _might_ "forget"
> to NULL out this pointer, then that argument by extension also means
> that no one should make use of the devm_* stuff either, just in case
> the driver core forgets to release that stuff. Best have every driver
> manually release those resources.
It's funny that you go for that argument, because using devm is known to
be a design issue in KMS (and the rest of the kernel to some extent), so
yeah, I very much agree with you there.
> Nope, that doesn't work, because driver authors tend to write buggy
> cleanup paths.
And using devm is just as buggy for a KMS driver. We even discourage its
use in the documentation.
But really, I'm not sure what your point is there. Does devm lead to
bugs? Sure. Is it less buggy that manually unrolling an exit path by
hand? Probably. I seriously don't get the relation to some code clearing
a pointer after it's been freed though.
> There are janitors that go around removing this stuff, and janitorial
> patches tend to keep coming even if one says nak at any particular
> point... and yes, janitors do go around removing this unnecessary
> junk from drivers.
>
> You will find examples of this removal in commit
> ec3b1ce2ca34, 5cdade2d77dd, c7cb175bb1ef
Other people doing it doesn't make it right (or wrong). And really, I'm
not arguing that it's right, I'm saying that it's not wrong.
It's probably being over-cautious, especially in that driver, but it's
not wrong.
> Moreover:
>
> 7efb10383181
>
> is also removing unnecessary driver code. Testing for driver data being
> NULL when we know that a _successful_ probe has happened (because
> ->remove won't be called unless we were successful) and the probe
> always sets drvdata non-NULL is also useless code.
Again, I fail to see what the relationship is there.
> If ultimately you don't trust the driver model to do what it's been
> doing for more than the last decade, then I wonder whether you should
> be trusting the kernel to manage your hardware!
It's not the kernel driver model that I don't trust, it's C's (lack of)
memory safety and management. And like you said yourself, "driver
authors tend to write buggy"
> Anyway, I've said no to this patch for a driver that I'm marked as
> maintainer for, so at least do not make the changes I am objecting to
> to that driver. Thanks.
You're entitled to that opinion indeed.
Maxime
Powered by blists - more mailing lists