Date:   Thu, 20 May 2021 21:58:52 +0200
From:   Daniel Vetter <daniel@...ll.ch>
To:     Stephen Boyd <swboyd@...omium.org>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-kernel@...r.kernel.org, linux-arm-msm@...r.kernel.org,
        dri-devel@...ts.freedesktop.org, freedreno@...ts.freedesktop.org,
        Daniel Vetter <daniel.vetter@...ll.ch>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Rob Clark <robdclark@...il.com>,
        Russell King <rmk+kernel@....linux.org.uk>,
        Saravana Kannan <saravanak@...gle.com>
Subject: Re: [PATCH 7/7] drm/msm: Migrate to aggregate driver

On Wed, May 19, 2021 at 05:25:19PM -0700, Stephen Boyd wrote:
> The device lists are poorly ordered when the component device code is
> used. This is because component_master_add_with_match() returns 0
> regardless of component devices calling component_add() first. It can
> really only fail if an allocation fails, in which case everything is
> going bad and we're out of memory. The driver that registers the
> aggregate driver can succeed at probe and put the attached device on
> the DPM lists before any of the component devices are probed and put on
> the lists.
> 
> Within the component device framework this usually isn't that bad
> because the real driver work is done at bind time via
> component{,master}_ops::bind(). It becomes a problem when the driver
> core, or host driver, wants to operate on the component device outside
> of the bind/unbind functions, e.g. via 'remove' or 'shutdown'. The
> driver core doesn't understand the relationship between the host device
> and the component devices and could possibly try to operate on component
> devices when they're already removed from the system or shut down.
> 
> Normally, device links or probe defer would reorder the lists and put
> devices that depend on other devices in the lists at the correct
> location, but with component devices this doesn't happen because this
> information isn't expressed anywhere. Drivers simply succeed at
> registering their component or the aggregate driver with the component
> framework and wait for their bind() callback to be called once the other
> components are ready. In summary, the drivers that make up the aggregate
> driver can probe in any order.
> 
> This ordering problem becomes fairly obvious when shutting down the
> device with a DSI controller connected to a DSI bridge that is
> controlled via i2c. In this case, the msm display driver wants to tear
> down the display pipeline on shutdown via msm_pdev_shutdown() by calling
> drm_atomic_helper_shutdown(), and it can't do that unless the whole
> display chain is still probed and active in the system. When a display
> bridge is on i2c, the i2c device for the bridge will be created whenever
> the i2c controller probes, which could be before or after the msm
> display driver probes. If the i2c controller probes after the display
> driver, then the i2c controller will be shut down before the display
> controller during system-wide shutdown and thus i2c transactions will
> stop working before the display pipeline is shut down. This means we'll
> have the display bridge trying to access an i2c bus that's shut down
> because drm_atomic_helper_shutdown() is trying to disable the bridge
> after the bridge is off.
> 
> The solution is to make the aggregate driver into a real struct driver
> that is bound to a device when the other component devices have all
> probed. Now that the component driver code is a proper bus, we can
> simply register an aggregate driver with that bus via
> component_aggregate_register() and then attach the shutdown hook to that
> driver to be sure that the shutdown for the display pipeline is called
> before any of the component device driver shutdown hooks are called.
> 
> Cc: Daniel Vetter <daniel.vetter@...ll.ch>
> Cc: "Rafael J. Wysocki" <rafael@...nel.org>
> Cc: Rob Clark <robdclark@...il.com>
> Cc: Russell King <rmk+kernel@....linux.org.uk>
> Cc: Saravana Kannan <saravanak@...gle.com>
> Signed-off-by: Stephen Boyd <swboyd@...omium.org>
> ---
> 
> As stated in the cover letter, this isn't perfect but it still works. I
> get a warning from runtime PM that the parent device (e00000.mdss) is
> not runtime PM enabled but the child device (the aggregate device) is
> being enabled by the bus logic. I need to move around the place that the
> parent device is runtime PM enabled and probably keep it powered up
> during the entire time that the driver is probed until the aggregate
> driver probes.
> 
>  drivers/gpu/drm/msm/msm_drv.c | 47 +++++++++++++++++++----------------
>  1 file changed, 26 insertions(+), 21 deletions(-)
> 
> diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
> index e1104d2454e2..0c64e6a2ce25 100644
> --- a/drivers/gpu/drm/msm/msm_drv.c
> +++ b/drivers/gpu/drm/msm/msm_drv.c
> @@ -1265,19 +1265,35 @@ static int add_gpu_components(struct device *dev,
>  	return 0;
>  }
>  
> -static int msm_drm_bind(struct device *dev)
> +static int msm_drm_bind(struct aggregate_device *adev)
>  {
> -	return msm_drm_init(dev, &msm_driver);
> +	return msm_drm_init(adev->dev.parent, &msm_driver);
>  }
>  
> -static void msm_drm_unbind(struct device *dev)
> +static void msm_drm_unbind(struct aggregate_device *adev)
>  {
> -	msm_drm_uninit(dev);
> +	msm_drm_uninit(adev->dev.parent);
> +}
> +
> +static void msm_drm_shutdown(struct aggregate_device *adev)
> +{
> +	struct drm_device *drm = platform_get_drvdata(to_platform_device(adev->dev.parent));
> +	struct msm_drm_private *priv = drm ? drm->dev_private : NULL;
> +
> +	if (!priv || !priv->kms)
> +		return;
> +
> +	drm_atomic_helper_shutdown(drm);
>  }
>  
> -static const struct component_master_ops msm_drm_ops = {
> -	.bind = msm_drm_bind,
> -	.unbind = msm_drm_unbind,
> +static struct aggregate_driver msm_drm_aggregate_driver = {
> +	.probe = msm_drm_bind,
> +	.remove = msm_drm_unbind,
> +	.shutdown = msm_drm_shutdown,
> +	.driver = {
> +		.name	= "msm_drm",
> +		.owner	= THIS_MODULE,
> +	},
>  };
>  
>  /*
> @@ -1306,7 +1322,8 @@ static int msm_pdev_probe(struct platform_device *pdev)
>  	if (ret)
>  		goto fail;
>  
> -	ret = component_master_add_with_match(&pdev->dev, &msm_drm_ops, match);
> +	msm_drm_aggregate_driver.match = match;

This design is a bit awkward: it means the driver struct can't be made
const, and it will blow up when you have multiple instances of the same
driver. I think the match should stay as part of the register function
call and be stored somewhere in the aggregate_device struct.
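
To make that concrete, here is a rough user-space sketch of the shape I
mean. Everything in it is a hypothetical stand-in (the struct layouts, the
aggregate_register() helper and its four-argument signature are assumptions
for illustration, not the API from this series): the match list is passed
at registration time and saved per instance, so one const driver can back
several devices.

```c
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct device { const char *name; };
struct component_match { int num_components; };

struct aggregate_device {
	struct device *parent;
	struct component_match *match;	/* per instance, set at register time */
	const struct aggregate_driver *adrv;
};

struct aggregate_driver {
	int (*probe)(struct aggregate_device *adev);
	const char *name;
};

/*
 * Hypothetical registration helper: the match list is an argument and is
 * stored in the per-instance aggregate_device, so the driver is never
 * written to and can be shared by any number of instances.
 */
static int aggregate_register(struct aggregate_device *adev,
			      struct device *parent,
			      const struct aggregate_driver *adrv,
			      struct component_match *match)
{
	if (!adev || !parent || !adrv || !match)
		return -1;

	adev->parent = parent;
	adev->match = match;
	adev->adrv = adrv;
	return 0;
}

/* Toy probe: reports how many components this instance was matched with. */
static int demo_probe(struct aggregate_device *adev)
{
	return adev->match->num_components;
}

/* The driver can now be const because nothing per-instance lives in it. */
static const struct aggregate_driver demo_driver = {
	.probe = demo_probe,
	.name = "demo",
};
```

Registering two aggregate devices against demo_driver with different match
lists leaves each instance with its own list, which is exactly what breaks
when the match pointer lives in the driver struct.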

Otherwise I think this looks really solid and fixes your issue properly.
It obviously needs careful review from Greg KH for the device model side
of things, and from Rafael Wysocki for the PM side.

Bunch of thoughts from a very cursory reading:

- I think it'd be good to pass the aggregate_device to all components
  when we bind them, while keeping the void * parameter alongside it just
  to make this less disruptive. Even more device model goodies.

- Maybe splatter a pile of sysfs links around so that this all becomes
  visible? Could be interesting for debugging ordering issues. Just an
  idea, feel free to entirely ignore.

- Needs solid kerneldoc for everything exposed to drivers and a good
  overview DOC: section.

- Needs deprecation warnings in the kerneldoc for all the
  component_master_* functions and, if a mechanical conversion is
  feasible, conversion of the existing users. I'd like to not be stuck
  with the old model forever, plus this will get a pile more people to
  review this code.
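
For the first point in the list, a minimal user-space sketch of what such a
bind callback could look like. The names component_ops_sketch and bind_all()
and the three-argument bind() signature are made up for illustration; the
real component_ops::bind() has a different signature and none of this is
from the series.

```c
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct device { const char *name; };
struct aggregate_device { struct device dev; };

/*
 * Hypothetical component ops: bind() receives the typed aggregate device
 * in addition to the opaque void * that existing drivers already use.
 */
struct component_ops_sketch {
	int (*bind)(struct device *comp, struct aggregate_device *adev,
		    void *data);
};

/* Bind every component in order, stopping at the first failure. */
static int bind_all(struct aggregate_device *adev, struct device **comps,
		    size_t n, const struct component_ops_sketch *ops,
		    void *data)
{
	size_t i;
	int ret;

	for (i = 0; i < n; i++) {
		ret = ops->bind(comps[i], adev, data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Toy bind: fails without an aggregate device, else counts the call. */
static int demo_bind(struct device *comp, struct aggregate_device *adev,
		     void *data)
{
	int *bound = data;

	if (!comp || !adev)
		return -1;
	(*bound)++;
	return 0;
}

static const struct component_ops_sketch demo_ops = {
	.bind = demo_bind,
};
```

Keeping the void * means existing bind implementations only need a
mechanical signature change, while new code can use the typed pointer.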

Anyway the name changes in probe and remove hooks below are already worth
this on their own imo. That's why I'd like to see them in all drivers.

Cheers, Daniel

> +	ret = component_aggregate_register(&pdev->dev, &msm_drm_aggregate_driver);
>  	if (ret)
>  		goto fail;
>  
> @@ -1319,23 +1336,12 @@ static int msm_pdev_probe(struct platform_device *pdev)
>  
>  static int msm_pdev_remove(struct platform_device *pdev)
>  {
> -	component_master_del(&pdev->dev, &msm_drm_ops);
> +	component_aggregate_unregister(&pdev->dev, &msm_drm_aggregate_driver);
>  	of_platform_depopulate(&pdev->dev);
>  
>  	return 0;
>  }
>  
> -static void msm_pdev_shutdown(struct platform_device *pdev)
> -{
> -	struct drm_device *drm = platform_get_drvdata(pdev);
> -	struct msm_drm_private *priv = drm ? drm->dev_private : NULL;
> -
> -	if (!priv || !priv->kms)
> -		return;
> -
> -	drm_atomic_helper_shutdown(drm);
> -}
> -
>  static const struct of_device_id dt_match[] = {
>  	{ .compatible = "qcom,mdp4", .data = (void *)KMS_MDP4 },
>  	{ .compatible = "qcom,mdss", .data = (void *)KMS_MDP5 },
> @@ -1351,7 +1357,6 @@ MODULE_DEVICE_TABLE(of, dt_match);
>  static struct platform_driver msm_platform_driver = {
>  	.probe      = msm_pdev_probe,
>  	.remove     = msm_pdev_remove,
> -	.shutdown   = msm_pdev_shutdown,
>  	.driver     = {
>  		.name   = "msm",
>  		.of_match_table = dt_match,
> -- 
> https://chromeos.dev
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
