[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ffc7e137-8c0a-2151-cfb6-b41ae53cadf2@intel.com>
Date: Thu, 16 Dec 2021 19:24:13 +0100
From: "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>
To: Thomas Hellström
<thomas.hellstrom@...ux.intel.com>
CC: LKML <linux-kernel@...r.kernel.org>,
Intel Graphics Development <intel-gfx@...ts.freedesktop.org>
Subject: Re: [PATCH] PM: sleep: Avoid calling put_device() under dpm_list_mtx
On 12/16/2021 2:27 PM, Thomas Hellström wrote:
> Hi, Rafael,
>
> On 11/4/21 18:26, Rafael J. Wysocki wrote:
>> It is generally unsafe to call put_device() with dpm_list_mtx held,
>> because the given device's release routine may carry out an action
>> depending on that lock which then may deadlock, so modify the
>> system-wide suspend and resume of devices to always drop dpm_list_mtx
>> before calling put_device() (and adjust white space somewhat while
>> at it).
>>
>> For instance, this prevents the following splat from showing up in
>> the kernel log after a system resume in certain configurations:
>
>
> <snip>
>
>
>> @@ -1748,21 +1769,27 @@ int dpm_suspend(pm_message_t state)
>> struct device *dev = to_device(dpm_prepared_list.prev);
>> get_device(dev);
>> +
>> mutex_unlock(&dpm_list_mtx);
>> error = device_suspend(dev);
>> mutex_lock(&dpm_list_mtx);
>> +
>> if (error) {
>> pm_dev_err(dev, state, "", error);
>> dpm_save_failed_dev(dev_name(dev));
>> - put_device(dev);
>> - break;
>> - }
>> - if (!list_empty(&dev->power.entry))
>> + } else if (!list_empty(&dev->power.entry)) {
>> list_move(&dev->power.entry, &dpm_suspended_list);
>> + }
>> +
>> + mutex_unlock(&dpm_list_mtx);
>> +
>> put_device(dev);
>> - if (async_error)
>> +
>> + mutex_lock(&dpm_list_mtx);
>> +
>> + if (error || async_error)
>> break;
>> }
>> mutex_unlock(&dpm_list_mtx);
>> @@ -1879,6 +1906,7 @@ int dpm_prepare(pm_message_t state)
>> struct device *dev = to_device(dpm_list.next);
>> get_device(dev);
>> +
>> mutex_unlock(&dpm_list_mtx);
>> trace_device_pm_callback_start(dev, "", state.event);
>> @@ -1886,21 +1914,23 @@ int dpm_prepare(pm_message_t state)
>> trace_device_pm_callback_end(dev, error);
>> mutex_lock(&dpm_list_mtx);
>> - if (error) {
>> - if (error == -EAGAIN) {
>> - put_device(dev);
>> - error = 0;
>> - continue;
>> - }
>> +
>> + if (!error) {
>> + dev->power.is_prepared = true;
>> + if (!list_empty(&dev->power.entry))
>> + list_move_tail(&dev->power.entry, &dpm_prepared_list);
>> + } else if (error == -EAGAIN) {
>> + error = 0;
>> + } else {
>> dev_info(dev, "not prepared for power transition: code
>> %d\n",
>> error);
>> - put_device(dev);
>> - break;
>
> It appears the above break disappeared.
>
>
>> }
>> - dev->power.is_prepared = true;
>> - if (!list_empty(&dev->power.entry))
>> - list_move_tail(&dev->power.entry, &dpm_prepared_list);
>> +
>> + mutex_unlock(&dpm_list_mtx);
>> +
>> put_device(dev);
>
> Should be
>
> if (error)
>
> break;
>
> Here?
>
Looks like that.
> Symptoms is if we return an error from the device prepare callback, we
> end up spinning forever with little clue what's going on.
>
>
>> +
>> + mutex_lock(&dpm_list_mtx);
>> }
>> mutex_unlock(&dpm_list_mtx);
>> trace_suspend_resume(TPS("dpm_prepare"), state.event, false);
>
I'll have a look, thanks for the report!
Powered by blists - more mailing lists