lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMbhsRTdQLd5GRew9GMx6O287k3O17SKE18jvK2k7oyZpOrZfg@mail.gmail.com>
Date:	Sat, 23 Jul 2011 16:26:15 -0700
From:	Colin Cross <ccross@...roid.com>
To:	"Rafael J. Wysocki" <rjw@...k.pl>
Cc:	linux-pm@...ts.linux-foundation.org, linux-kernel@...r.kernel.org,
	Kevin Hilman <khilman@...com>, Nishanth Menon <nm@...com>,
	Alan Stern <stern@...land.harvard.edu>
Subject: Re: [PATCH] PM: runtime: add might_sleep to PM runtime functions

On Sat, Jul 23, 2011 at 3:57 PM, Rafael J. Wysocki <rjw@...k.pl> wrote:
> On Saturday, July 23, 2011, Colin Cross wrote:
>> The list of functions that can be called in atomic context is
>> non-intuitive (pm_runtime_put_sync can not, but
>> pm_runtime_put_sync_suspend can, if pm_runtime_irq_safe has
>> been called?).
>
> However, this behavior is documented.
>
> Also, if you have a clean use case for calling rpm_idle() with interrupts
> off, it can be modified to work in analogy with rpm_suspend() in that respect.

Yes, Kevin posted that patch in response to a bug that would never
have existed with this patch.  Even with Kevin's change, this patch
still detects drivers that are missing pm_runtime_irq_safe().

>> The code is actively misleading - the entry
>> points all start with spin_lock_irqsave, suggesting they
>> are safe to call in atomic context, but may later
>> enable interrupts.
>
> May I say it is this way for a reason?

I'll reword that

>> Add might_sleep_if to all the __pm_runtime_* entry points
>> to enforce correct usage.
>
> I'm not sure how this makes things better.

I spent hours tracking down a bug that was caused by
pm_runtime_put_sync enabling interrupts when entering idle, which was
causing the timer interrupt to be serviced before the cpu entered
idle, and the cpu to idle forever until a non-timer interrupt
occurred.  The bug would never have been introduced with this patch.
When I ran with this patch, it immediately caught 3 other cases of
incorrect usage in atomic context, any of which could cause deadlocks:
spin_lock_irqsave(driver lock)
pm_runtime_put_sync
spin_lock_irqsave(dev lock)
spin_unlock_irq(dev_lock) - enables interrupts
driver irq
spin_lock(driver lock)

One of the bugs was put_sync instead of put_sync_suspend, which would
not be a problem after Kevin's patch, but the other two were missing
pm_runtime_irq_safe.

Not every developer who calls a pm_runtime function is going to read
the documentation, and this patch will catch the common incorrect
usage the first time it is run.

I'll update this patch on top of Kevin's.

>> Also add pm_runtime_put_sync_autosuspend to the list of
>> functions that can be called in atomic context.
>
> OK
>
> In addition to that rpm_idle() is missing the __releases __acquires
> annotations.

Do you want that added to this patch?  It seems like that fits better
into Kevin's patch, or a third patch.

>> Signed-off-by: Colin Cross <ccross@...roid.com>
>> ---
>>  Documentation/power/runtime_pm.txt |    1 +
>>  drivers/base/power/runtime.c       |   16 ++++++++++++++--
>>  2 files changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/power/runtime_pm.txt b/Documentation/power/runtime_pm.txt
>> index b24875b..22852b3 100644
>> --- a/Documentation/power/runtime_pm.txt
>> +++ b/Documentation/power/runtime_pm.txt
>> @@ -469,6 +469,7 @@ pm_runtime_autosuspend()
>>  pm_runtime_resume()
>>  pm_runtime_get_sync()
>>  pm_runtime_put_sync_suspend()
>> +pm_runtime_put_sync_autosuspend()
>>
>>  5. Run-time PM Initialization, Device Probing and Removal
>>
>> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
>> index 0d4587b..fdc4567 100644
>> --- a/drivers/base/power/runtime.c
>> +++ b/drivers/base/power/runtime.c
>> @@ -732,6 +732,8 @@ int __pm_runtime_idle(struct device *dev, int rpmflags)
>>       unsigned long flags;
>>       int retval;
>>
>> +     might_sleep_if(!(rpmflags & RPM_ASYNC));
>> +
>>       if (rpmflags & RPM_GET_PUT) {
>>               if (!atomic_dec_and_test(&dev->power.usage_count))
>>                       return 0;
>> @@ -754,13 +756,16 @@ EXPORT_SYMBOL_GPL(__pm_runtime_idle);
>>   * return immediately if it is larger than zero.  Then carry out a suspend,
>>   * either synchronous or asynchronous.
>>   *
>> - * This routine may be called in atomic context if the RPM_ASYNC flag is set.
>> + * This routine may be called in atomic context if the RPM_ASYNC flag is set,
>> + * or if pm_runtime_irq_safe() has been called.
>>   */
>>  int __pm_runtime_suspend(struct device *dev, int rpmflags)
>>  {
>>       unsigned long flags;
>>       int retval;
>>
>> +     might_sleep_if(!(rpmflags & RPM_ASYNC) && !dev->power.irq_safe);
>> +
>>       if (rpmflags & RPM_GET_PUT) {
>>               if (!atomic_dec_and_test(&dev->power.usage_count))
>>                       return 0;
>> @@ -782,13 +787,16 @@ EXPORT_SYMBOL_GPL(__pm_runtime_suspend);
>>   * If the RPM_GET_PUT flag is set, increment the device's usage count.  Then
>>   * carry out a resume, either synchronous or asynchronous.
>>   *
>> - * This routine may be called in atomic context if the RPM_ASYNC flag is set.
>> + * This routine may be called in atomic context if the RPM_ASYNC flag is set,
>> + * or if pm_runtime_irq_safe() has been called.
>>   */
>>  int __pm_runtime_resume(struct device *dev, int rpmflags)
>>  {
>>       unsigned long flags;
>>       int retval;
>>
>> +     might_sleep_if(!(rpmflags & RPM_ASYNC) && !dev->power.irq_safe);
>> +
>>       if (rpmflags & RPM_GET_PUT)
>>               atomic_inc(&dev->power.usage_count);
>>
>> @@ -978,6 +986,8 @@ EXPORT_SYMBOL_GPL(pm_runtime_barrier);
>>   */
>>  void __pm_runtime_disable(struct device *dev, bool check_resume)
>>  {
>> +     might_sleep();
>> +
>
> Well, it's not correct to call spin_lock_irq() from interrupt context anyway.
>
>>       spin_lock_irq(&dev->power.lock);
>>
>>       if (dev->power.disable_depth > 0) {
>> @@ -1184,6 +1194,8 @@ void __pm_runtime_use_autosuspend(struct device *dev, bool use)
>>  {
>>       int old_delay, old_use;
>>
>> +     might_sleep();
>> +
>>       spin_lock_irq(&dev->power.lock);
>>       old_delay = dev->power.autosuspend_delay;
>>       old_use = dev->power.use_autosuspend;
>>
>
> Thanks,
> Rafael
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ