linux-kernel - Re: [PATCH v2 5/5] cpufreq: Catch double invocations of cpufreq_freq_transition

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Tue, 29 Apr 2014 11:38:24 +0530
From:	"Srivatsa S. Bhat" <srivatsa.bhat@...ux.vnet.ibm.com>
To:	Viresh Kumar <viresh.kumar@...aro.org>
CC:	"Rafael J. Wysocki" <rjw@...ysocki.net>,
	Meelis Roos <mroos@...ux.ee>,
	"cpufreq@...r.kernel.org" <cpufreq@...r.kernel.org>,
	"linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 5/5] cpufreq: Catch double invocations of cpufreq_freq_transition_begin/end

On 04/29/2014 10:21 AM, Viresh Kumar wrote:
> Nice effort.
>

Thanks! :-)
 
> On 29 April 2014 00:25, Srivatsa S. Bhat
> <srivatsa.bhat@...ux.vnet.ibm.com> wrote:
>> Now all such drivers have been fixed, but debugging this issue was not
>> very straight-forward (even lockdep didn't catch this). So let us add a
>> debug infrastructure to the cpufreq core to catch such issues more easily
>> in the future.
> 
> BUT, I am not sure if we really need it :(
> 
> I think we just got into the 'barrier' stuff as we had some doubts about it
> earlier and were quite sure that nothing else could go wrong. Otherwise
> the only problem could have been present was the second queuing
> from the same thread. And we will surely get stuck if that happens and
> we can't just miss that error..
> 

Yeah, and we _did_ hit that hang, but it was not at all intuitive at first
as to what was going wrong. Worse, even lockdep is not in a position to catch
such scenarios. So it definitely doesn't hurt to add a small infrastructure
to catch such issues in the future, IMHO.

Besides, if we can add features for users, surely we can also add some
non-intrusive debug code for ourselves too, to make our lives easier, right? :-)
I'm sure we deserve that privilege ;-)

Regards,
Srivatsa S. Bhat

>> Scenario 1: (Deadlock-free)
>> ----------
>>
>>          Task A                                         Task B
>>
>>     /* 1st freq transition */
>>     Invoke _begin() {
>>             ...
>>             ...
>>     }
>>
>>     Change the frequency
>>
>>                                                 Got interrupt for successful
>>                                                 change of frequency.
>>
>>                                                 /* 1st freq transition */
>>                                                 Invoke _end() {
>>                                                         ...
>>                                                         ...
>>     /* 2nd freq transition */                           ...
>>     Invoke _begin() {                                   ...
>>             ... //waiting for B                         ...
>>             ... //to finish _end()              }
>>             ...
>>             ...
>>     }
>>
>>
>> This scenario is actually deadlock-free because Task A can wait inside the
>> second call to _begin() without self-deadlocking, because it is the
>> responsibility of Task B to finish the first sequence by invoking the
>> corresponding _end().
>>
>> By setting the value of 'transition_task' again explicitly in _end(), we
>> ensure that the code won't print a false-positive warning in this case.
>>
>> However the same code successfully catches the following deadlock-prone
>> scenario even in ASYNC_NOTIFICATION drivers:
>>
>> Scenario 2: (Deadlock-prone)
>> ----------
>>
>>          Task A                                         Task B
>>
>>     /* 1st freq transition */
>>     Invoke _begin() {
>>             ...
>>             ...
>>     }
>>
>>     /* 2nd freq transition */
>>     Invoke _begin() {
>>             ...
>>             ...
>>     }
>>
>>     Change the frequency
>>
>>
>> Here the bug is that Task A called the second _begin() *before* actually
>> performing the 1st frequency transition. In other words, it failed to set
>> Task B in motion for the 1st frequency transition, and hence it will
>> self-deadlock. This is very similar to the case of drivers which do
>> synchronous notification, and hence the debug infrastructure developed
>> in this patch can catch this scenario easily.
>>
>> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@...ux.vnet.ibm.com>
>> ---
>>
>>  drivers/cpufreq/cpufreq.c |   12 ++++++++++++
>>  include/linux/cpufreq.h   |    1 +
>>  2 files changed, 13 insertions(+)
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index abda660..2c99a6c 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -354,6 +354,10 @@ static void cpufreq_notify_post_transition(struct cpufreq_policy *policy,
>>  void cpufreq_freq_transition_begin(struct cpufreq_policy *policy,
>>                 struct cpufreq_freqs *freqs)
>>  {
>> +
>> +       /* Catch double invocations of _begin() which lead to self-deadlock */
>> +       WARN_ON(current == policy->transition_task);
>> +
>>  wait:
>>         wait_event(policy->transition_wait, !policy->transition_ongoing);
>>
>> @@ -365,6 +369,7 @@ wait:
>>         }
>>
>>         policy->transition_ongoing = true;
>> +       policy->transition_task = current;
>>
>>         spin_unlock(&policy->transition_lock);
>>
>> @@ -378,9 +383,16 @@ void cpufreq_freq_transition_end(struct cpufreq_policy *policy,
>>         if (unlikely(WARN_ON(!policy->transition_ongoing)))
>>                 return;
>>
>> +       /*
>> +        * The task invoking _end() could be different from the one that
>> +        * invoked the _begin(). So set ->transition_task again here
>> +        * explicity.
>> +        */
>> +       policy->transition_task = current;
>>         cpufreq_notify_post_transition(policy, freqs, transition_failed);
>>
>>         policy->transition_ongoing = false;
>> +       policy->transition_task = NULL;
>>
>>         wake_up(&policy->transition_wait);
>>  }
>> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
>> index 5ae5100..8f44d79 100644
>> --- a/include/linux/cpufreq.h
>> +++ b/include/linux/cpufreq.h
>> @@ -110,6 +110,7 @@ struct cpufreq_policy {
>>         bool                    transition_ongoing; /* Tracks transition status */
>>         spinlock_t              transition_lock;
>>         wait_queue_head_t       transition_wait;
>> +       struct task_struct      *transition_task; /* Task which is doing the transition */
>>  };
>>
>>  /* Only for ACPI */
>>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/