lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMA88TpJuJY7oxPFW8xKzch60+n_2qRp7KO2r-YwZxiM7GXF3w@mail.gmail.com>
Date:   Wed, 11 May 2022 21:42:40 +0800
From:   Schspa Shi <schspa@...il.com>
To:     "Rafael J. Wysocki" <rafael@...nel.org>
Cc:     Viresh Kumar <viresh.kumar@...aro.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux PM <linux-pm@...r.kernel.org>
Subject: Re: [PATCH v3] cpufreq: fix race on cpufreq online

"Rafael J. Wysocki" <rafael@...nel.org> writes:

> On Wed, May 11, 2022 at 2:59 PM Rafael J. Wysocki <rafael@...nel.org> wrote:
>>
>> On Wed, May 11, 2022 at 2:21 PM Viresh Kumar <viresh.kumar@...aro.org> wrote:
>> >
>> > On 11-05-22, 16:10, Schspa Shi wrote:
>> > > Viresh Kumar <viresh.kumar@...aro.org> writes:
>> > > > I am not sure, but maybe there were issues in calling init() with rwsem held, as
>> > > > it may want to call some API from there.
>> > > >
>> > >
>> > > I have checked all the init() implement of the fellowing files, It should be OK.
>> > > Function find command:
>> > >   ag "init[\s]+=" drivers/cpufreq
>> > >
>> > > All the init() implement only initialize policy object without holding this lock
>> > > and won't call cpufreq APIs need to hold this lock.
>> >
>> > Okay, we can see if someone complains later then :)
>> >
>> > > > I don't think you can do that safely. offline() or exit() may depend on
>> > > > policy->cpus being set to all CPUs.
>> > > OK, I will move this after exit(). and there will be no effect with those
>> > > two APIs. But policy->cpus must be clear before release policy->rwsem.
>> >
>> > Hmm, I don't think depending on the values of policy->cpus is a good idea to be
>> > honest. This design is inviting bugs to come in at another place. We need a
>> > clear flag for this, a new flag or something like policy_list.
>
> Why?
>
>> > Also I see the same bug happening while the policy is removed. The kobject is
>> > put after the rwsem is dropped.
>
> This shouldn't be a problem because of the wait_for_completion() in
> cpufreq_policy_put_kobj().  It is known that cpufreq_sysfs_release()
> has run when cpufreq_policy_put_kobj() returns, so it is safe to free
> the policy then.
>
>> > > >  static inline bool policy_is_inactive(struct cpufreq_policy *policy)
>> > > >  {
>> > > > -     return cpumask_empty(policy->cpus);
>> > > > +     return unlikely(cpumask_empty(policy->cpus) ||
>> > > > +                     list_empty(&policy->policy_list));
>> > > >  }
>> > > >
>> > >
>> > > I don't think this fully solves my problem.
>> > > 1. There is some case which cpufreq_online failed after the policy is added to
>> > >    cpufreq_policy_list.
>> >
>> > And I missed that :(
>> >
>> > > 2. policy->policy_list is not protected by &policy->rwsem, and we
>> > > can't relay on this to
>> > >    indict the policy is fine.
>> >
>> > Ahh..
>> >
>> > > >From this point of view, we can fix this problem through the state of
>> > > this linked list.
>> > > But the above two problems need to be solved first.
>> >
>> > I feel overriding policy_list for this is going to make it complex/messy.
>> >
>> > Maybe something like this then:
>>
>> There are two things.
>>
>> One is the possible race with respect to the sysfs access occurring
>> during failing initialization and the other is that ->offline() or
>> ->exit() can be called with or without holding the policy rwsem
>> depending on the code path.
>>
>> Namely, cpufreq_offline() calls them under the policy rwsem, but
>> cpufreq_remove_dev() calls ->exit() outside the rwsem.  Also they are
>> called outside the rwsem in cpufreq_online().
>>
>> Moreover, ->offline() and ->exit() cannot expect policy->cpus to be
>> populated, because they are called when it is empty from
>> cpufreq_offline().
>>
>> So the $subject patch is correct AFAICS even though it doesn't address
>> all of the above.
>
> TBH, I'm not sure why show() doesn't check policy_is_inactive() under the rwsem.
>

There is a exist bugs, and somebody try to fixed, please see commit
Fixes: 2f66196208c9 ("cpufreq: check if policy is inactive early in
__cpufreq_get()")

> Moreover, I'm not sure why the locking dance in store() is necessary.

The store interface hold cpu_hotplug_lock via
    cpus_read_trylock();
, cannot run in parallel with cpufreq_online() & cpufreq_offline().

---
BRs

Schspa Shi

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ