lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJZ5v0id+7vkqMQEyVRe29oF_dRtzZ0EhoYUn8=yzeENDeABJw@mail.gmail.com>
Date:   Wed, 11 May 2022 15:19:56 +0200
From:   "Rafael J. Wysocki" <rafael@...nel.org>
To:     Viresh Kumar <viresh.kumar@...aro.org>
Cc:     Schspa Shi <schspa@...il.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux PM <linux-pm@...r.kernel.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>
Subject: Re: [PATCH v3] cpufreq: fix race on cpufreq online

On Wed, May 11, 2022 at 2:59 PM Rafael J. Wysocki <rafael@...nel.org> wrote:
>
> On Wed, May 11, 2022 at 2:21 PM Viresh Kumar <viresh.kumar@...aro.org> wrote:
> >
> > On 11-05-22, 16:10, Schspa Shi wrote:
> > > Viresh Kumar <viresh.kumar@...aro.org> writes:
> > > > I am not sure, but maybe there were issues in calling init() with rwsem held, as
> > > > it may want to call some API from there.
> > > >
> > >
> > > I have checked all the init() implement of the fellowing files, It should be OK.
> > > Function find command:
> > >   ag "init[\s]+=" drivers/cpufreq
> > >
> > > All the init() implement only initialize policy object without holding this lock
> > > and won't call cpufreq APIs need to hold this lock.
> >
> > Okay, we can see if someone complains later then :)
> >
> > > > I don't think you can do that safely. offline() or exit() may depend on
> > > > policy->cpus being set to all CPUs.
> > > OK, I will move this after exit(). and there will be no effect with those
> > > two APIs. But policy->cpus must be clear before release policy->rwsem.
> >
> > Hmm, I don't think depending on the values of policy->cpus is a good idea to be
> > honest. This design is inviting bugs to come in at another place. We need a
> > clear flag for this, a new flag or something like policy_list.

Why?

> > Also I see the same bug happening while the policy is removed. The kobject is
> > put after the rwsem is dropped.

This shouldn't be a problem because of the wait_for_completion() in
cpufreq_policy_put_kobj().  It is known that cpufreq_sysfs_release()
has run when cpufreq_policy_put_kobj() returns, so it is safe to free
the policy then.

> > > >  static inline bool policy_is_inactive(struct cpufreq_policy *policy)
> > > >  {
> > > > -     return cpumask_empty(policy->cpus);
> > > > +     return unlikely(cpumask_empty(policy->cpus) ||
> > > > +                     list_empty(&policy->policy_list));
> > > >  }
> > > >
> > >
> > > I don't think this fully solves my problem.
> > > 1. There is some case which cpufreq_online failed after the policy is added to
> > >    cpufreq_policy_list.
> >
> > And I missed that :(
> >
> > > 2. policy->policy_list is not protected by &policy->rwsem, and we
> > > can't relay on this to
> > >    indict the policy is fine.
> >
> > Ahh..
> >
> > > >From this point of view, we can fix this problem through the state of
> > > this linked list.
> > > But the above two problems need to be solved first.
> >
> > I feel overriding policy_list for this is going to make it complex/messy.
> >
> > Maybe something like this then:
>
> There are two things.
>
> One is the possible race with respect to the sysfs access occurring
> during failing initialization and the other is that ->offline() or
> ->exit() can be called with or without holding the policy rwsem
> depending on the code path.
>
> Namely, cpufreq_offline() calls them under the policy rwsem, but
> cpufreq_remove_dev() calls ->exit() outside the rwsem.  Also they are
> called outside the rwsem in cpufreq_online().
>
> Moreover, ->offline() and ->exit() cannot expect policy->cpus to be
> populated, because they are called when it is empty from
> cpufreq_offline().
>
> So the $subject patch is correct AFAICS even though it doesn't address
> all of the above.

TBH, I'm not sure why show() doesn't check policy_is_inactive() under the rwsem.

Moreover, I'm not sure why the locking dance in store() is necessary.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ