lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJZ5v0gcOmd8fXG9_BYxr6rN7ncUWnfki7K9S5wK2Vvh9SxUCA@mail.gmail.com>
Date:   Wed, 11 May 2022 16:15:00 +0200
From:   "Rafael J. Wysocki" <rafael@...nel.org>
To:     Schspa Shi <schspa@...il.com>
Cc:     Viresh Kumar <viresh.kumar@...aro.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Linux PM <linux-pm@...r.kernel.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>
Subject: Re: [PATCH v3] cpufreq: fix race on cpufreq online

On Tue, May 10, 2022 at 5:42 PM Schspa Shi <schspa@...il.com> wrote:
>
> When cpufreq online failed, policy->cpus are not empty while
> cpufreq sysfs file available, we may access some data freed.
>
> Take policy->clk as an example:
>
> static int cpufreq_online(unsigned int cpu)
> {
>   ...
>   // policy->cpus != 0 at this time
>   down_write(&policy->rwsem);
>   ret = cpufreq_add_dev_interface(policy);
>   up_write(&policy->rwsem);
>
>   down_write(&policy->rwsem);
>   ...
>   /* cpufreq nitialization fails in some cases */
>   if (cpufreq_driver->get && has_target()) {
>     policy->cur = cpufreq_driver->get(policy->cpu);
>     if (!policy->cur) {
>       ret = -EIO;
>       pr_err("%s: ->get() failed\n", __func__);
>       goto out_destroy_policy;
>     }
>   }
>   ...
>   up_write(&policy->rwsem);
>   ...
>
>   return 0;
>
> out_destroy_policy:
>         for_each_cpu(j, policy->real_cpus)
>                 remove_cpu_dev_symlink(policy, get_cpu_device(j));
>     up_write(&policy->rwsem);
> ...
> out_exit_policy:
>   if (cpufreq_driver->exit)
>     cpufreq_driver->exit(policy);
>       clk_put(policy->clk);
>       // policy->clk is a wild pointer
> ...
>                                     ^
>                                     |
>                             Another process access
>                             __cpufreq_get
>                               cpufreq_verify_current_freq
>                                 cpufreq_generic_get
>                                   // acces wild pointer of policy->clk;
>                                     |
>                                     |
> out_offline_policy:                 |
>   cpufreq_policy_free(policy);      |
>     // deleted here, and will wait for no body reference
>     cpufreq_policy_put_kobj(policy);
> }
>
> We can fix it by clear the policy->cpus mask.
> Both show_scaling_cur_freq and show_cpuinfo_cur_freq will return an
> error by checking this mask, thus avoiding UAF.

So the UAF only happens if something is freed by ->offline() or
->exit() and I'm not sure where the mask is checked in the
scaling_cur_freq() path.

Overall, the patch is really two changes in one IMO.

> Signed-off-by: Schspa Shi <schspa@...il.com>
>
> ---
>
> Changelog:
> v1 -> v2:
>         - Fix bad critical region enlarge which causes uninitialized
>           unlock.
> v2 -> v3:
>         - Remove the missed down_write() before
>           cpumask_and(policy->cpus, policy->cpus, cpu_online_mask);
>
> Signed-off-by: Schspa Shi <schspa@...il.com>
> ---
>  drivers/cpufreq/cpufreq.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 80f535cc8a75..d93958dbdab8 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -1337,12 +1337,12 @@ static int cpufreq_online(unsigned int cpu)
>                 down_write(&policy->rwsem);
>                 policy->cpu = cpu;
>                 policy->governor = NULL;
> -               up_write(&policy->rwsem);
>         } else {
>                 new_policy = true;
>                 policy = cpufreq_policy_alloc(cpu);
>                 if (!policy)
>                         return -ENOMEM;
> +               down_write(&policy->rwsem);
>         }
>
>         if (!new_policy && cpufreq_driver->online) {
> @@ -1382,7 +1382,6 @@ static int cpufreq_online(unsigned int cpu)
>                 cpumask_copy(policy->related_cpus, policy->cpus);
>         }
>
> -       down_write(&policy->rwsem);
>         /*
>          * affected cpus must always be the one, which are online. We aren't
>          * managing offline cpus here.

The first change, which could and probably should be a separate patch,
ends here.

You prevent the rwsem from being dropped in the existing policy case
and acquire it right after creating a new policy.

This way ->online() always runs under the rwsem, which definitely
sounds like a good idea, and policy->cpus is manipulated under the
rwsem which IMV is required.

As a side-effect, ->init() is also run under the rwsem, but that
shouldn't be a problem as per your discussion with Viresh.

So the above would be patch 1 in a series.

The change below is a separate one and it addresses the particular
race you've discovered, as long as patch 1 above is present.  It would
be patch 2 in the series.

> @@ -1533,7 +1532,7 @@ static int cpufreq_online(unsigned int cpu)
>         for_each_cpu(j, policy->real_cpus)
>                 remove_cpu_dev_symlink(policy, get_cpu_device(j));
>
> -       up_write(&policy->rwsem);
> +       cpumask_clear(policy->cpus);

It is OK to clear policy->cpus here, because ->offline() and ->exit()
are called with policy->cpus clear from cpufreq_offline() and
cpufreq_remove_dev(), so they cannot assume policy->cpus to be
populated when they are invoked.  However, this needs to be stated in
the changelog of patch 2.

>  out_offline_policy:
>         if (cpufreq_driver->offline)
> @@ -1542,6 +1541,7 @@ static int cpufreq_online(unsigned int cpu)
>  out_exit_policy:
>         if (cpufreq_driver->exit)
>                 cpufreq_driver->exit(policy);
> +       up_write(&policy->rwsem);

It is consistent to run ->offline() and ->exit() under the rwsem, so
this change is OK too.

>  out_free_policy:
>         cpufreq_policy_free(policy);
> --

That said, there still are races that are not addressed by the above,
so I would add patch 3 changing show() to check policy_is_inactive()
under the rwsem.

Thanks!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ