linux-kernel - Re: [PATCH V2 2/3] arch_topology: Avoid use-after-free for scale_freq

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20210617030612.sksyutwjgwmozka5@vireshk-i7>
Date:   Thu, 17 Jun 2021 08:36:12 +0530
From:   Viresh Kumar <viresh.kumar@...aro.org>
To:     Ionela Voinescu <ionela.voinescu@....com>
Cc:     Rafael Wysocki <rjw@...ysocki.net>,
        Sudeep Holla <sudeep.holla@....com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>, linux-pm@...r.kernel.org,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Qian Cai <quic_qiancai@...cinc.com>,
        "Paul E . McKenney" <paulmck@...nel.org>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH V2 2/3] arch_topology: Avoid use-after-free for
 scale_freq_data

On 16-06-21, 13:00, Ionela Voinescu wrote:
> I would agree if it wasn't for the fact that the driver provides the
> set_freq_scale() implementation that ends up using driver internal data
> which could have been freed by the driver's own .exit()/stop_cpu()
> callbacks. The API and the generic implementation has the responsibility
> of making sure of sane access to its own structures.

How do you see timer core or workqueue core then ? Why do they make
sure they don't end up calling user's function once we ask them not to
?

And also scheduler's own util update mechanism, the exact thing
happens there. Users (cpufreq governors) call
cpufreq_add_update_util_hook() and cpufreq_remove_update_util_hook()
to pass their own data structure "struct update_util_data", which has
ia callback within. This is what happens from scheduler's context in
those cases.

static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
{
	struct update_util_data *data;

	data = rcu_dereference_sched(*per_cpu_ptr(&cpufreq_update_util_data,
						  cpu_of(rq)));
	if (data)
		data->func(data, rq_clock(rq), flags);
}

Also note that some kind of synchronisation mechanism is indeed
required between topology_set_scale_freq_source() and
topology_clear_scale_freq_source(), there is no race there today,
sure, but this is an API which is made generic.

> Even if we would want to keep drivers from shooting themselves in the
> foot, I would prefer we postpone it until we have more users for this,
> before we add any synchronisation mechanisms to functionality called
> on the tick.

The rcu mechanism is very much used in the scheduler itself because it
is lightweight. Honestly I don't even see any other way (w.r.t.
locking) users can fix it at their end. They don't know which was the
last tick that used their callback.

> Let's see if there's a less invasive solution to fix CPPC for now, what
> do you think?

For me, this change is required in the API despite how CPPC ends up
using it. Else we are failing at defining the API itself IMHO.

-- 
viresh