linux-kernel - Re: [PATCH V5 1/2] base/drivers/arch_topology: Replace mutex with READ_ONCE / WRITE

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20181129070437.GD4271@localhost.localdomain>
Date:   Thu, 29 Nov 2018 08:04:37 +0100
From:   Juri Lelli <juri.lelli@...hat.com>
To:     Daniel Lezcano <daniel.lezcano@...aro.org>
Cc:     rjw@...ysocki.net, linux-kernel@...r.kernel.org,
        viresh.kumar@...aro.org, Sudeep Holla <sudeep.holla@....com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Ingo Molnar <mingo@...nel.org>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        Morten Rasmussen <morten.rasmussen@....com>
Subject: Re: [PATCH V5 1/2] base/drivers/arch_topology: Replace mutex with
 READ_ONCE / WRITE_ONCE

On 28/11/18 18:54, Daniel Lezcano wrote:
> On 28/11/2018 12:44, Juri Lelli wrote:
> > Hi Daniel,
> > 
> > On 27/11/18 14:24, Daniel Lezcano wrote:
> >> The mutex protects a per_cpu variable access. The potential race can
> >> happen only when the cpufreq governor module is loaded and at the same
> >> time the cpu capacity is changed in the sysfs.
> >>
> >> There is no real interest of using a mutex to protect a variable
> >> assignation when there is no situation where a task can take the lock
> >> and block.
> >>
> >> Replace the mutex by READ_ONCE / WRITE_ONCE.
> >>
> >> Signed-off-by: Daniel Lezcano <daniel.lezcano@...aro.org>
> >> Cc: Sudeep Holla <sudeep.holla@....com>
> >> Reviewed-by: Viresh Kumar <viresh.kumar@...aro.org>
> >> ---
> >>  drivers/base/arch_topology.c  | 7 +------
> >>  include/linux/arch_topology.h | 2 +-
> >>  2 files changed, 2 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> >> index edfcf8d..fd5325b 100644
> >> --- a/drivers/base/arch_topology.c
> >> +++ b/drivers/base/arch_topology.c
> >> @@ -31,12 +31,11 @@ void arch_set_freq_scale(struct cpumask *cpus, unsigned long cur_freq,
> >>  		per_cpu(freq_scale, i) = scale;
> >>  }
> >>  
> >> -static DEFINE_MUTEX(cpu_scale_mutex);
> >>  DEFINE_PER_CPU(unsigned long, cpu_scale) = SCHED_CAPACITY_SCALE;
> >>  
> >>  void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity)
> >>  {
> >> -	per_cpu(cpu_scale, cpu) = capacity;
> >> +	WRITE_ONCE(per_cpu(cpu_scale, cpu), capacity);
> >>  }
> >>  
> >>  static ssize_t cpu_capacity_show(struct device *dev,
> >> @@ -71,10 +70,8 @@ static ssize_t cpu_capacity_store(struct device *dev,
> >>  	if (new_capacity > SCHED_CAPACITY_SCALE)
> >>  		return -EINVAL;
> >>  
> >> -	mutex_lock(&cpu_scale_mutex);
> >>  	for_each_cpu(i, &cpu_topology[this_cpu].core_sibling)
> >>  		topology_set_cpu_scale(i, new_capacity);
> >> -	mutex_unlock(&cpu_scale_mutex);
> > 
> > IIRC this was meant to ensure atomic updates of all siblings with the new
> > capacity value. I actually now wonder if readers should not grab the
> > mutex as well (cpu_capacity_show()). Can't we get into a situation where
> > a reader might see siblings with intermediate values (while the loop
> > above is performing an update)?
> 
> With or without this patch, it is the case:
> 
>                 task1                      task2
>                   |                          |
>   read("/sys/.../cpu1/cpu_capacity)          |
>                   |                  write("/sys/.../cpu1/cpu_capacity")
>   read("/sys/.../cpu2/cpu_capacity)          |
> 
> 
> There is no guarantee userspace can have a consistent view of the
> capacity. As soon as it reads a capacity, it can be changed in its back.

True, but w/o the mutex task1 could read different cpu_capacity values
for a cluster (it actually can also with current implementation, we
should grab the mutex in the read path as well if we want to avoid
this). Just pointing this out, not sure it is a problem, though, as even
the notion of strictly equal capacities clusters is going to disappear
AFAIK.